LangAGI-Lab/qwen-7b-instruct-8k-dpo-preference-set
Viewer
• Updated
• 8.16k • 5
LangAGI-Lab/qwen-7b-verified-7k-rejection-sampling-alpaca-format
Viewer
• Updated
• 7.38k • 6
LangAGI-Lab/MetaMATH_30K_llama_ppo
Viewer
• Updated
• 30k • 6
LangAGI-Lab/train-rl-o1-mini-annotated-math-numina-22k
Viewer
• Updated
• 22k • 13
• 1
LangAGI-Lab/train-rl-o1-mini-annotated-math-numina-10k-numeric-answer
Viewer
• Updated
• 10k • 8
LangAGI-Lab/numina-cot-verifiable-10k
Viewer
• Updated
• 10k • 10
• 1
LangAGI-Lab/train-rl-o1-mini-annotated-magpie-hard-math-22k
Viewer
• Updated
• 22k • 6
LangAGI-Lab/magpie-reasoning-v1-10k-verification-alpaca-format
Viewer
• Updated
• 7.31k • 5
LangAGI-Lab/MetaMATH_30K_llama
Viewer
• Updated
• 30k • 6
LangAGI-Lab/MetaMATH_SFT_50K_new
Viewer
• Updated
• 50k • 5
LangAGI-Lab/magpie-reasoning-v1-10k-step-by-step-rationale-alpaca-format
Viewer
• Updated
• 10k • 14
• 1
LangAGI-Lab/magpie-reasoning-v1-10k-step-by-step-rationale
Viewer
• Updated
• 10k • 5
LangAGI-Lab/magpie-reasoning-v1-100k-thought-summary
Viewer
• Updated
• 96.4k • 14
• 2
LangAGI-Lab/MetaMATH_SFT_50K
Viewer
• Updated
• 50k • 6
LangAGI-Lab/general-reasoning-1k-o1-mini-api-thought-cost-w-metadata
Viewer
• Updated
• 857 • 7
• 2
LangAGI-Lab/MetaMATH_30K_new
Viewer
• Updated
• 30k • 6
LangAGI-Lab/math-train-7k
Viewer
• Updated
• 7.5k • 5
LangAGI-Lab/general-reasoning-1k
Viewer
• Updated
• 1k • 6
• 1
LangAGI-Lab/critic2_feedbackonly
Viewer
• Updated
• 2k • 4
LangAGI-Lab/critic1_feedbackonly
Viewer
• Updated
• 2k • 5
LangAGI-Lab/math-train-1K
Viewer
• Updated
• 985 • 6
LangAGI-Lab/med_critic2_train_1000
Viewer
• Updated
• 1.4k • 5
LangAGI-Lab/med_critic1_train_1000
Viewer
• Updated
• 1.5k • 5
LangAGI-Lab/Medical_reward_bench
Viewer
• Updated
• 2.36k • 1.42k
LangAGI-Lab/train-self-refine-dist
Viewer
• Updated
• 12.9k • 5
LangAGI-Lab/mini_rm_benchmark_for_web_agent
Viewer
• Updated
• 128 • 7
LangAGI-Lab/world_model_for_wa_desc_with_tao_dataset_with_transition_count
Viewer
• Updated
• 14.7k • 6
LangAGI-Lab/Multimodal-Mind2Web-HTML-WM-messages-filter-35000
Viewer
• Updated
• 4.34k • 12
LangAGI-Lab/Multimodal-Mind2Web-HTML-WM-messages
Viewer
• Updated
• 6.77k • 16