rl-rag/sft-mix-v20250921_long_form_only
• Updated • 10.3k • 52
rl-rag/sft-mix-v20250921_05
• Updated • 8k • 34
rl-rag/sft-mix-v20250921_02
• Updated • 3.2k • 35
rl-rag/sft-mix-v20250921_01
• Updated • 1.6k • 33
rl-rag/sft-mix-v20250921_005
• Updated • 800 • 30
rl-rag/rl_rag_train_sa_3k_longform_rubrics
• Updated • 2.94k • 39
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_rubrics_only_call_tool
• Updated • 2.94k • 46
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_rubrics_only_with_new_mcp_system_prompt
• Updated • 2.94k • 51
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_longform_averaged_outcome_with_system_prompt
• Updated • 2.94k • 45
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_outcome_with_new_mcp_system_prompt
• Updated • 2.94k • 42
rl-rag/gpqa_diamond_rlvr_no_prompt
• Updated • 198 • 64
rl-rag/nq_rlvr_no_prompt_f1_test
• Updated • 3.61k • 36
rl-rag/tqa_rlvr_no_prompt_f1_test
• Updated • 17.9k • 43
rl-rag/hotpotqa_rlvr_no_prompt_f1_test
• Updated • 7.41k • 42
rl-rag/2wiki_rlvr_no_prompt_f1_test
• Updated • 300 • 38
rl-rag/asearcher_short_form_rlvr_with_system_prompt
• Updated • 70.6k • 57
rl-rag/verified_miro_trajectories
• Updated • 9.88k • 91
rl-rag/rl_rag_sqa_openscholar_rubrics_s2_augmented_longform_averaged_outcome_with_system_prompt
• Updated • 2.42k • 41
rl-rag/combined-sft-training-data-v20250824_MiroSystemPrompt
• Updated • 4.44k • 39
• Updated • 3.99k • 103
rl-rag/rl_rag_sqa_no_retrieval_1k_longform_finegrained_with_system_prompt
• Updated • 999 • 42
rl-rag/rl_rag_sqa_no_retrieval_1k_longform_averaged_outcome_with_system_prompt
• Updated • 999 • 42
rl-rag/rl_rag_no_retrieval_1k_longform_rubrics_only_with_system_prompt
• Updated • 999 • 40
rl-rag/gpt-oss-20b-eval-react-serper
Updated • 84
rl-rag/verifiable_synthetic_1k_0814
• Updated • 1.05k • 52
rl-rag/verifiable_synthetic_varied_depth_o3_verified
• Updated • 101 • 60
rl-rag/verifiable_synthetic_depth_one_v2_verified
• Updated • 114 • 71
rl-rag/combined-sft-training-data-v20250724
• Updated • 568 • 60
rl-rag/qwq_32b_factualqa_sft_data
• Updated • 36.5k • 98