TAUR-dev/D-EVAL__standard_eval_v3__masking_run_1
• Updated • 3.4k • 2
TAUR-dev/D-EVAL__standard_eval_v3__skillfactory_longmult2d_data__BON__convos_mask_nonverification_tokens
• Updated • 3.4k • 3
TAUR-dev/D-EVAL__standard_eval_v3__SIE-mixed_1ep_sft_then_ppo-sft
• Updated • 4.9k • 3
TAUR-dev/D-EVAL__standard_eval_v3__SIE-mixed_10ep_sft_then_ppo-sft
• Updated • 4.9k • 4
TAUR-dev/D-EVAL__standard_eval_v3__SIE-mixed_10ep_sft_then_ppo-rl
• Updated • 4.9k • 2
TAUR-dev/D-EVAL__standard_eval_v3__SIE-mixed_1ep_sft_then_ppo-rl
• Updated • 4.9k • 4
TAUR-dev/D-EVAL__standard_eval_v3__mask_test_5epoch
• Updated • 3.4k • 2
TAUR-dev/D-EVAL__standard_eval_v3__mask_test_1epoch
• Updated • 3.4k • 2
TAUR-dev/D-EVAL__standard_eval_v3__SIE-mock_search_v2_first_attempt-sft
• Updated • 3.4k • 3
TAUR-dev/D-EVAL__standard_eval_v3__SIE-mock_search_v2_first_attempt__nonmixed_rl-rl
• Updated • 3.4k • 8
TAUR-dev/D-EVAL__standard_eval_v3__SIE-mock_search_v2_first_attempt__mixed_rl-rl
• Updated • 3.4k • 9
TAUR-dev/answer_parser_gold_standard__gpt4o_annotated
• Updated • 1.7k • 3
TAUR-dev/D-EVAL__standard_eval_v1__SIE-mock_search_v2_first_attempt__mixed_rl-rl
• Updated • 1.7k • 6
TAUR-dev/D-EVAL__standard_eval_v1__SIE-mock_search_v2_first_attempt__nonmixed_rl-rl
• Updated • 1.7k • 5
TAUR-dev/D-EVAL__standard_eval_v1__SIE-mock_search_v2_first_attempt-sft
• Updated • 1.7k • 4
TAUR-dev/D-EVAL__standard_eval_v1__mask_test_1epoch
• Updated • 1.7k • 3
TAUR-dev/D-EVAL__standard_eval_v1__mask_test_5epoch
• Updated • 1.7k • 2
TAUR-dev/D-SFTv1_C-cd3arg-Qwen2.5-1.5B-MockSearchV2-7_24_25-extra_info
• Updated • 29.8k • 2
TAUR-dev/testing2_standard_eval_v2
• Updated • 300 • 2
TAUR-dev/D-EVAL__standard_eval_v2__SIE-mixed_1ep_sft_then_ppo-rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v2__SIE-mixed_10ep_sft_then_ppo-rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v2__SIE-mixed_10ep_sft_then_ppo-sft
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v2__SIE-mixed_1ep_sft_then_ppo-sft
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v2__SIE-mix_sft_5ep_1e6lr__ppo_all_tasks_5ep-rl
• Updated • 4.9k • 2
TAUR-dev/D-EVAL__standard_eval_v1__skillfactory_longmult2d_data__BON__convos_mask_nonverification_tokens
• Updated • 1.7k • 3
TAUR-dev/D-EVAL__standard_eval_v2__SIE-rl_only__ppo__all_tasks__5ep-rl
• Updated • 4.9k • 2
TAUR-dev/D-EVAL__standard_eval_v2__SIE-mix_sft_5ep_1e6lr__ppo_all_tasks_1ep-rl
• Updated • 4.9k • 3
TAUR-dev/D-EVAL__standard_eval_v2__SIE-mix_sft_5ep_1e6lr__grpo_all_tasks_5ep-rl
• Updated • 4.9k • 2
TAUR-dev/D-EVAL__standard_eval_v2__SIE-rl_only__grpo__all_tasks__5ep-rl
• Updated • 2.55k • 3
TAUR-dev/D-SFT_C-cd3arg-Qwen2.5-1.5B-Mixed-all_examples_with_skills
• Updated • 15.2k • 4