TAUR-dev/reflection_countdown_4args_3
• Updated • 4k • 3
TAUR-dev/reflection_countdown_4args_2
• Updated • 4k • 4
TAUR-dev/reflection_countdown_4args_1
• Updated • 4k • 3
TAUR-dev/reflection_countdown_4args_0
• Updated • 4k • 4
TAUR-dev/reflection_countdown_4args_8
• Updated • 4k • 8
TAUR-dev/D-EVAL__standard_eval_v3__qwen25_15b_instruct_bestofn_atags_countdown_3arg-eval_0
• Updated • 4k • 2
TAUR-dev/D-EVAL__standard_eval_v3__qwen25_15b_instruct_bestofn_atags_countdown_4arg-eval_0
• Updated • 4k • 2
TAUR-dev/D-EVAL__standard_eval_v3__qwen25_15b_instruct_bestofn_atags-eval_0
• Updated • 500 • 2
TAUR-dev/D-EVAL__standard_eval_v3__GRPO_basemodel_rl_grpo-ckpt-25-rl-eval_rl
• Updated • 2.45k • 4
TAUR-dev/D-EVAL__standard_eval_v3__GRPO_basemodel_rl_grpo-ckpt-50-rl-eval_rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v3__SBON_adv_rwd_grpo__adv_rwds-rl_checkpoint-step-50-eval_rl
• Updated • 2.45k • 6
TAUR-dev/D-EVAL__standard_eval_v3__SBON_alltasks_lowlr_rl_grpo-ckpt-50-rl-eval_rl
• Updated • 2.45k • 7
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_adv_rwd_grpo__adv_rwds-rl_checkpoint-step-50-eval_rl
• Updated • 2.45k • 3
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_setup1_rl_grpo-ckpt-50-rl-eval_rl
• Updated • 2.45k • 4
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_setup1_rl_grpo-ckpt-25-rl-eval_rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v3__SBON_adv_rwd_grpo__adv_rwds-rl_checkpoint-step-40-eval_rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v3__ppo_only_baseline_all_tasks-rl_8k_tok_eval-eval_rl
• Updated • 2.45k • 4
TAUR-dev/D-EVAL__standard_eval_v3__GRPO_basemodel_rl_grpo-rl_8k_tok_eval-eval_rl
• Updated • 2.45k • 4
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_adv_rwd_grpo__adv_rwds-rl_checkpoint-step-40-eval_rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v3__SBON_alltasks_lowlr_rl_grpo-rl_8k_tok_eval-eval_rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_setup1_rl_grpo-rl_8k_tok_eval-eval_rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_setup1_1epch_1e6_all_tasks_8k_tok_eval-eval_rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v3__ppo_only_baseline_all_tasks-rl_eval-eval_rl
• Updated • 4.9k • 6
TAUR-dev/D-EVAL__standard_eval_v3__GRPO_basemodel_rl_grpo-rl_eval-eval_rl
• Updated • 2.45k • 2
TAUR-dev/D-EVAL__standard_eval_v3__SBON_alltasks_lowlr_rl_grpo-rl_eval-eval_rl
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_setup1_rl_grpo-rl_eval-eval_rl
• Updated • 2.45k • 4
TAUR-dev/D-EVAL__standard_eval_v3__SBON_skills_in_rl_v2__1e6_all_tasks_eval-eval_rl
• Updated • 2.45k • 3
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_setup1_1epch_1e6_all_tasks_eval-eval_rl
• Updated • 2.45k • 3
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_setup1_1epch_1e6_all_tasks_multistructure_eval-eval_rl
• Updated • 2.45k • 3
TAUR-dev/D-EVAL__standard_eval_v3__VOTING_setup1_1epch_1e6_all_tasks_only_inc_to_corr_eval-eval_rl
• Updated • 2.45k • 3