TAUR-dev/D-EVAL__standard_eval_v3__sft1e-6_ppo_countdown3arg_format0.3_transition0.3-rl_eval-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__sft1e-6_ppo_countdown3arg_format0.1-rl_eval-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__sft1e-5_ppo_countdown3arg-rl_eval-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__sft1e-6_ppo_countdown3arg-rl_eval-eval_sft
• Updated • 2.45k • 4
TAUR-dev/D-EVAL__standard_eval_v3__skills_in_rl_v2__1e6_cd3arg_sft-sft_eval-eval_sft
• Updated • 4.9k • 5
TAUR-dev/D-EVAL__standard_eval_v3__sft1e-5_ppo_countdown3arg_format0.3-rl_eval-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__sft1e-6_ppo_countdown3arg_transition0.3-rl_eval-eval_sft
• Updated • 2.45k • 6
TAUR-dev/D-EVAL__standard_eval_v3__sft1e-6_ppo_countdown3arg_format0.1_transition0.1-rl_eval-eval_sft
• Updated • 4.9k • 6
TAUR-dev/D-EVAL__standard_eval_v3__skills_in_rl_v2__1e5_all_tasks_sft-sft_eval-eval_sft
• Updated • 4.9k • 5
TAUR-dev/D-EVAL__standard_eval_v3__sft1e-5_ppo_countdown3arg_format0.1-rl_eval-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__skills_in_rl_v2__1e6_all_tasks_sft-sft_eval-eval_sft
• Updated • 2.45k • 4
TAUR-dev/D-EVAL__standard_eval_v3__skills_in_rl_v2__1e5_cd3arg_sft-sft_eval-eval_sft
• Updated • 4.9k • 5
TAUR-dev/D-EVAL__standard_eval_v3__skills_in_rl-ppo_only_baseline_cd3arg_only-rl_eval-eval_rl
• Updated • 2.45k • 3
TAUR-dev/D-EVAL__standard_eval_v3__skills_in_rl_v2__1e5_cd3arg_sft-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__skills_in_rl_v2__1e6_cd3arg_sft-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-SFT_C-skills_in_rl_v2__1e6_all_tasks_sft-sft-data
• Updated • 39k • 5
TAUR-dev/D-SFT_C-skills_in_rl_v2__1e5_all_tasks_sft-sft-data
• Updated • 39k • 5
TAUR-dev/D-SFT_C-skills_in_rl_v2__1e6_cd3arg_sft-sft-data
• Updated • 7.62k • 5
TAUR-dev/D-SFT_C-skills_in_rl_v2__1e5_cd3arg_sft-sft-data
• Updated • 7.62k • 5
TAUR-dev/checking_metric_tracking
• Updated • 10 • 6
TAUR-dev/D-SFT-dataset__commonsenseQA__tagged_bon_v2
• Updated • 13.7k • 5
TAUR-dev/D-SFT-dataset__longmult_3dig__tagged_bon_v2
• Updated • 7.28k • 5
TAUR-dev/D-SFT-dataset__gsm8k__tagged_bon_v2
• Updated • 10.4k • 5
TAUR-dev/D-SFT-dataset__countdown_3arg__tagged_bon_v2
• Updated • 7.62k • 4
TAUR-dev/D-EVAL__standard_eval_v3__sft_annotation_for_csqa_v2-eval_0
• Updated • 8.74k • 5
TAUR-dev/D-EVAL__standard_eval_v3__sft_annotation_for_cd3arg_v2-eval_0
• Updated • 4k • 6
TAUR-dev/D-EVAL__standard_eval_v3__sft_annotation_for_gsm8k_v2-eval_0
• Updated • 6.47k • 6
TAUR-dev/D-EVAL__standard_eval_v3__eval_base_model_new_prompts_8_13_25-eval_0
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__sft_annotation_for_longmult3arg_v2-eval_0
• Updated • 4k • 5
TAUR-dev/zaynes_dataset__longmult_3dig__best_of_n_v2
• Updated • 4k • 5