TAUR-dev/D-EVAL__standard_eval_v3__jack_test_workflow-eval
• Updated • 1.25k • 4
TAUR-dev/zaynes_dataset__gsm8k__best_of_n__scored
• Updated • 6.47k • 4
TAUR-dev/zaynes_dataset__gsm8k__best_of_n
• Updated • 6.47k • 5
TAUR-dev/zaynes_dataset__commonsenseQA__best_of_n__scored
• Updated • 8.74k • 4
TAUR-dev/zaynes_dataset__commonsenseQA__best_of_n
• Updated • 8.74k • 4
TAUR-dev/dataset__gsm8k__best_of_n__scored
• Updated • 6.47k • 4
TAUR-dev/dataset__gsm8k__best_of_n
• Updated • 6.47k • 5
TAUR-dev/dataset__commonsenseQA__best_of_n__scored
• Updated • 8.74k • 4
TAUR-dev/dataset__commonsenseQA__best_of_n
• Updated • 8.74k • 4
TAUR-dev/D-EVAL__standard_eval_v3__hc_search_grpo_n_32__comp_with_sft-eval_rl
• Updated • 4.9k • 5
TAUR-dev/D-EVAL__standard_eval_v3__hc_search_grpo_n_32__comp_with_sft-eval_sft
• Updated • 7.35k • 6
TAUR-dev/D-EVAL__standard_eval_v3__test_all_parts__sbatch-eval_0
• Updated • 1.25k • 4
TAUR-dev/D-EVAL__standard_eval_v3__eval_checkpoints_test-eval_sft
• Updated • 500 • 7
TAUR-dev/D-EVAL__standard_eval_v3__test_all_parts__sbatch-eval_rl
• Updated • 250 • 5
TAUR-dev/D-EVAL__standard_eval_v3__hardcoded_search_grpo_n_32-rl-eval_rl
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__hardcoded_search_function__low_lr_sft5epochs-eval_rl
• Updated • 7.35k • 4
TAUR-dev/D-EVAL__standard_eval_v3__hardcoded_search_function__low_lr_sft5epochs-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-SFT_C-hardcoded_search_function__low_lr_sft5epochs-sft-data
• Updated • 4k • 5
TAUR-dev/D-EVAL__standard_eval_v3__hardcoded_search_function-eval_sft
• Updated • 6.15k • 4
TAUR-dev/D-EVAL__standard_eval_v3__hardcoded_search_function__low_lr-eval_sft
• Updated • 3.7k • 5
TAUR-dev/D-EVAL__standard_eval_v3__test_metrics_skill_analysis-eval_sft
• Updated • 1.5k • 4
TAUR-dev/D-SFT_C-hardcoded_search_function__low_lr-sft-data
• Updated • 4k • 5
TAUR-dev/D-SFT_C-hardcoded_search_function-sft-data
• Updated • 4k • 6
TAUR-dev/D-SFT_C-cd3arg-Qwen2.5-1.5B-handcrafted_search
• Updated • 4k • 4
TAUR-dev/D-EVAL__standard_eval_v3__back_to_og_mix__simple_mix__rl_eval-eval_rl
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__test_checkpoint_metrics-eval_sft
• Updated • 2.75k • 4
TAUR-dev/D-EVAL__standard_eval_v3__test_all_parts-eval_0
• Updated • 1.25k • 5
TAUR-dev/D-EVAL__standard_eval_v3__back_to_og_mix__simple_retries__sbon-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-EVAL__standard_eval_v3__bon-eval_sft
• Updated • 2.45k • 5
TAUR-dev/D-SFT_C-back_to_og_mix__simple_retries__sbon-sft-data
• Updated • 7.62k • 5