TAUR-dev/D-EVAL__standard_eval_v3__budget_forcing_test_run_ct3arg-eval_rl
• 250 • 6
TAUR-dev/D-EVAL__standard_eval_v3__eval_base_bon-eval_base
• 4.45k • 7
TAUR-dev/D-SFT_C-sft_exp_1e_zayneprompts_v2_orig_only2e-sft-data
• 1.32k • 6
TAUR-dev/sf_1e_orig_pv__uniq_corr_num_correct_1.1.2.2.3.2.3.4.3.4.5_num_incorrect_0.1.0.1.0.2.1
• 1.32k • 6
TAUR-dev/D-SFT_C-sft_exp_1e_zayneprompts_v2-sft-data
• 1.83k • 6
TAUR-dev/sf_1e_orig_pae_alt_rep_pv__uniq_corr_num_correct_1.1.2.2.3.2.3.4.3.4.5_num_incorrect_0
• 6.32k • 6
TAUR-dev/sf_1e_orig_pae_alt_rep_pv_num_correct_1.1.2.2.3.2.3.4.3.4.5_num_incorrect_0.1.0.1.0.2
• 6.32k • 6
TAUR-dev/D-EVAL__standard_eval_v3__M-0910__qrepeat3_ref3_3args_grpo-rl-eval_rl
• 4.45k • 6
TAUR-dev/D-EVAL__standard_eval_v3__budget_forcing_test_run-eval_rl
• 5 • 5
TAUR-dev/D-EVAL__standard_eval_v3__eval_base-eval_base
• 4.45k • 7
TAUR-dev/D-EVAL__standard_eval_v3__eval_rl_only_3args-eval_rl
• 4.45k • 7
TAUR-dev/D-EVAL__standard_eval_v3__eval_pv_repreat3_sft-_eval_sft
• 8.9k • 11
TAUR-dev/D-EVAL__standard_eval_v3__eval_1c_sft-_eval_sft
• 4.45k • 6
TAUR-dev/D-EVAL__standard_eval_v3__eval_1b_sft-_eval_sft
• 4.45k • 6
TAUR-dev/D-EVAL__standard_eval_v3__eval_1e_sft-_eval_sft
• 4.45k • 6
TAUR-dev/D-EVAL__standard_eval_v3__eval_greedy_M-sft_exp_zayne-sft-eval_sft
• 2.45k • 6
TAUR-dev/D-SFT_C-sft_exp_1e_zayneprompts-sft-data
• 24k • 5
TAUR-dev/skillfactory_yolo_1e_zayne_prompts_num_correct_1.1.2.2.3.2.3.4.3.4.5_num_incorrect_0.1
• 24k • 6
TAUR-dev/D-SFT_C-sft_exp_zayneV3_cd3arg_w_gpt4o_both-sft-data
• 35.1k • 6
TAUR-dev/D-SFT_C-sft_exp_zayneV3_cd3arg_w_gpt4o_ref-sft-data
• 32.8k • 7
TAUR-dev/D-SFT_C-sft_exp_1e_gpt4o_both-sft-data
• 35.1k • 6
TAUR-dev/skillfactory_yolo_1e_with_gtp4o_reflections_num_correct_1.1.2.2.3.2.3.4.3.4.5_num_inco
• 32.8k • 4
TAUR-dev/skillfactory_yolo_1e_with_gtp4o_both_num_correct_1.1.2.2.3.2.3.4.3.4.5_num_incorrect_0
• 35.1k • 6
TAUR-dev/sft_data_mix_9_13_25_3argsO_zaynev3_w_gpt4o_both
• 44.5k • 6
TAUR-dev/9_8_25__countdown_3arg__sft_data_GPT4o_multiprompts_gpt4o_reflections
• 157k • 6
TAUR-dev/testing_dataset_acronym_4o_nd
• 2.59k • 6
TAUR-dev/testing_dataset_acronym_5o_nd
• 2.12k • 6
TAUR-dev/testing_dataset_acronym_4o_d
• 2.59k • 6
TAUR-dev/testing_dataset_acronym_5o_d
• 2.12k • 6