TAUR-dev/M-skillfactory-ablations__random_reflections5_formatsrandom-sft
2B • Updated • 2
TAUR-dev/M-skillfactory-ablations__no_reflections_reflections5_formatsno_reflection-sft
2B • Updated • 2
TAUR-dev/M-skillfactory-ablations__orig_only_reflections5_formats-C_full-sft
2B • Updated • 2
TAUR-dev/M-RC-ab_sft_bon_corr_samples-sft
2B • Updated • 2
TAUR-dev/M-RC-ab_sft_our_structure_single_sample-sft
2B • Updated • 2
TAUR-dev/M-rl_1e_v2__pv_v3-rl
2B • Updated • 3
TAUR-dev/M-0918__0epoch_3and4args_grpo-rl
Updated
TAUR-dev/M-sft_exp_1e_zayneprompts_v3-sft
2B • Updated • 1
TAUR-dev/M-rl_1e_v2__pv_v2-rl
2B • Updated • 1
TAUR-dev/M-rl_1e_v2__pv_v2_origonly2e-rl
2B • Updated • 1
TAUR-dev/M-rl_1e_v2__pv_v2-rl__150
2B • Updated • 1
TAUR-dev/M-rl_1e_v2__pv_v2_origonly2e-rl__150
2B • Updated • 1
TAUR-dev/M-sft_exp_1e_zayneprompts_v2_orig_only2e-sft
2B • Updated • 1
TAUR-dev/M-sft_exp_1e_zayneprompts_v2-sft
2B • Updated • 1
TAUR-dev/M-0914_fastrl__1e_3args_dapo-rl
2B • Updated • 1
TAUR-dev/M-rl_1e_v2__pv-rl
2B • Updated • 3
TAUR-dev/M-0914_fastrl__0epoch_3args_dapo-rl
2B • Updated • 1
TAUR-dev/M-1e_with_gpt4o_reflections-rl
2B • Updated • 1
TAUR-dev/M-1e_with_gpt4o_both-rl
2B • Updated • 3
TAUR-dev/M-0914_fastrl__0epoch_3args_grpo_notokenmean-rl
Updated
TAUR-dev/M-sft_exp_1e_zayneprompts-sft
2B • Updated
TAUR-dev/M-sft_exp_zayneV3_cd3arg_w_gpt4o_both-sft
2B • Updated • 3
TAUR-dev/M-sft_exp_zayneV3_1e_cd3arg_w_gpt4o_ref-sft
2B • Updated • 3
TAUR-dev/M-SFTV2_V3_rl_RUN__gpt4o_ref-rl
Updated
TAUR-dev/M-SFTV2_V3_rl_RUN__gpt4o_both-rl
Updated
TAUR-dev/M-SFTV2_V3_rl_er_RUN__9_11-rl
Updated
TAUR-dev/M-sft_exp_zayneV2-sft
2B • Updated • 2
TAUR-dev/M-0911__0epoch_3args_dapo_nods_50epoch-rl
2B • Updated • 1
TAUR-dev/M-0911__0epoch_alltask_dapo_nods_50epoch-rl
Updated
TAUR-dev/M-SFTV2_all_V2_RUN__9_11_w_verdict_reward-rl
Updated