TAUR-dev/M-skillfactory_sft_combinedtasks_promptvariants_qrepeat1_reflections5-sft
2B • Updated • 2
TAUR-dev/M-skillfactory_sft_combinedtasks_promptvariants_qrepeat1_reflections3-sft
2B • Updated • 1
TAUR-dev/M-skillfactory_sft_countdown_3arg_promptvariants_qrepeat1_reflections3-sft
2B • Updated • 1
TAUR-dev/M-skillfactory_sft_countdown_3arg_promptvariants_qrepeat3_reflections5-sft
2B • Updated • 1
TAUR-dev/M-skillfactory_sft_countdown_3arg_promptvariants_qrepeat1_reflections5-sft
2B • Updated • 2
TAUR-dev/M-skillfactory_sft_countdown_3arg_promptvariants_qrepeat3_reflections3-sft
2B • Updated • 1
TAUR-dev/M-0903_rl_reflect__1e_complex0.3_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 2
TAUR-dev/M-0903_rl_reflect__0epoch_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 1
TAUR-dev/M-0903_rl_reflect__2b_complex0.3_3and4args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__3b_complex0.3_alltask__grpo_minibs32_lr1e-6_rollout16-rl
Updated
TAUR-dev/M-0903_rl_reflect__1a_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 2
TAUR-dev/M-0903_rl_reflect__0epoch_3and4args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 2
TAUR-dev/M-0903_rl_reflect__0epoch_alltask__grpo_minibs32_lr1e-6_rollout16_32GPU-rl
Updated
TAUR-dev/M-0903_rl_reflect__2b_3and4args__grpo_minibs32_lr1e-6_rollout16-rl
Updated
TAUR-dev/M-0903_rl_reflect__2a_3and4args__grpo_minibs32_lr1e-6_rollout16-rl
Updated
TAUR-dev/M-0903_rl_reflect__3b_alltask__grpo_minibs32_lr1e-6_rollout16-rl
Updated
TAUR-dev/M-0903_rl_reflect__3a_alltask__grpo_minibs32_lr1e-6_rollout16-rl
Updated
TAUR-dev/M-0903_rl_reflect__0epoch_alltask__grpo_minibs32_lr1e-6_rollout16-rl
Updated
TAUR-dev/M-0903_rl_reflect__1d_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 3
TAUR-dev/M-0903_rl_reflect__1f_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 3
TAUR-dev/M-0903_rl_reflect__1c_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 3
TAUR-dev/M-0903_rl_reflect__1e_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 3
TAUR-dev/M-0903_rl_reflect__1a_1epoch_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 1
TAUR-dev/M-0903_rl_reflect__1b_3args__grpo_minibs32_lr1e-6_rollout16-rl
2B • Updated • 3
TAUR-dev/M-skillfactory_yolo_3b-sft
2B • Updated • 3
TAUR-dev/M-skillfactory_yolo_2b-sft
2B • Updated • 2
TAUR-dev/M-skillfactory_yolo_3a_sft
2B • Updated • 1
TAUR-dev/M-skillfactory_yolo_2a-sft
2B • Updated • 2
TAUR-dev/M-skillfactory_yolo_1d_sft-sft
2B • Updated • 2