guanning-ai/SmolLM-360M-RLOO-Math-Step1100
guanning-ai/SmolLM-360M-GRPO-Math-Step1100
guanning-ai/20260102-p_normalization_step4000 • 0.4B • 122
guanning-ai/20260102-grpo_step4000 • 0.4B • 90
guanning-ai/smollm-gsm8k-pnorm-ckpt4900 • 0.4B • 11
guanning-ai/smollm-gsm8k-grpo-ckpt3900 • 0.4B • 7
guanning-ai/smollm-gsm8k-grpo-ckpt1000 • 0.4B • 202
guanning-ai/maze_sft_weights_1207
guanning-ai/1027-math4b-bz1024-pposz128-rollout4-seed20
guanning-ai/1024-1.5b-knk23-debug1004
guanning-ai/1024-jspo-4b-lr1e-6-bz64-pposz32-rollout4-seed6
guanning-ai/significance-test-1016
guanning-ai/gai-exp-qwen1.5b
guanning-ai/diveristy-judge-sft-0916
guanning-ai/diversity-judge-sft-0916
guanning-ai/dapo2k-models-0908-lr1e-5
guanning-ai/Qwen2.5-Math-7B-stage2-idw1.0-step150 • 8B • 1
guanning-ai/Qwen2.5-Math-7B-stage2-idw1.0-step50 • 8B • 1
guanning-ai/Qwen2.5-Math-7B-stage2-idw0.2-step50 • 8B • 1
guanning-ai/Qwen2.5-Math-7B-stage2-idw0.05-step50 • 8B • 1
guanning-ai/Qwen2.5-Math-7B-stage2-idw0.0-step50 • 8B • 1