xw1234gan/Merging_Qwen2.5-3B-Instruct_MATH_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 3B • Updated Mar 8 • 3
xw1234gan/GRPO_KL_Qwen2.5-1.5B-Instruct_MATH_beta0.01_lr1e-05_mb2_ga128_n2048_seed42 Text Generation • 2B • Updated Mar 6 • 3