RLVR with Noisy Data
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-clean-epoch-3 • 8B • Updated Jan 30
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-clean-epoch-4 • 8B • Updated Jan 30
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-noise-0.1-epoch-3 • 8B • Updated Jan 30
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-noise-0.2-epoch-3 • 8B • Updated Jan 30
RL Generalizability
uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-step-100 • Updated Nov 14, 2025
uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-dapo • 2B • Updated Nov 15, 2025
uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-epoch-12-6 • 2B • Updated Nov 16, 2025 • 3
uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-epoch-10-6 • 2B • Updated Nov 16, 2025