RLVR Generalization Bounds
  uiuc-kang-lab/RLVR-General-Mix Viewer • Updated Apr 19 • 439k • 8
  uiuc-kang-lab/RLVR-SynSQL-2.5M Viewer • Updated Apr 19 • 2.54M • 6
  uiuc-kang-lab/RLVR-Code-Mix Viewer • Updated Apr 19 • 505k • 12
  uiuc-kang-lab/RLVR-Eurus-2-Math-Fixed Viewer • Updated Mar 27 • 455k • 7

RL Generalizability
  uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-dapo 2B • Updated Nov 15, 2025 • 1
  uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-epoch-12-6 2B • Updated Nov 16, 2025 • 3
  uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-epoch-10-6 2B • Updated Nov 16, 2025 • 2
  uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-epoch-8-6 2B • Updated Nov 16, 2025 • 2

RLVR with Noisy Data
  uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-clean-epoch-3 8B • Updated Jan 30 • 2
  uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-clean-epoch-4 8B • Updated Jan 30 • 3
  uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-noise-0.1-epoch-3 8B • Updated Jan 30 • 1
  uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-noise-0.2-epoch-3 8B • Updated Jan 30 • 4