Tailoring Strictly Proper Scoring Rules for Downstream Tasks: An Application to Causal Inference • Paper • 2606.03332 • Published 23 days ago • 1
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking • Paper • 2501.04519 • Published Jan 8, 2025 • 290
Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning • Paper • 2502.06533 • Published Feb 10, 2025 • 17