Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs Paper • 2505.18573 • Published May 24, 2025
Can Tool-Integrated Reinforcement Learning Generalize Across Diverse Domains? Paper • 2510.11184 • Published Oct 13, 2025
Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective Paper • 2505.17652 • Published May 23, 2025 • 6
SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity Paper • 2503.01506 • Published Mar 3, 2025 • 10