SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization Paper • 2604.02268 • Published 9 days ago • 92
V_{0.5}: Generalist Value Model as a Prior for Sparse RL Rollouts Paper • 2603.10848 • Published about 1 month ago • 14
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs Paper • 2602.03048 • Published Feb 3 • 32
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning Paper • 2601.21468 • Published Jan 29 • 25
Examining False Positives under Inference Scaling for Mathematical Reasoning Paper • 2502.06217 • Published Feb 10, 2025 • 1