A Survey of Reinforcement Learning for Large Reasoning Models • arXiv:2509.08827 • Published Sep 10, 2025
Agent Lightning: Train ANY AI Agents with Reinforcement Learning • arXiv:2508.03680 • Published Aug 5, 2025
Aligning Multimodal LLM with Human Preference: A Survey • arXiv:2503.14504 • Published Mar 18, 2025