V-Reflection: Transforming MLLMs from Passive Observers to Active Interrogators Paper • 2604.03307 • Published 8 days ago • 13
StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors Paper • 2512.16915 • Published Dec 18, 2025 • 38
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning Paper • 2510.22543 • Published Oct 26, 2025 • 14
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5, 2025 • 140