CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR • Paper • 2603.10101 • Published 3 days ago • 3
AutoResearch-RL: Perpetual Self-Evaluating Reinforcement Learning Agents for Autonomous Neural Architecture Discovery • Paper • 2603.07300 • Published 6 days ago • 14
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs • Paper • 2603.02083 • Published 11 days ago • 9
Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels • Paper • 2603.02573 • Published 10 days ago • 11
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation • Paper • 2602.24286 • Published 14 days ago • 85
Risk-Aware World Model Predictive Control for Generalizable End-to-End Autonomous Driving • Paper • 2602.23259 • Published 15 days ago • 2
EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents • Paper • 2602.23205 • Published 15 days ago • 11
Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control • Paper • 2602.18422 • Published 21 days ago • 30
SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation • Paper • 2602.02402 • Published Feb 2 • 32
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning • Paper • 2602.04634 • Published Feb 4 • 96
Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation • Paper • 2602.03806 • Published Feb 3 • 5