GRPO, Dr. GRPO, and DAPO Are Three Operations on One Number: The Group-Standard-Deviation Identity Paper • 2607.00152 • Published 4 days ago • 3
Silent Failures in Physical AI: A Literature Review of Runtime Action Authorization for Autonomous Systems Paper • 2606.00090 • Published May 23 • 6
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published May 27 • 431
Leveraging Verifier-Based Reinforcement Learning in Image Editing Paper • 2604.27505 • Published Apr 30 • 59
Offline Evaluation Measures of Fairness in Recommender Systems Paper • 2604.25032 • Published Apr 27 • 2
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published Apr 13 • 103