GD^2PO: Mitigating Multi-Reward Conflicts via Group-Dynamic reward-Decoupled Policy Optimization Paper • 2606.16771 • Published 10 days ago • 13
MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling Paper • 2606.13473 • Published 14 days ago • 90
Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill Paper • 2606.03980 • Published 23 days ago • 13
SceneCode: Executable World Programs for Editable Indoor Scenes with Articulated Objects Paper • 2605.19587 • Published May 19 • 10
SubtleMemory: A Benchmark for Fine-Grained Relational Memory Discrimination in Long-Horizon AI Agents Paper • 2606.05761 • Published 21 days ago • 19
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling Paper • 2605.13301 • Published May 13 • 165
Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs Paper • 2605.00814 • Published May 1 • 21
VideoSSR: Video Self-Supervised Reinforcement Learning Paper • 2511.06281 • Published Nov 9, 2025 • 25
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models Paper • 2512.24165 • Published Dec 30, 2025 • 52
LatentMem: Customizing Latent Memory for Multi-Agent Systems Paper • 2602.03036 • Published Feb 3 • 16
GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published Mar 30 • 87