Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex Paper • 2605.06139 • Published 6 days ago • 62
Mean Mode Screaming: Mean-Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 6 days ago • 116
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 6 days ago • 92
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published 10 days ago • 92
SkillOS: Learning Skill Curation for Self-Evolving Agents Paper • 2605.06614 • Published 6 days ago • 37
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 7 days ago • 95
ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration Paper • 2605.03042 • Published 9 days ago • 107
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published 9 days ago • 286
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 16 days ago • 117
AgentSearchBench: A Benchmark for AI Agent Search in the Wild Paper • 2604.22436 • Published 19 days ago • 14
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published 19 days ago • 226
DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data Paper • 2604.19859 • Published 22 days ago • 51
SkillLearnBench: Benchmarking Continual Learning Methods for Agent Skill Generation on Real-World Tasks Paper • 2604.20087 • Published 21 days ago • 15
How to Ground a Korean AI Agent in Real Demographics with Synthetic Personas Article • nvidia • 22 days ago • 25
SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents Paper • 2604.17308 • Published 24 days ago • 22