Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 2 days ago • 99
WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces Paper • 2606.09426 • Published 17 days ago • 102
MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research Paper • 2605.26114 • Published May 25 • 65
SkillOpt: Executive Strategy for Self-Evolving Agent Skills Paper • 2605.23904 • Published May 22 • 246
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published May 11 • 46
ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents Paper • 2605.12481 • Published May 12 • 28
Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence Paper • 2604.24954 • Published Apr 27 • 26
Synthetic Computers at Scale for Long-Horizon Productivity Simulation Paper • 2604.28181 • Published Apr 30 • 20
TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents Paper • 2604.24005 • Published Apr 27 • 9
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published Apr 20 • 87
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published Apr 8 • 123
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 328
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published Apr 13 • 143
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor Paper • 2604.04215 • Published Apr 5 • 22
ACE-Brain-0: Spatial Intelligence as a Shared Scaffold for Universal Embodiments Paper • 2603.03198 • Published Mar 3 • 4
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision Paper • 2601.19798 • Published Jan 27 • 44