Autodata: An agentic data scientist to create high quality synthetic data Paper • 2606.25996 • Published 4 days ago • 11
MemGUI-Agent: An End-to-End Long-Horizon Mobile GUI Agent with Proactive Context Management Paper • 2606.19926 • Published 10 days ago • 42
NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? Paper • 2606.24530 • Published 5 days ago • 61
Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 5 days ago • 133
OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation Paper • 2606.17628 • Published 12 days ago • 27
EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments Paper • 2606.13681 • Published 17 days ago • 142
FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents Paper • 2606.12087 • Published 18 days ago • 77
Toward Generalist Autonomous Research via Hypothesis-Tree Refinement Paper • 2606.11926 • Published 18 days ago • 120
From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors Paper • 2605.31042 • Published 30 days ago • 19
AgentFugue: Agent Scaling for Long-Horizon Tasks through Collective Reasoning Paper • 2605.24486 • Published May 23 • 6
SAM: State-Adaptive Memory for Long-Horizon Reasoning Agent Paper • 2605.24468 • Published May 23 • 9
PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models Paper • 2605.20873 • Published May 20 • 44
ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published Apr 29 • 54
AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery Paper • 2604.25256 • Published Apr 28 • 30
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published Apr 20 • 88
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published Mar 17 • 141
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data Paper • 2603.15594 • Published Mar 16 • 150
MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning Paper • 2603.03379 • Published Mar 3 • 32