SkillLearnBench: Benchmarking Continual Learning Methods for Agent Skill Generation on Real-World Tasks • Paper • 2604.20087 • Published 14 days ago • 15
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models • Paper • 2512.07783 • Published Dec 8, 2025 • 40
Learning to Act and Cooperate for Distributed Black-Box Consensus Optimization • Paper • 2605.00691 • Published 5 days ago • 2
ClawBench: Can AI Agents Complete Everyday Online Tasks? • Paper • 2604.08523 • Published 27 days ago • 261
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference • Paper • 2602.21548 • Published Feb 25 • 51
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents • Paper • 2604.11784 • Published 23 days ago • 143
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing • Paper • 2602.12205 • Published Feb 12 • 83
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver • Paper • 2604.08377 • Published 27 days ago • 289
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models • Paper • 2603.22212 • Published Mar 23 • 126
Context-Value-Action Architecture for Value-Driven Large Language Model Agents • Paper • 2604.05939 • Published 29 days ago • 9
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents • Paper • 2604.06132 • Published 29 days ago • 119
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression • Paper • 2604.04921 • Published about 1 month ago • 112