MolmoWeb: Open Visual Web Agent and Open Data for the Open Web Paper • 2604.08516 • Published about 1 month ago • 43
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 26 days ago • 90
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published 27 days ago • 143
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data Paper • 2604.14164 • Published Mar 23 • 35
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora Paper • 2604.24819 • Published 13 days ago • 87
RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments Paper • 2604.26067 • Published 12 days ago • 73
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published 11 days ago • 102
Heterogeneous Scientific Foundation Model Collaboration Paper • 2604.27351 • Published 10 days ago • 208
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published Apr 6 • 235
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published about 1 month ago • 289
ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published about 1 month ago • 262
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published about 1 month ago • 245
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 187
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published Mar 17 • 139
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control • Feb 4, 2025 • 192