Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 13 days ago • 144
Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games Paper • 2606.19338 • Published 19 days ago • 49
OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs Paper • 2606.03890 • Published Jun 2 • 31
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation Paper • 2605.31264 • Published May 29 • 123
SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction Paper • 2605.20110 • Published May 19 • 4
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published May 11 • 46
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published Mar 26 • 134