QG-MIL: A Gated Transformer Aggregator for Domain-Agnostic Multiple Instance Learning in Medical Imaging • Paper • 2606.20027 • Published 7 days ago • 2
LingxiDiagBench: A Multi-Agent Framework for Benchmarking LLMs in Chinese Psychiatric Consultation and Diagnosis • Paper • 2602.09379 • Published 14 days ago • 19
Qwen-AgentWorld: Language World Models for General Agents • Paper • 2606.24597 • Published 1 day ago • 76
Deep Research in Physical Sciences: A Multi-Agent Framework and Comprehensive Benchmark • Paper • 2606.18648 • Published 8 days ago • 14
PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems • Paper • 2606.22388 • Published 4 days ago • 85
PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models • Paper • 2606.19534 • Published 8 days ago • 58
FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines • Paper • 2606.19605 • Published 8 days ago • 10
Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents • Paper • 2606.19704 • Published 7 days ago • 39
S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence • Paper • 2606.20515 • Published 7 days ago • 39
ViT-Up: Faithful Feature Upsampling for Vision Transformers • Paper • 2606.14024 • Published 13 days ago • 9
SciOrch: Learning to Orchestrate Expert LLMs for Solving Frontier Multimodal Scientific Reasoning Tasks • Paper • 2606.15872 • Published 10 days ago • 8
Native Active Perception as Reasoning for Omni-Modal Understanding • Paper • 2606.19341 • Published 8 days ago • 17
From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning • Paper • 2606.17682 • Published 9 days ago • 26
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling • Paper • 2606.18023 • Published 9 days ago • 203