MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing Paper • 2509.22186 • Published Sep 26, 2025 • 149
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? Paper • 2509.16941 • Published Sep 21, 2025 • 21
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 214
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30, 2025 • 70
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 10 items • Updated 12 days ago • 557
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning Paper • 2507.22607 • Published Jul 30, 2025 • 47
DesignLab: Designing Slides Through Iterative Detection and Correction Paper • 2507.17202 • Published Jul 23, 2025 • 51
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning Paper • 2507.16784 • Published Jul 22, 2025 • 122
A Simple "Try Again" Can Elicit Multi-Turn LLM Reasoning Paper • 2507.14295 • Published Jul 18, 2025 • 14
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning Paper • 2507.14137 • Published Jul 18, 2025 • 36