From Narrow to Panoramic Vision: Attention-Guided Cold-Start Reshapes Multimodal Reasoning Paper • 2603.03825 • Published 7 days ago • 8
$OneMillion-Bench: How Far are Language Agents from Human Experts? Paper • 2603.07980 • Published 2 days ago • 22
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory Paper • 2603.03269 • Published 8 days ago • 43
CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing Paper • 2603.08589 • Published 1 day ago • 30
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence Paper • 2603.07660 • Published 3 days ago • 69
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 189
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12, 2025 • 213
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19, 2025 • 233
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 241
The Trinity of Consistency as a Defining Principle for General World Models Paper • 2602.23152 • Published 13 days ago • 196
Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty Paper • 2507.16806 • Published Jul 22, 2025 • 7
Region-based Cluster Discrimination for Visual Representation Learning Paper • 2507.20025 • Published Jul 26, 2025 • 19
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs Paper • 2507.09477 • Published Jul 13, 2025 • 88
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Paper • 2507.12415 • Published Jul 16, 2025 • 43
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17, 2025 • 261
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published Jul 17, 2025 • 79
π^3: Scalable Permutation-Equivariant Visual Geometry Learning Paper • 2507.13347 • Published Jul 17, 2025 • 67
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs Paper • 2507.11097 • Published Jul 15, 2025 • 64