Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone Paper • 2512.22615 • Published Dec 27, 2025 • 48
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published Dec 22, 2025 • 64
SemanticGen: Video Generation in Semantic Space Paper • 2512.20619 • Published Dec 23, 2025 • 93
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published Dec 18, 2025 • 119
QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation Paper • 2512.19134 • Published Dec 22, 2025 • 32
Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos Paper • 2512.13080 • Published Dec 15, 2025 • 17
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices Paper • 2512.14052 • Published Dec 16, 2025 • 42
RePo: Language Models with Context Re-Positioning Paper • 2512.14391 • Published Dec 16, 2025 • 10
PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing Paper • 2512.02589 • Published Dec 2, 2025 • 71
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27, 2025 • 85
Attention Is All You Need for KV Cache in Diffusion LLMs Paper • 2510.14973 • Published Oct 16, 2025 • 42
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning Paper • 2510.12693 • Published Oct 14, 2025 • 28
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published Oct 10, 2025 • 50