Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published 29 days ago • 61
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos Paper • 2601.00393 • Published 28 days ago • 130
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published 6 days ago • 51
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 20 days ago • 210
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation Paper • 2601.02204 • Published 24 days ago • 60
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields Paper • 2601.03252 • Published 22 days ago • 99
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 94
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published Dec 22, 2025 • 64
Backbone-Optimizer Coupling Bias: The Hidden Co-Design Principle Article • Published Dec 20, 2025 • 3
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published Dec 18, 2025 • 84
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation Paper • 2512.16913 • Published Dec 18, 2025 • 34
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 254
MG-Nav: Dual-Scale Visual Navigation via Sparse Spatial Memory Paper • 2511.22609 • Published Nov 27, 2025 • 49
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Paper • 2512.01816 • Published Dec 1, 2025 • 92
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward Paper • 2511.20561 • Published Nov 25, 2025 • 32
Lost in Tokenization: Context as the Key to Unlocking Biomolecular Understanding in Scientific LLMs Paper • 2510.23127 • Published Oct 27, 2025 • 5
MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding Paper • 2510.23479 • Published Oct 27, 2025 • 15