VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control Paper • 2601.05138 • Published 25 days ago • 18
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 255
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published Dec 1, 2025 • 73
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation Paper • 2511.20714 • Published Nov 25, 2025 • 48
Kinematify: Open-Vocabulary Synthesis of High-DoF Articulated Objects Paper • 2511.01294 • Published Nov 3, 2025 • 14
Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 109
FlashWorld: High-quality 3D Scene Generation within Seconds Paper • 2510.13678 • Published Oct 15, 2025 • 73
FlashVSR: Towards Real-Time Diffusion-Based Streaming Video Super-Resolution Paper • 2510.12747 • Published Oct 14, 2025 • 39
InfiniHuman: Infinite 3D Human Creation with Precise Control Paper • 2510.11650 • Published Oct 13, 2025 • 6
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published Oct 9, 2025 • 126
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels Paper • 2510.06499 • Published Oct 7, 2025 • 33
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning Paper • 2510.08555 • Published Oct 9, 2025 • 63
Triangle Splatting+: Differentiable Rendering with Opaque Triangles Paper • 2509.25122 • Published Sep 29, 2025 • 9
Rolling Forcing: Autoregressive Long Video Diffusion in Real Time Paper • 2509.25161 • Published Sep 29, 2025 • 26
Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis Paper • 2509.09595 • Published Sep 11, 2025 • 48