GEM: Generative Supervision Helps Embodied Intelligence Paper • 2605.28548 • Published about 1 month ago • 32
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models Paper • 2605.30263 • Published 30 days ago • 59
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation Paper • 2605.15141 • Published May 14 • 96
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published Mar 12 • 91
Part-X-MLLM: Part-aware 3D Multimodal Large Language Model Paper • 2511.13647 • Published Nov 17, 2025 • 72
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks Paper • 2510.15019 • Published Oct 16, 2025 • 65
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding Paper • 2506.01853 • Published Jun 2, 2025 • 32
Video-T1: Test-Time Scaling for Video Generation Paper • 2503.18942 • Published Mar 24, 2025 • 90
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning Paper • 2503.15265 • Published Mar 19, 2025 • 46
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model Paper • 2408.16767 • Published Aug 29, 2024 • 32
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion Paper • 2406.04338 • Published Jun 6, 2024 • 39
BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion Paper • 2404.04544 • Published Apr 6, 2024 • 23
FlexiDreamer: Single Image-to-3D Generation with FlexiCubes Paper • 2404.00987 • Published Apr 1, 2024 • 23
DreamReward: Text-to-3D Generation with Human Preference Paper • 2403.14613 • Published Mar 21, 2024 • 37