Relit-LiVE: Relight Video by Jointly Learning Environment Video • Paper • 2605.06658 • Published 5 days ago • 14
UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors • Paper • 2605.00658 • Published 11 days ago • 80
Repurposing Geometric Foundation Models for Multi-view Diffusion • Paper • 2603.22275 • Published Mar 23 • 47
Beyond Language Modeling: An Exploration of Multimodal Pretraining • Paper • 2603.03276 • Published Mar 3 • 104
DREAM: Where Visual Understanding Meets Text-to-Image Generation • Paper • 2603.02667 • Published Mar 3 • 6
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? • Paper • 2603.03241 • Published Mar 3 • 87
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking • Paper • 2601.04720 • Published Jan 8 • 58
UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement • Paper • 2512.21185 • Published Dec 24, 2025 • 32
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation • Paper • 2512.23705 • Published Dec 29, 2025 • 45
Light-X: Generative 4D Video Rendering with Camera and Illumination Control • Paper • 2512.05115 • Published Dec 4, 2025 • 11
NURBGen: High-Fidelity Text-to-CAD Generation through LLM-Driven NURBS Modeling • Paper • 2511.06194 • Published Nov 9, 2025 • 13
Uniform Discrete Diffusion with Metric Path for Video Generation • Paper • 2510.24717 • Published Oct 28, 2025 • 44
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation • Paper • 2509.19296 • Published Sep 23, 2025 • 32
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling • Paper • 2509.12201 • Published Sep 15, 2025 • 107
Light of Normals: Unified Feature Representation for Universal Photometric Stereo • Paper • 2506.18882 • Published Jun 23, 2025 • 89