DragMesh-2: Physically Plausible Dexterous Hand-Object Interaction with Articulated Objects Paper • 2606.15133 • Published 19 days ago • 74
Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes Paper • 2601.02356 • Published Jan 5 • 14
SS4D: Native 4D Generative Model via Structured Spacetime Latents Paper • 2512.14284 • Published Dec 16, 2025 • 14
ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation Paper • 2512.03036 • Published Dec 2, 2025 • 22
Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity Paper • 2508.05609 • Published Aug 7, 2025 • 29
One-Minute Video Generation with Test-Time Training Paper • 2504.05298 • Published Apr 7, 2025 • 110
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8, 2025 • 186
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography Paper • 2504.07083 • Published Apr 9, 2025 • 22
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published Mar 18, 2025 • 51
Unleashing Vecset Diffusion Model for Fast Shape Generation Paper • 2503.16302 • Published Mar 20, 2025 • 43
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models Paper • 2502.06608 • Published Feb 10, 2025 • 39
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation Paper • 2502.13128 • Published Feb 18, 2025 • 41
Cosmos-Predict1 Collection • This collection is archived. See https://huggingface.co/collections/nvidia/cosmos3 • 14 items • Updated 20 days ago • 304
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning Paper • 2501.03226 • Published Jan 6, 2025 • 43
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction Paper • 2501.03218 • Published Jan 6, 2025 • 35
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations Paper • 2412.12083 • Published Dec 16, 2024 • 12
FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models Paper • 2412.07674 • Published Dec 10, 2024 • 20
Imagine360: Immersive 360 Video Generation from Perspective Anchor Paper • 2412.03552 • Published Dec 4, 2024 • 29
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images Paper • 2411.05738 • Published Nov 8, 2024 • 14