Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning • Paper • 2601.16163 • Published 4 days ago • 13
FrankenMotion: Part-level Human Motion Generation and Composition • Paper • 2601.10909 • Published 11 days ago • 18
Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text • Paper • 2601.10355 • Published 11 days ago • 38
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding • Paper • 2601.10611 • Published 11 days ago • 26
DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset • Paper • 2601.10305 • Published 11 days ago • 36
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization • Paper • 2601.05242 • Published 18 days ago • 208
NitroGen: An Open Foundation Model for Generalist Gaming Agents • Paper • 2601.02427 • Published 22 days ago • 42
LTX-2: Efficient Joint Audio-Visual Foundation Model • Paper • 2601.03233 • Published 20 days ago • 134
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos • Paper • 2601.00393 • Published 25 days ago • 128
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation • Paper • 2512.24551 • Published 27 days ago • 19
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times • Paper • 2512.16093 • Published Dec 18, 2025 • 94
DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation • Paper • 2512.21252 • Published Dec 24, 2025 • 35