VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction Paper • 2602.13294 • Published Feb 9 • 13
VideoMaMa: Mask-Guided Video Matting via Generative Prior Paper • 2601.14255 • Published Jan 20 • 15
MMAudio: generating synchronized audio from video/text • Running on Zero • 934
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation Paper • 2512.17012 • Published Dec 18, 2025 • 47
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 168