paper maybe useful
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
Paper
• 2502.08590
• Published
• 42
Distillation Scaling Laws
Paper
• 2502.08606
• Published
• 47
Soundwave: Less is More for Speech-Text Alignment in LLMs
Paper
• 2502.12900
• Published
• 86
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space
Paper
• 2503.09419
• Published
• 6
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Paper
• 2503.11647
• Published
• 146
Can Vision-Language Models Answer Face to Face Questions in the Real-World?
Paper
• 2503.19356
• Published
• 2
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals
Paper
• 2503.19953
• Published
• 3
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning
Paper
• 2503.10480
• Published
• 56
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Paper
• 2503.05638
• Published
• 20
Video-R1: Reinforcing Video Reasoning in MLLMs
Paper
• 2503.21776
• Published
• 79
Segment Any Motion in Videos
Paper
• 2503.22268
• Published
• 19
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Paper
• 2504.17789
• Published
• 23
Reinforcement Pre-Training
Paper
• 2506.08007
• Published
• 263
Dreamland: Controllable World Creation with Simulator and Generative Models
Paper
• 2506.08006
• Published
• 7
Seeing Voices: Generating A-Roll Video from Audio with Mirage
Paper
• 2506.08279
• Published
• 27
PlayerOne: Egocentric World Simulator
Paper
• 2506.09995
• Published
• 34
Video models are zero-shot learners and reasoners
Paper
• 2509.20328
• Published
• 100