CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance Paper • 2503.10391 • Published Mar 13, 2025 • 12
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation Paper • 2503.14428 • Published Mar 18, 2025 • 8
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation Paper • 2505.20292 • Published May 26, 2025 • 52
Focal Guidance: Unlocking Controllability from Semantic-Weak Layers in Video Diffusion Models Paper • 2601.07287 • Published 2 days ago
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head Paper • 2601.07832 • Published 2 days ago • 37
VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model Paper • 2501.12327 • Published Jan 21, 2025
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning Paper • 2504.02949 • Published Apr 3, 2025 • 21
MAGREF: Masked Guidance for Any-Reference Video Generation Paper • 2505.23742 • Published May 29, 2025 • 11