DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder Paper • 2509.25182 • Published Sep 29, 2025 • 38
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer Paper • 2509.24695 • Published Sep 29, 2025 • 45
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models Paper • 2211.02048 • Published Nov 3, 2022
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer Paper • 2501.18427 • Published Jan 30, 2025 • 24
Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity Paper • 2502.01776 • Published Feb 3, 2025 • 3
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 187
Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation Paper • 2506.19852 • Published Jun 24, 2025 • 42
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation Paper • 2305.17011 • Published May 26, 2023
GrootVL: Tree Topology is All You Need in State Space Model Paper • 2406.02395 • Published Jun 4, 2024 • 1
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing Paper • 2406.08850 • Published Jun 13, 2024
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding Paper • 2503.14694 • Published Mar 12, 2025
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO Paper • 2505.13031 • Published May 19, 2025 • 4
HaploOmni: Unified Single Transformer for Multimodal Video Understanding and Generation Paper • 2506.02975 • Published Jun 3, 2025
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation Paper • 2505.18875 • Published May 24, 2025 • 42
H$^{\mathbf{3}}$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning Paper • 2505.07819 • Published May 12, 2025 • 5
Condition-Aware Neural Network for Controlled Image Generation Paper • 2404.01143 • Published Apr 1, 2024 • 13
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models Paper • 2410.10733 • Published Oct 14, 2024 • 9
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers Paper • 2410.10629 • Published Oct 14, 2024 • 12
Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation Paper • 2205.01271 • Published May 3, 2022
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models Paper • 2411.05007 • Published Nov 7, 2024 • 24