long-video

classroom

authored a paper 3 months ago

SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing

Paper • 2604.04911 • Published Apr 6 • 36

authored 6 papers 9 months ago

DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder

Paper • 2509.25182 • Published Sep 29, 2025 • 39

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

Paper • 2509.24695 • Published Sep 29, 2025 • 54

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

Paper • 2211.02048 • Published Nov 3, 2022

SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer

Paper • 2501.18427 • Published Jan 30, 2025 • 27

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

Paper • 2502.01776 • Published Feb 3, 2025 • 3

LongLive: Real-time Interactive Long Video Generation

Paper • 2509.22622 • Published Sep 26, 2025 • 189

authored a paper about 1 year ago

Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation

Paper • 2506.19852 • Published Jun 24, 2025 • 43

authored 6 papers about 1 year ago

SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

Paper • 2305.17011 • Published May 26, 2023

GrootVL: Tree Topology is All You Need in State Space Model

Paper • 2406.02395 • Published Jun 4, 2024 • 1

COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing

Paper • 2406.08850 • Published Jun 13, 2024

HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding

Paper • 2503.14694 • Published Mar 12, 2025

MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO

Paper • 2505.13031 • Published May 19, 2025 • 4

HaploOmni: Unified Single Transformer for Multimodal Video Understanding and Generation

Paper • 2506.02975 • Published Jun 3, 2025

authored a paper about 1 year ago

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

Paper • 2505.18875 • Published May 24, 2025 • 42

authored a paper about 1 year ago

H$^{\mathbf{3}}$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning

Paper • 2505.07819 • Published May 12, 2025 • 5

authored 3 papers over 1 year ago

Condition-Aware Neural Network for Controlled Image Generation

Paper • 2404.01143 • Published Apr 1, 2024 • 13

Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models

Paper • 2410.10733 • Published Oct 14, 2024 • 9

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers

Paper • 2410.10629 • Published Oct 14, 2024 • 13