mack-williams/Light-Forcing


Add model card and metadata

#1

by nielsr HF Staff - opened about 1 month ago

base: refs/heads/main ← from: refs/pr/1


README.md ADDED +64 -0

	@@ -0,0 +1,64 @@

+---
+license: apache-2.0
+pipeline_tag: text-to-video
+tags:
+- video-generation
+- sparse-attention
+- autoregressive
+---
+# Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention
+This repository contains the model checkpoints for **Light Forcing**, the first sparse attention solution specifically tailored for autoregressive (AR) video generation models.
+[**Paper**](https://huggingface.co/papers/2602.04789) | [**GitHub Code**](https://github.com/chengtao-lv/LightForcing)
+## Introduction
+Advanced autoregressive video generation models often suffer from the quadratic complexity of attention. Light Forcing addresses this bottleneck with two key innovations:
+1. **Chunk-Aware Growth:** A mechanism that quantitatively estimates each chunk's contribution and uses that estimate to set the chunk's sparsity allocation.
+2. **Hierarchical Sparse Attention:** A strategy to capture historical and local context in a coarse-to-fine manner.
+The method achieves a **1.2x–1.3x** end-to-end speedup while maintaining high visual quality. Combined with FP8 quantization and LightVAE, it reaches up to a **3.0x** speedup on an RTX 5090.
+## Usage
+For the full environment setup, the authors recommend using the provided Docker image:
+```bash
+docker pull lvchengtao/light_forcing:v1
+```
+### Fast Inference
+After setting up the environment and downloading the necessary checkpoints (e.g., Wan2.1-T2V-1.3B), you can run inference using the scripts provided in the repository.
+**For short-video generation (e.g., 5s):**
+```bash
+python inference.py \
+  --config_path configs/light_forcing_short.yaml \
+  --output_folder videos/light_forcing_short \
+  --checkpoint_path path/to/short_video_gen.pt \
+  --data_path prompts/MovieGenVideoBench_extended.txt \
+  --use_ema
+```
+**For long-video generation (e.g., 15s):**
+```bash
+python inference.py \
+  --config_path configs/light_forcing_long.yaml \
+  --output_folder videos/light_forcing_long \
+  --checkpoint_path path/to/long_video_gen.pt \
+  --data_path prompts/MovieGenVideoBench_extended.txt \
+  --use_ema \
+  --num_output_frames 63
+```
+## Citation
+If you find this work or the code useful, please cite:
+```bibtex
+@article{lv2026light,
+  title={Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention},
+  author={Lv, Chengtao and Shi, Yumeng and Huang, Yushi and Gong, Ruihao and Ren, Shen and Wang, Wenya},
+  journal={arXiv preprint arXiv:2602.04789},
+  year={2026}
+}
+```
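
A note on the long-video command in the README: it passes `--num_output_frames 63` for a roughly 15 s clip, which only makes sense if the flag counts latent frames rather than output video frames. The sketch below shows the arithmetic under two assumptions that the README does not state: a 4x temporal VAE compression with the first frame kept uncompressed, and 16 fps playback. The helper name `latent_to_seconds` is hypothetical and is not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert a latent-frame count into video frames and seconds.
# ASSUMPTIONS (not stated in the README): 4x temporal compression in the VAE,
# with the first frame uncompressed, and 16 fps output.
latent_to_seconds() {
  local latent_frames=$1
  # Each latent frame after the first expands to 4 video frames.
  local pixel_frames=$(( (latent_frames - 1) * 4 + 1 ))
  # awk handles the fractional division that shell integer arithmetic cannot.
  awk -v f="$pixel_frames" 'BEGIN { printf "%d frames, %.1f s\n", f, f / 16 }'
}

latent_to_seconds 21   # 81 frames, 5.1 s
latent_to_seconds 63   # 249 frames, 15.6 s
```

Under these assumptions, 21 latent frames give about 5 s and 63 give about 15 s, matching the durations quoted for the short and long commands. If the repository's VAE or frame rate differs, adjust the constants accordingly.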