Add model card and metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +64 -0
README.md ADDED
@@ -0,0 +1,64 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-to-video
+ tags:
+ - video-generation
+ - sparse-attention
+ - autoregressive
+ ---
+
+ # Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention
+
+ This repository contains the model checkpoints for **Light Forcing**, the first sparse attention solution specifically tailored for autoregressive (AR) video generation models.
+
+ [**Paper**](https://huggingface.co/papers/2602.04789) | [**GitHub Code**](https://github.com/chengtao-lv/LightForcing)
+
+ ## Introduction
+ Advanced autoregressive video generation models often suffer from the quadratic complexity of attention. Light Forcing addresses this bottleneck with two key innovations:
+ 1. **Chunk-Aware Growth:** A mechanism that quantitatively estimates the contribution of each chunk and uses that estimate to determine the chunk's sparsity allocation.
+ 2. **Hierarchical Sparse Attention:** A strategy that captures historical and local context in a coarse-to-fine manner.
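
The coarse-to-fine idea in the second item can be illustrated with a small, generic sketch. This is not the authors' implementation: the chunk score used here (query dotted with the mean key of each chunk), the function names, and the `top_k` parameter are all assumptions chosen for illustration, and the Chunk-Aware Growth estimator is not modeled at all.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def select_chunks(query, key_chunks, top_k):
    """Coarse stage (assumed scoring rule): rank chunks by query . mean(keys)."""
    scores = [dot(query, mean_vector(chunk)) for chunk in key_chunks]
    ranked = sorted(range(len(key_chunks)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:top_k])  # keep temporal order of the kept chunks

def sparse_attention(query, key_chunks, value_chunks, top_k):
    """Fine stage: ordinary softmax attention restricted to the kept chunks."""
    kept = select_chunks(query, key_chunks, top_k)
    keys = [k for i in kept for k in key_chunks[i]]
    values = [v for i in kept for v in value_chunks[i]]
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return out, kept

# Toy example: three chunks of two tokens each, 2-dimensional features.
query = [1.0, 0.0]
key_chunks = [[[1.0, 0.0], [1.0, 0.0]],
              [[-1.0, 0.0], [-1.0, 0.0]],
              [[0.5, 0.0], [0.5, 0.0]]]
value_chunks = [[[1.0, 1.0], [1.0, 1.0]],
                [[9.0, 9.0], [9.0, 9.0]],
                [[2.0, 2.0], [2.0, 2.0]]]
out, kept = sparse_attention(query, key_chunks, value_chunks, top_k=2)
```

With `top_k` equal to the number of chunks the sketch reduces to dense attention, which makes the cost trade-off explicit: fewer kept chunks means fewer query-key products in the fine stage.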
+
+ The method achieves a **1.2x–1.3x** end-to-end speedup while maintaining high visual quality. When combined with FP8 quantization and LightVAE, it can achieve up to a **3.0x** speedup on hardware such as the RTX 5090.
+
+ ## Usage
+
+ For the full environment setup, the authors recommend using the provided Docker image:
+ ```bash
+ docker pull lvchengtao/light_forcing:v1
+ ```
+
+ ### Fast Inference
+
+ After setting up the environment and downloading the necessary checkpoints (e.g., Wan2.1-T2V-1.3B), you can run inference using the scripts provided in the repository.
+
+ **For short-video generation (e.g., 5s):**
+ ```bash
+ python inference.py \
+ --config_path configs/light_forcing_short.yaml \
+ --output_folder videos/light_forcing_short \
+ --checkpoint_path path/to/short_video_gen.pt \
+ --data_path prompts/MovieGenVideoBench_extended.txt \
+ --use_ema
+ ```
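
When running many prompts or presets, it can be convenient to assemble the command in Python rather than by hand. The sketch below is a hypothetical helper, not part of the repository: `build_inference_command` is an invented name, and it only reproduces the flags that appear in the commands of this section.

```python
import shlex

def build_inference_command(config_path, output_folder, checkpoint_path,
                            data_path, use_ema=True, num_output_frames=None):
    """Return the inference.py invocation as an argument list."""
    cmd = [
        "python", "inference.py",
        "--config_path", config_path,
        "--output_folder", output_folder,
        "--checkpoint_path", checkpoint_path,
        "--data_path", data_path,
    ]
    if use_ema:
        cmd.append("--use_ema")
    if num_output_frames is not None:
        # Only emitted when set, mirroring the long-video command.
        cmd += ["--num_output_frames", str(num_output_frames)]
    return cmd

short_cmd = build_inference_command(
    config_path="configs/light_forcing_short.yaml",
    output_folder="videos/light_forcing_short",
    checkpoint_path="path/to/short_video_gen.pt",  # placeholder path, as in the README
    data_path="prompts/MovieGenVideoBench_extended.txt",
)
print(shlex.join(short_cmd))
```

The returned list can be passed to `subprocess.run(short_cmd, check=True)` from the repository root. Keeping it as a list avoids shell-quoting problems when paths contain spaces.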
+
+ **For long-video generation (e.g., 15s):**
+ ```bash
+ python inference.py \
+ --config_path configs/light_forcing_long.yaml \
+ --output_folder videos/light_forcing_long \
+ --checkpoint_path path/to/long_video_gen.pt \
+ --data_path prompts/MovieGenVideoBench_extended.txt \
+ --use_ema \
+ --num_output_frames 63
+ ```
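
The relation between `--num_output_frames 63` and a roughly 15-second clip is not spelled out here. The sketch below shows one way the numbers could line up, under three assumptions that are not confirmed by this model card: that the flag counts latent frames, that the video VAE compresses time by a factor of 4 while keeping the first frame uncompressed, and that output is played back at 16 fps.

```python
def pixel_frames(latent_frames, temporal_stride=4):
    """Decoded frame count, assuming the first latent frame maps to one frame
    and every later latent frame expands to `temporal_stride` frames."""
    return 1 + temporal_stride * (latent_frames - 1)

def duration_seconds(latent_frames, fps=16, temporal_stride=4):
    """Clip length in seconds under the assumed stride and frame rate."""
    return pixel_frames(latent_frames, temporal_stride) / fps

print(pixel_frames(63), duration_seconds(63))  # 249 15.5625
```

Under the same assumptions, 21 latent frames would decode to 81 frames, or about 5 seconds, which would be consistent with the short-video setting. Check the configuration files in the repository before relying on these values.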
+
+ ## Citation
+ If you find this work or the code useful, please cite:
+ ```bibtex
+ @article{lv2026light,
+ title={Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention},
+ author={Lv, Chengtao and Shi, Yumeng and Huang, Yushi and Gong, Ruihao and Ren, Shen and Wang, Wenya},
+ journal={arXiv preprint arXiv:2602.04789},
+ year={2026}
+ }
+ ```