---
license: apache-2.0
base_model:
- Wan-AI/Wan2.2-T2V-A14B
library_name: diffusers
tags:
- video_generation
- NVFP4
- Sparse_Attention
- Wan
---
# 🎬 Wan2.2-NVFP4-Sparse
> **An extremely efficient Wan2.2 14B variant: NVFP4 Quantization-Aware Step Distillation with Sparse Attention for the Blackwell Architecture**
[GitHub: ModelTC/LightX2V](https://github.com/ModelTC/LightX2V) | [Hugging Face: lightx2v](https://huggingface.co/lightx2v/)
## 📑 Table of Contents
- [✨ Features](#-features)
- [🚀 Quick Start](#-quick-start)
- [🎬 Generation Results](#-generation-results)
- [⚡ Performance Comparison](#-performance-comparison)
- [⚠️ Notes](#️-notes)
- [🤝 Community](#-community)
## ✨ Features
- **⚡ 4-Step Inference**: Two high-noise expert steps followed by two low-noise expert steps, enabling extremely fast Wan2.2 MoE generation on a single Blackwell GPU.
- **🎯 NVFP4 Quantization**: Quantization-aware step distillation reduces memory traffic and compute cost while targeting the Blackwell architecture.
- **🧩 Sparse Attention**: Replaces the costly O(n²) dense self-attention with sparse attention, reducing end-to-end latency for high-resolution video generation.
- **🔧 LightX2V Integration**: LightX2V is the recommended runtime stack for stable deployment and best performance.
- **🌟 High-Quality Generation**: Preserves the visual quality of Wan2.2-T2V-A14B while dramatically improving inference speed.
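The 4-step schedule described above (two high-noise expert steps, then two low-noise expert steps) can be sketched as a simple step-to-expert mapping. This is only an illustration: the constant and function names are hypothetical and are not part of the LightX2V API.

```python
# Illustrative sketch only: these names are hypothetical, not LightX2V's API.
HIGH_NOISE_STEPS = 2  # steps handled by the high-noise expert
LOW_NOISE_STEPS = 2   # steps handled by the low-noise expert


def expert_for_step(step_index: int) -> str:
    """Return which expert handles a given denoising step (0-based)."""
    total_steps = HIGH_NOISE_STEPS + LOW_NOISE_STEPS
    if not 0 <= step_index < total_steps:
        raise ValueError(f"step_index must be in [0, {total_steps})")
    return "high_noise" if step_index < HIGH_NOISE_STEPS else "low_noise"


# Build the full 4-step schedule in execution order.
schedule = [expert_for_step(i) for i in range(HIGH_NOISE_STEPS + LOW_NOISE_STEPS)]
print(schedule)
```

The actual schedule is configured by the inference script referenced in the Quick Start section; the sketch only shows the ordering of the two experts.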
## 🚀 Quick Start
We strongly recommend using the official LightX2V Docker image for the cleanest environment and best reproducibility.
### Option A: Docker (Recommended)
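```bash
# 1. Pull the LightX2V Docker image
docker pull lightx2v/lightx2v:26052801-cu130-5090
# 2. Run inference inside a container started from that image.
#    The script path is relative to the root of the LightX2V repository.
bash scripts/wan22/distill/run_wan22_moe_t2v_extreme.sh
```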
### Option B: Manual Installation
If Docker is not available, install the environment manually:
```bash
# 1. Install LightX2V (install uv first, since the install command uses it)
pip install scikit_build_core uv
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v .
# 2. Build and install the NVFP4 kernel
git clone https://github.com/NVIDIA/cutlass.git
cd lightx2v_kernel
# Replace /path/to/cutlass with the absolute path of the cutlass checkout cloned above
MAX_JOBS=$(nproc) CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) \
uv build --wheel \
    -Cbuild-dir=build . \
    -Ccmake.define.CUTLASS_PATH=/path/to/cutlass \
    --verbose --color=always --no-build-isolation
pip install dist/*whl --force-reinstall --no-deps
# 3. Run inference from the repository root
cd ..
bash scripts/wan22/distill/run_wan22_moe_t2v_extreme.sh
```
Script: [run_wan22_moe_t2v_extreme.sh](https://github.com/ModelTC/LightX2V/blob/main/scripts/wan22/distill/run_wan22_moe_t2v_extreme.sh)
## 🎬 Generation Results
<div style="background: #f8fafc; border: 1px solid #e2e8f0; border-radius: 8px; padding: 16px; margin: 16px 0;">
<p style="font-style: italic; color: #475569; margin: 0; padding: 12px; background: white; border-radius: 6px; border-left: 4px solid #3b82f6;">
"Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage"
</p>
</div>
| Resolution | Wan2.2-T2V-A14B | Wan2.2-NVFP4-Sparse |
| --- | --- | --- |
| 480p | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/658e760cccbc1e2cc78b4258/WTHhrzx7XR4S1Ys_6Kzx4.mp4"></video> | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/658e760cccbc1e2cc78b4258/zorpw7gm9At0J2kCmvkDr.mp4"></video> |
| 720p | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/658e760cccbc1e2cc78b4258/vkiyKj7CJA-r0yTz7TEum.mp4"></video> | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/658e760cccbc1e2cc78b4258/TuECbzvW5jI9NHG6GLvIR.mp4"></video> |
## ⚡ Performance Comparison
**Test Environment**: RTX 5090 Single GPU | LightX2V Framework | End-to-End Latency
| Resolution | Wan2.2-T2V-A14B | Wan2.2-NVFP4-Sparse | Speedup |
| --- | ---: | ---: | ---: |
| 480p | 734s | 14.15s | 51.9x |
| 720p | 2668s | 45s | 59.3x |
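The speedup column is the baseline latency divided by the optimized latency. A minimal sketch that reproduces it from the latencies in the table above (the variable names are illustrative):

```python
# End-to-end latencies in seconds, copied from the table above:
# resolution -> (baseline latency, Wan2.2-NVFP4-Sparse latency)
latencies_s = {
    "480p": (734.0, 14.15),
    "720p": (2668.0, 45.0),
}


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Speedup factor: how many times faster the optimized run is."""
    return baseline_s / optimized_s


# Round to one decimal place, matching the table's precision.
speedups = {res: round(speedup(base, opt), 1) for res, (base, opt) in latencies_s.items()}
for res, factor in speedups.items():
    print(f"{res}: {factor}x")
```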
## ⚠️ Notes
### System Requirements
- **Required Hardware**: NVIDIA RTX 50-series GPUs or other Blackwell architecture GPUs.
- **Recommended Runtime**: `lightx2v/lightx2v:26052801-cu130-5090`.
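Before installing, it can help to confirm that the local GPU is a Blackwell part. The sketch below is one possible check, not an official LightX2V tool. It assumes that Blackwell GPUs report CUDA compute capability 10.x (data center) or 12.x (RTX 50-series), and that the installed driver's `nvidia-smi` supports the `compute_cap` query field; both are assumptions to verify against NVIDIA's documentation.

```python
import shutil
import subprocess

# Assumption: Blackwell GPUs report compute capability 10.x or 12.x.
BLACKWELL_MAJORS = {10, 12}


def is_blackwell(compute_cap: str) -> bool:
    """Return True if a compute capability string such as '12.0' looks like Blackwell."""
    try:
        major = int(compute_cap.strip().split(".")[0])
    except ValueError:
        return False  # unparsable value, e.g. '[N/A]'
    return major in BLACKWELL_MAJORS


def detected_compute_caps() -> list[str]:
    """Query nvidia-smi for compute capabilities; empty list if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    for cap in detected_compute_caps():
        print(f"compute capability {cap}: Blackwell = {is_blackwell(cap)}")
```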
### Dependencies
- Prepare Wan2.2 T5 / VAE components following the standard LightX2V Wan2.2 model structure.
- Use the NVFP4 kernels on a Blackwell GPU for optimal speed and memory efficiency.
### Performance Tips
- Use the provided extreme inference script for the 4-step high-noise / low-noise expert schedule.
- Sparse attention is most beneficial at higher resolutions where self-attention dominates latency.
- Enable CPU offload only when GPU memory is limited, since offload can reduce throughput.
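To see why dense self-attention grows so quickly with resolution, the following rough estimate compares token counts at 480p and 720p. Every constant here is an assumption made for illustration and is not stated in this model card: 81 frames, 832x480 and 1280x720 frame sizes, a VAE with 4x temporal and 8x spatial compression, and 2x2 spatial patches.

```python
# All constants are assumptions for illustration, not values from this model card.
FRAMES = 81           # assumed number of video frames
TEMPORAL_STRIDE = 4   # assumed VAE temporal compression
SPATIAL_STRIDE = 8    # assumed VAE spatial compression
PATCH = 2             # assumed spatial patch size of the transformer


def token_count(width: int, height: int, frames: int = FRAMES) -> int:
    """Estimate the number of attention tokens for one video."""
    latent_frames = (frames - 1) // TEMPORAL_STRIDE + 1
    cell = SPATIAL_STRIDE * PATCH  # pixels covered by one token along each axis
    return latent_frames * (height // cell) * (width // cell)


tokens_480p = token_count(832, 480)
tokens_720p = token_count(1280, 720)
token_ratio = tokens_720p / tokens_480p
pair_ratio = token_ratio ** 2  # dense attention cost scales with n squared

print(f"480p tokens: {tokens_480p}, 720p tokens: {tokens_720p}")
print(f"token ratio: {token_ratio:.2f}x, dense attention pair ratio: {pair_ratio:.2f}x")
```

Under these assumptions, 720p has roughly 2.3 times as many tokens as 480p, but dense attention has to score roughly 5.3 times as many token pairs, which is consistent with sparse attention helping most at higher resolutions.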
## 🤝 Community
- **🐛 Issues**: [GitHub Issues](https://github.com/ModelTC/LightX2V/issues)
- **🤗 Models**: [HuggingFace Hub](https://huggingface.co/lightx2v/)
- **📖 Documentation**: [LightX2V Docs](https://github.com/ModelTC/LightX2V)
---
<div align="center">
**If you find this project helpful, please give us a ⭐ on [GitHub](https://github.com/ModelTC/LightX2V)**
For questions or issues, please open an issue on [LightX2V](https://github.com/ModelTC/LightX2V/issues) or contact lvchengtao0319@gmail.com.
</div>