---
license: apache-2.0
base_model:
- Wan-AI/Wan2.2-T2V-A14B
library_name: diffusers
tags:
- video_generation
- NVFP4
- Sparse_Attention
- Wan
---
# 🎬 Wan2.2-NVFP4-Sparse

> **An extremely efficient Wan 2.2 14B variant: NVFP4 Quantization-Aware Step Distillation with Sparse Attention for Blackwell Architecture**

[LightX2V on GitHub](https://github.com/ModelTC/LightX2V)
[LightX2V on Hugging Face](https://huggingface.co/lightx2v/)

## 📑 Table of Contents

- [✨ Features](#-features)
- [🚀 Quick Start](#-quick-start)
- [🎬 Generation Results](#-generation-results)
- [⚡ Performance Comparison](#-performance-comparison)
- [⚠️ Notes](#️-notes)
- [🤝 Community](#-community)
## ✨ Features

- **⚡ 4-Step Inference**: Two high-noise expert steps followed by two low-noise expert steps, enabling extremely fast Wan2.2 MoE generation on a single Blackwell GPU.
- **🎯 NVFP4 Quantization**: Quantization-aware step distillation reduces memory traffic and compute cost while targeting Blackwell architecture.
- **🧩 Sparse Attention**: Accelerates the costly O(n²) self-attention workload with sparse attention, reducing end-to-end latency for high-resolution video generation.
- **🔧 LightX2V Integration**: Recommended runtime stack for stable deployment and best performance.
- **🌟 High-Quality Generation**: Preserves the visual quality of Wan2.2-T2V-14B while dramatically improving inference speed.
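
The 4-step schedule described above can be pictured as a simple step-to-expert mapping. The sketch below is illustrative only: `expert_schedule` and the expert labels are hypothetical names, not part of the LightX2V API, and the actual timestep boundaries are defined by the provided inference script and its config.

```python
from typing import List


def expert_schedule(num_steps: int = 4, high_noise_steps: int = 2) -> List[str]:
    """Return which expert handles each denoising step, in order.

    Hypothetical helper: the first `high_noise_steps` steps go to the
    high-noise expert, the remaining steps go to the low-noise expert.
    """
    if not 0 <= high_noise_steps <= num_steps:
        raise ValueError("high_noise_steps must be between 0 and num_steps")
    low_noise_steps = num_steps - high_noise_steps
    return ["high_noise"] * high_noise_steps + ["low_noise"] * low_noise_steps


# Print the default 2 + 2 schedule described in the feature list.
for step, expert in enumerate(expert_schedule(), start=1):
    print(f"step {step}: {expert} expert")
```

Only one expert runs at any given step, which is why the two experts can share a single GPU.
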
## 🚀 Quick Start

We strongly recommend using the official LightX2V Docker image for the cleanest environment and best reproducibility.

### Option A: Docker (Recommended)

```bash
# 1. Pull the LightX2V Docker image
docker pull lightx2v/lightx2v:26052801-cu130-5090
# 2. Run inference inside the container, from the LightX2V repository root
bash scripts/wan22/distill/run_wan22_moe_t2v_extreme.sh
```
### Option B: Manual Installation

If Docker is not available, install the environment manually:

```bash
# 1. Install the build tools (uv is used by the commands below), then LightX2V
pip install scikit_build_core uv
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v .
# 2. Build and install the NVFP4 kernel
git clone https://github.com/NVIDIA/cutlass.git
cd lightx2v_kernel
# Replace /path/to/cutlass with the absolute path of the CUTLASS checkout cloned above
MAX_JOBS=$(nproc) CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) \
uv build --wheel \
-Cbuild-dir=build . \
-Ccmake.define.CUTLASS_PATH=/path/to/cutlass \
--verbose --color=always --no-build-isolation
pip install dist/*whl --force-reinstall --no-deps
# 3. Return to the LightX2V repository root and run inference
cd ..
bash scripts/wan22/distill/run_wan22_moe_t2v_extreme.sh
```

Script: [run_wan22_moe_t2v_extreme.sh](https://github.com/ModelTC/LightX2V/blob/main/scripts/wan22/distill/run_wan22_moe_t2v_extreme.sh)
## 🎬 Generation Results

<div style="background: #f8fafc; border: 1px solid #e2e8f0; border-radius: 8px; padding: 16px; margin: 16px 0;">
<p style="font-style: italic; color: #475569; margin: 0; padding: 12px; background: white; border-radius: 6px; border-left: 4px solid #3b82f6;">
"Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage"
</p>
</div>

| Resolution | Wan2.2-T2V-14B | Wan2.2-NVFP4-Sparse |
| --- | --- | --- |
| 480p | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/658e760cccbc1e2cc78b4258/WTHhrzx7XR4S1Ys_6Kzx4.mp4"></video> | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/658e760cccbc1e2cc78b4258/zorpw7gm9At0J2kCmvkDr.mp4"></video> |
| 720p | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/658e760cccbc1e2cc78b4258/vkiyKj7CJA-r0yTz7TEum.mp4"></video> | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/658e760cccbc1e2cc78b4258/TuECbzvW5jI9NHG6GLvIR.mp4"></video> |
## ⚡ Performance Comparison

**Test Environment**: RTX 5090 Single GPU | LightX2V Framework | End-to-End Latency

| Resolution | Wan2.2-T2V-14B | Wan2.2-NVFP4-Sparse | Speedup |
| --- | ---: | ---: | ---: |
| 480p | 734s | 14.15s | 51.9x |
| 720p | 2668s | 45s | 59.3x |
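
The speedup column is the baseline latency divided by the optimized latency. A minimal sketch that recomputes it from the numbers in the table above (the dictionary layout and the `speedup` helper are illustrative names only):

```python
# End-to-end latencies in seconds, copied from the table above.
latencies = {
    "480p": {"baseline": 734.0, "nvfp4_sparse": 14.15},
    "720p": {"baseline": 2668.0, "nvfp4_sparse": 45.0},
}


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Speedup factor, rounded to one decimal place like the table."""
    return round(baseline_s / optimized_s, 1)


for resolution, t in latencies.items():
    print(f"{resolution}: {speedup(t['baseline'], t['nvfp4_sparse'])}x")
```

Running it reproduces the 51.9x and 59.3x figures reported in the table.
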
## ⚠️ Notes

### System Requirements

- **Required Hardware**: NVIDIA RTX 50-series GPUs or other Blackwell architecture GPUs.
- **Recommended Runtime**: `lightx2v/lightx2v:26052801-cu130-5090`.
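
Before installing anything, it can help to confirm that the GPU is a Blackwell part. The sketch below is one possible check, not part of LightX2V. It assumes that Blackwell GPUs report compute capability 10.x (data center) or 12.x (RTX 50 series), and that the installed driver's `nvidia-smi` supports the `compute_cap` query field. `is_blackwell` and `query_compute_caps` are hypothetical helper names.

```python
import subprocess
from typing import List

# Assumption: Blackwell GPUs report compute capability 10.x or 12.x.
BLACKWELL_MAJORS = {10, 12}


def is_blackwell(compute_cap: str) -> bool:
    """Classify a compute capability string such as '12.0'."""
    try:
        major = int(compute_cap.strip().split(".")[0])
    except ValueError:
        return False  # unparseable output is treated as "not Blackwell"
    return major in BLACKWELL_MAJORS


def query_compute_caps() -> List[str]:
    """Ask nvidia-smi for each GPU's compute capability; [] if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return []
    return [line.strip() for line in out.splitlines() if line.strip()]


caps = query_compute_caps()
if not caps:
    print("No NVIDIA GPU detected (or nvidia-smi is unavailable).")
for cap in caps:
    print(cap, "-> Blackwell" if is_blackwell(cap) else "-> not Blackwell")
```
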
### Dependencies

- Prepare Wan2.2 T5 / VAE components following the standard LightX2V Wan2.2 model structure.
- Use Blackwell + NVFP4 kernels for optimal speed and memory efficiency.
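
To get a feel for the memory side of NVFP4, the following back-of-the-envelope sketch compares weight storage against BF16. Everything in it is an assumption for illustration, not taken from this model card: NVFP4 is modeled as 4-bit values plus one 8-bit scale per 16-value block (the per-tensor scale is ignored), each expert is taken to hold roughly 14 billion parameters, and every weight is treated as quantized, which real checkpoints usually are not.

```python
def nvfp4_bits_per_weight(block_size: int = 16, scale_bits: int = 8) -> float:
    """Effective bits per weight: 4-bit value plus its share of the block scale."""
    return 4 + scale_bits / block_size


def weight_gigabytes(num_params: float, bits_per_weight: float) -> float:
    """Storage needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS_PER_EXPERT = 14e9  # assumption: approximate size of one expert

bf16_gb = weight_gigabytes(PARAMS_PER_EXPERT, 16)
nvfp4_gb = weight_gigabytes(PARAMS_PER_EXPERT, nvfp4_bits_per_weight())
print(f"BF16 : {bf16_gb:.2f} GB per expert")
print(f"NVFP4: {nvfp4_gb:.2f} GB per expert ({bf16_gb / nvfp4_gb:.2f}x smaller)")
```

Under these assumptions the weights shrink by roughly 3.5x; activations, the text encoder, and the VAE are not included in the estimate.
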
### Performance Tips

- Use the provided extreme inference script for the 4-step high-noise / low-noise expert schedule.
- Sparse attention is most beneficial at higher resolutions where self-attention dominates latency.
- Enable CPU offload only when GPU memory is limited, since offload can reduce throughput.
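
The resolution tip follows from how the token count grows. The sketch below estimates the number of attention tokens at two resolutions and compares linear cost with quadratic attention cost. The settings are assumptions chosen for illustration and are not stated in this card: 81 frames, 832×480 and 1280×720 frame sizes, a VAE with 4x temporal and 8x spatial compression, and 2×2 spatial patching.

```python
def num_tokens(width: int, height: int, frames: int,
               vae_t: int = 4, vae_s: int = 8, patch: int = 2) -> int:
    """Estimate the transformer sequence length for one video (assumed settings)."""
    latent_frames = (frames - 1) // vae_t + 1
    latent_h = height // vae_s
    latent_w = width // vae_s
    return latent_frames * (latent_h // patch) * (latent_w // patch)


n_480p = num_tokens(832, 480, 81)    # assumed 480p setting
n_720p = num_tokens(1280, 720, 81)   # assumed 720p setting

linear_ratio = n_720p / n_480p               # cost of per-token layers
attention_ratio = (n_720p / n_480p) ** 2     # cost of dense O(n^2) attention

print(f"tokens: {n_480p} (480p) vs {n_720p} (720p)")
print(f"per-token layers scale by {linear_ratio:.2f}x")
print(f"dense self-attention scales by {attention_ratio:.2f}x")
```

Because dense attention grows with the square of the token count, its share of total latency rises with resolution, which is where skipping attention blocks pays off most.
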
## 🤝 Community

- **🐛 Issues**: [GitHub Issues](https://github.com/ModelTC/LightX2V/issues)
- **🤗 Models**: [HuggingFace Hub](https://huggingface.co/lightx2v/)
- **📖 Documentation**: [LightX2V Docs](https://github.com/ModelTC/LightX2V)

---

<div align="center">

**If you find this project helpful, please give us a ⭐ on [GitHub](https://github.com/ModelTC/LightX2V)**

For questions or issues, please open an issue on [LightX2V](https://github.com/ModelTC/LightX2V/issues) or contact lvchengtao0319@gmail.com.

</div>