metadata
license: apache-2.0
tags:
- diffusion-single-file
- comfyui
- distillation
- NVFP4
- video
- video generation
base_model:
- Wan-AI/Wan2.1-I2V-14B-480P
- Wan-AI/Wan2.1-T2V-1.3B
pipeline_tags:
- image-to-video
- text-to-video
library_name: diffusers
🎬 Wan-NVFP4-4Steps Models
NVFP4 Quantization-Aware Step Distillation for Blackwell Architecture
📋 Table of Contents
- ✨ Features
- 🚀 Quick Start
- 🎬 Generation Results
- ⚡ Performance Comparison
- 📦 Installation
- 🛠️ Usage
- 🧭 Project Structure
- ⚠️ Notes
- 🤝 Community
✨ Features
- ⚡ 4-Step Inference: Step distillation cuts sampling to 4 denoising steps, bringing end-to-end generation close to real time (tested on a single RTX 5090)
- 🎯 NVFP4 Quantization: 4-bit floating-point (NVFP4) quantization reduces memory and bandwidth usage and is optimized for the Blackwell architecture
- 🔧 LightX2V Integration: Runs on the official LightX2V framework, which gives the best performance and stability for these models
- 🚀 High-Quality Generation: Preserves the video quality of Wan2.1 at 4-step speed
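To make the NVFP4 bullet more concrete, here is a minimal pure-Python sketch of block-scaled FP4 "fake quantization". It assumes the commonly described NVFP4 layout (E2M1 values, one shared scale per block of 16 elements). It is an illustration only, not the LightX2V kernel: the function names are hypothetical, and the block scale is kept in full precision instead of FP8.

```python
import math

# Magnitudes representable by FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit).
E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumption: one scale is shared by 16 consecutive elements


def quantize_block(block):
    """Fake-quantize one block: return (scale, dequantized values)."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Map the largest magnitude in the block onto 6.0, the largest E2M1 value.
    # Simplification: the scale stays in full precision here (no FP8 rounding).
    scale = amax / 6.0
    out = []
    for x in block:
        # Snap the scaled magnitude to the nearest representable E2M1 value.
        nearest = min(E2M1_VALUES, key=lambda v: abs(abs(x) / scale - v))
        out.append(math.copysign(nearest * scale, x))
    return scale, out


def fake_quantize(values):
    """Quantize a flat list block by block and return the dequantized list."""
    result = []
    for start in range(0, len(values), BLOCK_SIZE):
        _, dequantized = quantize_block(values[start:start + BLOCK_SIZE])
        result.extend(dequantized)
    return result


weights = [0.01 * i - 0.15 for i in range(32)]  # toy data, two blocks
restored = fake_quantize(weights)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"max abs round-trip error: {max_error:.4f}")
```

Because the scale is chosen per block, the rounding error is bounded by the block's own value range rather than by the range of the whole tensor, which is the main reason a 4-bit format can stay usable.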
🚀 Quick Start
# 1. Install LightX2V (uv is installed first because the commands below use it)
pip install scikit_build_core uv
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v .
# 2. Build and install the NVFP4 kernel
git clone https://github.com/NVIDIA/cutlass.git
cd lightx2v_kernel
MAX_JOBS=$(nproc) CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) \
uv build --wheel \
-Cbuild-dir=build . \
-Ccmake.define.CUTLASS_PATH=/path/to/cutlass \
--verbose --color=always --no-build-isolation
pip install dist/*whl --force-reinstall --no-deps
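# 3. Run inference (the previous step leaves you in lightx2v_kernel; this assumes examples/ sits at the LightX2V repository root)
cd ../examples/wan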
python wan_i2v_nvfp4.py # Image-to-Video
python wan_t2v_nvfp4.py # Text-to-Video
🎬 Generation Results
"A cinematic, hyper-realistic 3D animation, in the somber and beautiful style of Sekiro: Shadows Die Twice. In a vast field of silvery-white pampas grass, under a luminous full moon, the shinobi Wolf stands ready for a final duel..."
| Input Image | Wan2.1-I2V-14B-480P | wan2.1_i2v_480p_nvfp4_lightx2v_4step |
|---|---|---|
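"High contrast, high saturation, short-side composition, sunset, medium focal length, soft light, backlight, warm tones, rim light, medium close-up, daylight, sunny-day light, a close-up of a foreign white woman wearing a yellow plaid dress and earrings. As the low-angle shot rises, the woman lifts her head, tears in her eyes, looking ahead and speaking..."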
| Wan2.1-T2V-1.3B | wan2.1_t2v_1_3b_nvfp4_lightx2v_4step |
|---|---|
⚡ Performance Comparison
Test Environment: RTX 5090 Single GPU | LightX2V Framework
📸 Image-to-Video (I2V-14B-480P)
🎬 Text-to-Video (T2V-1.3B-480P)
⚠️ Notes
System Requirements
- Required Hardware: NVIDIA RTX 50-series GPUs (RTX 5090/5080/5070/5060) or other Blackwell architecture GPUs
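A quick way to check the hardware requirement before building the kernel is to query the GPU's CUDA compute capability. The sketch below assumes that Blackwell parts report a major version of 10 or higher (for example 12.0 on RTX 50-series cards). The helper names are hypothetical, and whether the LightX2V kernel accepts every Blackwell variant is not stated here, so treat a positive result as a necessary condition only.

```python
def is_blackwell_or_newer(capability):
    """Return True if a CUDA compute capability (major, minor) is 10.0 or higher."""
    major, _minor = capability
    return major >= 10  # assumption: Blackwell starts at major version 10


def detect_capability():
    """Return the compute capability of GPU 0, or None if it cannot be queried."""
    try:
        import torch  # optional dependency, only needed for the live query
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


capability = detect_capability()
if capability is None:
    print("No CUDA device visible (or PyTorch is not installed).")
elif is_blackwell_or_newer(capability):
    print(f"Compute capability {capability}: Blackwell-class GPU detected.")
else:
    print(f"Compute capability {capability}: NVFP4 kernels are not expected to run here.")
```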
Dependencies
- Supply the T5, CLIP, and VAE components yourself; they follow the same structure as in Wan2.x
Performance Tips
- Use Blackwell + NVFP4 for best performance
- Enable CPU offload for GPUs with limited memory
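To judge whether CPU offload is likely to be needed, a rough weight-memory estimate can help. The sketch below is back-of-the-envelope arithmetic under stated assumptions: parameter counts are the nominal ones from the model names, every weight is assumed to be stored in NVFP4 (4-bit values plus one 8-bit scale per 16 elements), and activations, the text encoder, CLIP, and the VAE are not counted. The `headroom` fraction is a hypothetical safety margin, not a LightX2V setting.

```python
GIB = 1024 ** 3


def weight_gib(n_params, bits_per_weight):
    """Approximate weight storage in GiB for a given bit width."""
    return n_params * bits_per_weight / 8 / GIB


# Assumption: 4-bit values plus one 8-bit scale per 16 elements = 4.5 bits/weight.
NVFP4_BITS = 4 + 8 / 16
BF16_BITS = 16

# Nominal parameter counts taken from the model names.
models = {"I2V-14B": 14e9, "T2V-1.3B": 1.3e9}
for name, n_params in models.items():
    bf16 = weight_gib(n_params, BF16_BITS)
    nvfp4 = weight_gib(n_params, NVFP4_BITS)
    print(f"{name}: BF16 ~{bf16:.1f} GiB, NVFP4 ~{nvfp4:.1f} GiB")


def needs_offload(required_gib, vram_gib, headroom=0.8):
    """Suggest offload when the estimate exceeds a fraction of available VRAM."""
    return required_gib > vram_gib * headroom


# Example: DiT weights only, on a 32 GiB card.
print(needs_offload(weight_gib(14e9, NVFP4_BITS), vram_gib=32))
```

The estimate covers only the diffusion transformer weights, so real usage is higher; use it to compare configurations rather than to predict exact memory consumption.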
🤝 Community
- 🐛 Issues: GitHub Issues
- 🤗 Models: HuggingFace Hub
- 📖 Documentation: LightX2V Docs
If you find this project helpful, please give us a ⭐ on GitHub