---
license: apache-2.0
tags:
- diffusion-single-file
- comfyui
- distillation
- NVFP4
- video
- video generation
base_model:
- Wan-AI/Wan2.1-T2V-1.3B
- gdhe17/Self-Forcing
pipeline_tag: text-to-video
library_name: diffusers
---
# 🎬 Self-Forcing-NVFP4-4Steps Models

> **NVFP4 Quantization-Aware Step Distillation for Blackwell Architecture**

[![GitHub](https://img.shields.io/badge/GitHub-ModelTC/LightX2V-blue)](https://github.com/ModelTC/LightX2V)
[![HuggingFace](https://img.shields.io/badge/HuggingFace-lightx2v-yellow)](https://huggingface.co/lightx2v/)

## 📋 Table of Contents

- [✨ Features](#-features)
- [🚀 Quick Start](#-quick-start)
- [🎬 Generation Results](#-generation-results)
- [📦 Installation](#-installation)
- [🛠️ Usage](#-usage)
- [🧭 Project Structure](#-project-structure)
- [⚠️ Notes](#️-notes)
- [🤝 Community](#-community)

## ✨ Features

- **⚡ 4-Step Inference**: End-to-end generation in 4 denoising steps, approaching real-time performance (tested on a single RTX 5090)
- **🎯 NVFP4 Quantization**: Lower memory and bandwidth usage, optimized for the Blackwell architecture
- **🔧 LightX2V Integration**: Runs on the official LightX2V framework for best performance and stability
- **🚀 High-Quality Generation**: Preserves Self-Forcing's video quality at substantially higher speed
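
To put the memory claim in rough numbers, the sketch below estimates weight storage only. It is an illustration under stated assumptions, not a measurement of this checkpoint: it assumes 1.3 billion parameters (taken from the base model name), that every weight is quantized, and that NVFP4 stores 4-bit values plus one 8-bit scale per block of 16 values, which gives about 4.5 bits per weight. `weight_gib` is a hypothetical helper name.

```bash
# Rough weight-storage estimate (illustrative; not measured on this checkpoint).
# Assumptions: 1.3e9 parameters, all weights quantized,
# NVFP4 = 4-bit values + one 8-bit scale per 16 values = 4.5 bits per weight.
weight_gib() {
  # $1 = parameter count, $2 = bits per weight; prints size in GiB
  LC_ALL=C awk -v n="$1" -v b="$2" \
    'BEGIN { printf "%.2f\n", n * b / 8 / (1024 * 1024 * 1024) }'
}

echo "BF16 : $(weight_gib 1300000000 16) GiB"
echo "NVFP4: $(weight_gib 1300000000 4.5) GiB"
```

In practice some layers usually stay in higher precision, so the real saving is smaller than this upper bound suggests.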

## 🚀 Quick Start

```bash
# 1. Install LightX2V (install uv and the build backend first, since uv is used below)
pip install scikit_build_core uv
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v .

# 2. Build and install the NVFP4 kernel (CUTLASS is a build dependency)
git clone https://github.com/NVIDIA/cutlass.git
cd lightx2v_kernel

MAX_JOBS=$(nproc) CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) \
uv build --wheel \
  -Cbuild-dir=build . \
  -Ccmake.define.CUTLASS_PATH=/path/to/cutlass \
  --verbose --color=always --no-build-isolation

pip install dist/*.whl --force-reinstall --no-deps

# 3. Run inference with the Self-Forcing NVFP4 config:
# https://github.com/ModelTC/LightX2V/blob/main/configs/self_forcing/wan_t2v_sf_nvfp4.json
```
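
The build command above passes `/path/to/cutlass` as a placeholder. As a minimal sketch, assuming CUTLASS was cloned into the LightX2V root as in step 2 (so it sits at `../cutlass` relative to `lightx2v_kernel`), a small helper can resolve the absolute path to pass to CMake. `abs_dir` is a hypothetical name and not part of LightX2V.

```bash
# Hypothetical helper: print the absolute path of a directory.
# Returns non-zero (and prints nothing) if the directory does not exist.
abs_dir() {
  ( CDPATH= cd -- "$1" 2>/dev/null && pwd )
}
```

Run from inside `lightx2v_kernel`, `CUTLASS_PATH="$(abs_dir ../cutlass)"` sets the variable, which can then replace the placeholder as `-Ccmake.define.CUTLASS_PATH="$CUTLASS_PATH"`. Adjust the relative path if CUTLASS lives elsewhere.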

## 🎬 Generation Results

<div style="background: #f8fafc; border: 1px solid #e2e8f0; border-radius: 8px; padding: 16px; margin: 16px 0;">
<p style="font-style: italic; color: #475569; margin: 0; padding: 12px; background: white; border-radius: 6px; border-left: 4px solid #3b82f6;">
"A leprechaun, with green hat and traditional Irish attire, standing in a lush forest filled with vib..."
</p>
</div>

<table style="width: 100%; border-collapse: collapse; margin: 20px 0;">
<tr>
<th style="text-align: center; padding: 12px; background: #f1f5f9; border: 1px solid #e2e8f0; font-weight: 600;">Self-Forcing-1.3B-BF16</th>
<th style="text-align: center; padding: 12px; background: #f1f5f9; border: 1px solid #e2e8f0; font-weight: 600;">Self-Forcing-1.3B-NVFP4</th>
</tr>

<tr>
<td style="text-align: center; padding: 12px; border: 1px solid #e2e8f0;">
<video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/680de13385293771bc57400b/YIoBk3b3CZh0HXSCbDAJB.mp4"></video>
</td>
<td style="text-align: center; padding: 12px; border: 1px solid #e2e8f0;">
<video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/680de13385293771bc57400b/yDYFsVJfHBxVQ541SDxH8.mp4"></video>
</td>
</tr>
</table>

<div style="background: #f8fafc; border: 1px solid #e2e8f0; border-radius: 8px; padding: 16px; margin: 16px 0;">
<p style="font-style: italic; color: #475569; margin: 0; padding: 12px; background: white; border-radius: 6px; border-left: 4px solid #10b981;">
"A mystical and spiritual scene filled with loving energy emanating from the heavens. The sky is bath..."
</p>
</div>

| Self-Forcing-1.3B-BF16 | Self-Forcing-1.3B-NVFP4 |
| --- | --- |
| <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/680de13385293771bc57400b/Bkbs_Ery2XpQUWp-X6aBX.mp4"></video> | <video controls style="width: 260px; height: 180px; border-radius: 6px; object-fit: cover;" src="https://cdn-uploads.huggingface.co/production/uploads/680de13385293771bc57400b/xFMNI2DBU7h11Inh0Nvn6.mp4"></video> |


## ⚠️ Notes

### System Requirements
- **Required Hardware**: NVIDIA RTX 50-series GPUs (RTX 5090/5080/5070/5060) or other Blackwell architecture GPUs
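
A quick way to check the hardware requirement is to read each GPU's compute capability from `nvidia-smi`. The sketch below is illustrative and rests on two assumptions: that the installed driver's `nvidia-smi` supports the `compute_cap` query field, and that Blackwell GPUs report major version 10 (data center parts) or 12 (RTX 50-series). The function names are hypothetical.

```bash
# Hypothetical helper: succeed if a compute capability string (e.g. "12.0")
# has a major version assumed here to mean Blackwell (10 or 12).
is_blackwell_cap() {
  case "${1%%.*}" in
    10|12) return 0 ;;
    *) return 1 ;;
  esac
}

# Print one line per visible GPU; requires nvidia-smi on PATH.
check_gpus() {
  command -v nvidia-smi >/dev/null 2>&1 || { echo "nvidia-smi not found" >&2; return 1; }
  nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
  while IFS=, read -r name cap; do
    cap=$(printf '%s' "$cap" | tr -d ' ')   # strip the space after the comma
    if is_blackwell_cap "$cap"; then
      echo "OK: $name (compute capability $cap)"
    else
      echo "Not Blackwell: $name (compute capability $cap)"
    fi
  done
}
```

Call `check_gpus` before building the kernel; a "Not Blackwell" line means the NVFP4 path described here is not expected to work on that GPU.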

### Dependencies
- Prepare the T5 / CLIP / VAE components yourself, using the same layout as Self-Forcing

### Performance Tips
- Use Blackwell + NVFP4 for best performance
- Enable CPU offload for GPUs with limited memory

## 🤝 Community

- **🐛 Issues**: [GitHub Issues](https://github.com/ModelTC/LightX2V/issues)
- **🤗 Models**: [HuggingFace Hub](https://huggingface.co/lightx2v/)
- **📖 Documentation**: [LightX2V Docs](https://github.com/ModelTC/LightX2V)

---

<div align="center">

**If you find this project helpful, please give us a ⭐ on [GitHub](https://github.com/ModelTC/LightX2V)**

</div>