LiconStudio commited on
Commit
ecd29ce
·
verified ·
1 Parent(s): 227a06b

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +71 -3
README.md CHANGED
@@ -1,3 +1,71 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: other
3
+ license_name: wan-ai-license
4
+ license_link: https://github.com/Wan-Video/Wan2.2/blob/main/LICENSE.txt
5
+ base_model: Video-Reason/VBVR-Wan2.2
6
+ library_name: diffusers
7
+ tags:
8
+ - wan2.2
9
+ - i2v
10
+ - fp8
11
+ - comfyui
12
+ - video-generation
13
+ - surgical-quant
14
+ ---
15
+ # Wan2.2-I2V-14B: HiFi-Surgical-FP8 & BF16 (ComfyUI Optimized)
16
+
17
+ This model follows the Wan-AI Software License Agreement. Please refer to the original repository for usage restrictions.
18
+
19
+ This repository provides two high-performance versions of **Wan2.2-I2V-14B**, meticulously optimized for the **ComfyUI** ecosystem. We offer a standard **BF16** version and a specialized **HiFi-Surgical-FP8** mixed-precision version.
20
+
21
+ * **Original Project**: [Video-Reason Wan2.2](https://video-reason.com/)
22
+ * **Original Weights**: [HuggingFace - VBVR-Wan2.2](https://huggingface.co/Video-Reason/VBVR-Wan2.2)
23
+
24
+ ---
25
+
26
+ ## 💎 The HiFi-Surgical Optimization Strategy
27
+
28
+ Unlike generic "one-click" quantization scripts that often cause visual degradation in Wan2.2, our **HiFi-Surgical-FP8** version uses a data-driven, diagnostic-led approach to preserve cinematic quality.
29
+
30
+ ### 1. Layer-Wise SNR Calibration
31
+ We performed a deep medical-grade scan on all 406 linear weight tensors of the FP32 Master. Only layers maintaining an **SNR (Signal-to-Noise Ratio) > 31.5dB** were converted to FP8. This ensures that the mathematical "soul" of the model remains intact.
32
+
33
+ ### 2. High-Outlier Protection
34
+ Wan2.2 weights are notoriously "fragile" with sharp numerical peaks. Our strategy identifies layers with a high **Outlier Index** (Max/Std deviation > 12) and locks them in **BF16**. This specifically targets and eliminates the "sparkle" noise and flickering artifacts common in standard FP8 conversions.
35
+
36
+ ### 3. Structural Integrity (Blocks 30-39)
37
+ We have physically isolated the **Cross-Attention** layers in the final blocks of the DiT architecture. By keeping these critical layers in BF16, we ensure that prompt adherence and temporal consistency are not compromised.
38
+
39
+ ---
40
+
41
+ ## 📊 Comparison & Specs
42
+
43
+ | Feature | Standard BF16 | **HiFi-Surgical-FP8 (Recommended)** |
44
+ | :--- | :--- | :--- |
45
+ | **File Size** | ~27.2 GB | **~22.4 GB** |
46
+ | **Precision** | Pure Bfloat16 | **Hybrid FP8-E4M3 / BF16** |
47
+ | **VRAM Requirement** | 24GB+ | **16GB - 24GB** |
48
+ | **Visual Fidelity** | Reference Grade | **99% Reference Match** |
49
+ | **Inference Speed** | Base Speed | **Accelerated on Blackwell/Hopper** |
50
+
51
+ ---
52
+
53
+ ## 🛠️ ComfyUI Integration & Usage
54
+
55
+ These models are specifically converted and tested for **ComfyUI**.
56
+
57
+ 1. **Native Scaling Support**: We have included the `scale_weight` metadata for every quantized tensor. This allows ComfyUI loaders to utilize hardware-level scaling on **NVIDIA Blackwell (RTX 50-series)** and **Hopper** architectures for maximum speed.
58
+ 2. **How to Use**:
59
+ * Place the `.safetensors` file in your `ComfyUI/models/diffusion_models/.
60
+ * Use the **CheckpointLoaderSimple** or the specialized **UNETLoader**.
61
+ * Ensure your ComfyUI is up-to-date to support the `float8_e4m3fn` type.
62
+
63
+ ---
64
+
65
+ ## 📝 Diagnostic Methodology
66
+
67
+ Each weight in the HiFi version was selected based on the following diagnostic results:
68
+ * **Total Layers Scanned**: 406
69
+ * **FP8 Layers**: 184 (Non-sensitive FFN & Attention layers)
70
+ * **BF16 Layers**: 222 (Sensitive Cross-Attention & Outlier-heavy layers)
71
+ * **Target Hardware**: Optimized for RTX 4090, 5090, and H100/H200.