---
license: apache-2.0
license_name: wan-ai-license
license_link: https://github.com/Wan-Video/Wan2.2/blob/main/LICENSE.txt
base_model: Video-Reason/VBVR-Wan2.2
library_name: diffusers
tags:
- wan2.2
- i2v
- fp8
- comfyui
- video-generation
- surgical-quant
---
# Wan2.2-I2V-14B: HiFi-Surgical-FP8 & BF16 (ComfyUI Optimized)

This model follows the Wan-AI Software License Agreement. Please refer to the original repository for usage restrictions.

This repository provides two high-performance versions of **Wan2.2-I2V-14B**, meticulously optimized for the **ComfyUI** ecosystem. We offer a standard **BF16** version and a specialized **HiFi-Surgical-FP8** mixed-precision version.

* **Original Project**: [Video-Reason Wan2.2](https://video-reason.com/)
* **Original Weights**: [HuggingFace - VBVR-Wan2.2](https://huggingface.co/Video-Reason/VBVR-Wan2.2)

---

## 💎 The HiFi-Surgical Optimization Strategy

Unlike generic "one-click" quantization scripts that often cause visual degradation in Wan2.2, our **HiFi-Surgical-FP8** version uses a data-driven, diagnostic-led approach to preserve cinematic quality.

### 1. Layer-Wise SNR Calibration
We measured the quantization error of all 406 linear weight tensors against the FP32 master weights. Only tensors that keep an **SNR (Signal-to-Noise Ratio) above 31.5 dB** were converted to FP8; the rest stay in BF16, so the layers that lose the most information to rounding are never quantized.
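
As a rough illustration of this kind of check, the sketch below simulates a `float8_e4m3fn` round trip in pure Python and reports the SNR in dB. It is not the script used for this repository: the per-tensor scaling scheme (largest weight mapped to 448), the saturating rounding, and the helper names `round_e4m3`, `quantization_snr_db`, and `keep_in_fp8` are all assumptions made for the example.

```python
import math

E4M3_MAX = 448.0          # largest finite float8_e4m3fn value
SNR_THRESHOLD_DB = 31.5   # threshold quoted in this card


def round_e4m3(x: float) -> float:
    """Round x to the nearest float8_e4m3fn value, saturating at +/-448 (assumed)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    ax = min(abs(x), E4M3_MAX)
    # frexp gives ax = m * 2**e with 0.5 <= m < 1, so floor(log2(ax)) == e - 1.
    exponent = max(math.frexp(ax)[1] - 1, -6)  # -6 is the smallest normal exponent
    step = 2.0 ** (exponent - 3)               # 3 mantissa bits per binade
    return sign * min(round(ax / step) * step, E4M3_MAX)


def quantization_snr_db(weights) -> float:
    """SNR (dB) between the weights and their scaled FP8 round trip."""
    amax = max(abs(w) for w in weights)
    if amax == 0.0:
        return math.inf
    scale = amax / E4M3_MAX  # assumed per-tensor scale
    signal = sum(w * w for w in weights)
    noise = sum((w - round_e4m3(w / scale) * scale) ** 2 for w in weights)
    return math.inf if noise == 0.0 else 10.0 * math.log10(signal / noise)


def keep_in_fp8(weights) -> bool:
    """True when the tensor clears the SNR threshold and may be stored in FP8."""
    return quantization_snr_db(weights) > SNR_THRESHOLD_DB
```

With three mantissa bits, the relative rounding error of E4M3 is at most 1/16, which puts typical tensors in the low 30s of dB; a threshold near that level separates tensors that quantize about as well as the format allows from those that do worse.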

### 2. High-Outlier Protection
Wan2.2 weights are sensitive to quantization because some tensors contain sharp numerical peaks. Our strategy identifies layers with a high **Outlier Index** (maximum value divided by standard deviation, above 12) and keeps them in **BF16**. This specifically targets the "sparkle" noise and flickering artifacts common in standard FP8 conversions.
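
A minimal sketch of such an outlier test follows. It assumes the index is the largest absolute weight divided by the population standard deviation; the function names and the handling of constant tensors are illustrative choices, not taken from the original tooling.

```python
import statistics

OUTLIER_THRESHOLD = 12.0  # threshold quoted in this card


def outlier_index(weights) -> float:
    """Largest absolute weight divided by the population standard deviation."""
    std = statistics.pstdev(weights)
    if std == 0.0:
        return 0.0  # constant tensor: no spread, so nothing stands out
    return max(abs(w) for w in weights) / std


def lock_in_bf16(weights) -> bool:
    """True when the tensor is outlier-heavy and should stay in BF16."""
    return outlier_index(weights) > OUTLIER_THRESHOLD
```

A single large peak stretches the per-tensor scale, which pushes the bulk of the small weights toward the coarse end of the FP8 grid; that is why a high ratio is a reasonable warning sign.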

### 3. Structural Integrity (Blocks 30-39)
The **Cross-Attention** layers in the final blocks (30-39) of the DiT architecture are excluded from quantization. Keeping these critical layers in BF16 ensures that prompt adherence and temporal consistency are not compromised.
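
One way to express this rule is a name-based filter over the state-dict keys. The sketch below assumes keys of the form `blocks.<index>.cross_attn.<projection>.weight`, optionally with a prefix; that layout and the helper name are assumptions for illustration, so adjust the pattern to the keys in your file.

```python
import re

# Assumed key layout: "[prefix.]blocks.<index>.cross_attn.<projection>.weight"
_CROSS_ATTN = re.compile(r"(?:^|\.)blocks\.(\d+)\.cross_attn\.")


def is_protected_cross_attention(name: str, first: int = 30, last: int = 39) -> bool:
    """True for cross-attention tensors in blocks first..last (inclusive)."""
    match = _CROSS_ATTN.search(name)
    return bool(match) and first <= int(match.group(1)) <= last
```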

---

## 📊 Comparison & Specs

| Feature | Standard BF16 | **HiFi-Surgical-FP8 (Recommended)** |
| :--- | :--- | :--- |
| **File Size** | ~27.2 GB | **~22.4 GB** |
| **Precision** | Pure Bfloat16 | **Hybrid FP8-E4M3 / BF16** |
| **VRAM Requirement** | 24GB+ | **16GB - 24GB** |
| **Visual Fidelity** | Reference Grade | **99% Reference Match** |
| **Inference Speed** | Base Speed | **Accelerated on Blackwell/Hopper** |

---

## ๐Ÿ› ๏ธ ComfyUI Integration & Usage

These models are specifically converted and tested for **ComfyUI**.

1.  **Native Scaling Support**: We have included the `scale_weight` metadata for every quantized tensor. This allows ComfyUI loaders to utilize hardware-level scaling on **NVIDIA Blackwell (RTX 50-series)** and **Hopper** architectures for maximum speed.
2.  **How to Use**:
    * Place the `.safetensors` file in your `ComfyUI/models/diffusion_models/` folder.
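    * Load it with the **UNETLoader** (Load Diffusion Model) node, which reads from `diffusion_models`. **CheckpointLoaderSimple** reads from `ComfyUI/models/checkpoints/` instead and expects a full checkpoint.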
    * Ensure your ComfyUI is up-to-date to support the `float8_e4m3fn` type.
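
Before loading, it can be useful to confirm which precisions a downloaded file actually contains. The sketch below reads only the safetensors header (an 8-byte little-endian length followed by a JSON table) and counts tensors per dtype, without loading any weights. The function names are illustrative, and the path in the usage comment is a placeholder.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        return json.loads(f.read(header_len).decode("utf-8"))


def dtype_histogram(path):
    """Count tensors per dtype string (e.g. 'F8_E4M3', 'BF16')."""
    header = read_safetensors_header(path)
    return Counter(
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"  # free-form metadata, not a tensor
    )


# Usage (placeholder path):
#   print(dtype_histogram("ComfyUI/models/diffusion_models/<your-file>.safetensors"))
```

For the hybrid file, the histogram should show both `F8_E4M3` and `BF16` entries; a file that reports only one of them is probably not the version you intended to download.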

---

## ๐Ÿ“ Diagnostic Methodology

Each weight in the HiFi version was selected based on the following diagnostic results:
* **Total Layers Scanned**: 406
* **FP8 Layers**: 184 (Non-sensitive FFN & Attention layers)
* **BF16 Layers**: 222 (Sensitive Cross-Attention & Outlier-heavy layers)
* **Target Hardware**: Optimized for RTX 4090, 5090, and H100/H200.