Wan2.2 NVFP4 Sparse to ComfyUI Conversion Analysis

Sources checked

Script provenance

convert_lightx2v_nvfp4_to_comfy.py is a local conversion script written for this directory. It is not copied from any single upstream script; the implementation is derived from these upstream pages and ComfyUI source conventions:

Format findings

The original files in this directory are LightX2V NVFP4 Sparse safetensors. Each file contains 400 quantized Linear layers, and each layer stores these LightX2V-side tensors:

  • {layer}.weight: packed NVFP4 values in torch.uint8
  • {layer}.weight_scale: FP8 E4M3 block scale tensor
  • {layer}.alpha: scalar post-matmul rescaler
  • {layer}.input_global_scale: scalar global input scale, in the LightX2V convention

According to Kijai's model card, the ComfyUI version keeps the same NVFP4 datatype but changes these conventions:

  • Swap the high/low nibbles in each packed uint8 weight byte.
  • Keep {layer}.weight_scale as-is.
  • Store {layer}.alpha * {layer}.input_global_scale as {layer}.weight_scale_2.
  • Store 1 / {layer}.input_global_scale as {layer}.input_scale.
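
The convention changes listed above can be illustrated with a minimal pure-Python sketch. It is not the code from convert_lightx2v_nvfp4_to_comfy.py, which presumably operates on tensors; the helper names swap_nibbles and convert_scales are hypothetical and exist only for this illustration.

```python
from typing import Tuple


def swap_nibbles(packed: bytes) -> bytes:
    """Swap the high and low 4-bit halves of every packed byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in packed)


def convert_scales(alpha: float, input_global_scale: float) -> Tuple[float, float]:
    """Return (weight_scale_2, input_scale) from the LightX2V scalars."""
    weight_scale_2 = alpha * input_global_scale
    input_scale = 1.0 / input_global_scale
    return weight_scale_2, input_scale


# Each byte holds two 4-bit values; swapping exchanges their positions.
packed = bytes([0x1F, 0xA0, 0x37])
print(swap_nibbles(packed).hex())  # f10a73

# Illustrative scalar values only, not taken from a real checkpoint.
print(convert_scales(alpha=0.5, input_global_scale=4.0))  # (2.0, 0.25)
```

Applying the swap twice returns the original bytes, which makes the step easy to sanity-check against the source file.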

ComfyUI's mixed precision loader expects a {layer}.comfy_quant tensor containing JSON bytes. For NVFP4 it then loads:

  • {layer}.weight
  • {layer}.weight_scale_2
  • {layer}.weight_scale
  • optional registered parameters such as {layer}.input_scale

The converted file metadata also includes _quantization_metadata with one nvfp4 layer entry per quantized layer so ComfyUI can select mixed precision operations for the model.
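
As a rough illustration of the {layer}.comfy_quant tensor described above, the sketch below round-trips a JSON payload through a sequence of uint8 values. It uses plain Python lists in place of tensors, and the helper names are hypothetical; how ComfyUI reads the tensor internally is not shown here.

```python
import json
from typing import Dict, List


def encode_comfy_quant(fmt: str) -> List[int]:
    """Serialize {"format": fmt} to UTF-8 JSON and return the byte values (0..255)."""
    payload = json.dumps({"format": fmt}).encode("utf-8")
    return list(payload)


def decode_comfy_quant(values: List[int]) -> Dict[str, str]:
    """Inverse of encode_comfy_quant: rebuild the bytes and parse the JSON."""
    return json.loads(bytes(values).decode("utf-8"))


values = encode_comfy_quant("nvfp4")
print(bytes(values).decode("utf-8"))  # {"format": "nvfp4"}
print(decode_comfy_quant(values))     # {'format': 'nvfp4'}
```

In a real conversion the byte values would be stored as a one-dimensional uint8 tensor under the {layer}.comfy_quant key.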

H100 note

The conversion itself does not require a Blackwell GPU; it is a safetensors layout conversion. However, Comfy Kitchen documents TensorCoreNVFP4Layout as requiring SM >= 10.0 / Blackwell for native NVFP4 tensor-core acceleration. H100 is Hopper, so ComfyUI may disable native NVFP4 compute and run a fallback path.
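
The SM >= 10.0 condition can be expressed as a small check on the CUDA compute capability. This is only a sketch of the comparison; supports_native_nvfp4 is a hypothetical helper, not a ComfyUI or Comfy Kitchen function. With PyTorch installed, the (major, minor) tuple could be obtained from torch.cuda.get_device_capability().

```python
from typing import Tuple


def supports_native_nvfp4(capability: Tuple[int, int]) -> bool:
    """True if the (major, minor) compute capability is at least SM 10.0."""
    return capability >= (10, 0)


print(supports_native_nvfp4((9, 0)))   # H100 is Hopper, SM 9.0 -> False
print(supports_native_nvfp4((10, 0)))  # True
```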

Script

The conversion script is:

python convert_lightx2v_nvfp4_to_comfy.py

Useful options:

python convert_lightx2v_nvfp4_to_comfy.py --dry-run
python convert_lightx2v_nvfp4_to_comfy.py --overwrite
python convert_lightx2v_nvfp4_to_comfy.py input.safetensors --output-dir /path/to/out

The script writes each output as <original_stem>_comfy.safetensors. It writes to a temporary file first and then renames it into place.
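
The output naming and the write-then-rename pattern might look like the following sketch. It assumes the temporary file is created in the destination directory so the rename stays on one filesystem; the function names are hypothetical and the real script may differ in its details.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def comfy_output_path(src: Path, output_dir: Optional[Path] = None) -> Path:
    """Build <original_stem>_comfy.safetensors next to src or in output_dir."""
    out_dir = output_dir if output_dir is not None else src.parent
    return out_dir / f"{src.stem}_comfy.safetensors"


def write_via_temp(dest: Path, data: bytes) -> None:
    """Write data to a temporary file in dest's directory, then rename it to dest."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        # os.replace overwrites an existing destination in a single step.
        os.replace(tmp_name, dest)
    except BaseException:
        # Clean up the partial temporary file if anything failed.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

With this pattern, an interrupted run leaves either the previous output or no output at the final path, rather than a truncated file.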

Converted outputs

  • Wan2.2-I2V-A14B_NVFP4_Sparse_high_comfy.safetensors
  • Wan2.2-I2V-A14B_NVFP4_Sparse_low_comfy.safetensors
  • Wan2.2-T2V-A14B_NVFP4_Sparse_high_comfy.safetensors
  • Wan2.2-T2V-A14B_NVFP4_Sparse_low_comfy.safetensors

For ComfyUI native workflows, place these diffusion model files under:

ComfyUI/models/diffusion_models/

The Wan2.2 14B workflows still need the normal text encoder and VAE files in their ComfyUI locations.

Verification performed

For each converted file:

  • Tensor count is 2695.
  • _quantization_metadata contains 400 quantized layers.
  • alpha count is 0.
  • input_global_scale count is 0.
  • input_scale count is 400.
  • weight_scale count is 400.
  • weight_scale_2 count is 400.
  • comfy_quant count is 400.
  • {layer}.comfy_quant decodes to {"format": "nvfp4"}.
  • A sampled blocks.0.cross_attn.k.weight block equals the expected nibble swap from the original.
  • The sampled weight_scale_2 equals alpha * input_global_scale.
  • The sampled input_scale equals 1 / input_global_scale.
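
The count checks above can be reproduced without loading any tensor data, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header that lists every tensor name. The sketch below is one way to do this with the standard library; it is not the verification code that produced the numbers above, and the function names are hypothetical.

```python
import json
import struct
from typing import Dict, Tuple

SUFFIXES = (
    "alpha",
    "input_global_scale",
    "input_scale",
    "weight_scale",
    "weight_scale_2",
    "comfy_quant",
)


def read_safetensors_header(path: str) -> dict:
    """Read only the JSON header of a safetensors file."""
    with open(path, "rb") as fh:
        (header_len,) = struct.unpack("<Q", fh.read(8))
        return json.loads(fh.read(header_len))


def count_suffixes(header: dict) -> Tuple[int, Dict[str, int]]:
    """Return (tensor count, per-suffix counts), ignoring the __metadata__ entry."""
    counts = {suffix: 0 for suffix in SUFFIXES}
    names = [key for key in header if key != "__metadata__"]
    for name in names:
        # Compare the last dotted component exactly, so weight_scale
        # and weight_scale_2 are counted separately.
        last = name.rsplit(".", 1)[-1]
        if last in counts:
            counts[last] += 1
    return len(names), counts
```

For a converted file, the expectation from the list above is a tensor count of 2695, zero alpha and input_global_scale entries, and 400 entries for each of the other four suffixes.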