Wan2.2 NVFP4 Sparse to ComfyUI Conversion Analysis

Sources checked

Kijai Hugging Face repo: https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
ComfyUI Wan2.2 workflow docs: https://docs.comfy.org/tutorials/video/wan/wan2_2
ComfyUI Wan2.2 examples: https://comfyanonymous.github.io/ComfyUI_examples/wan22/
ComfyUI mixed precision loader reference: https://huggingface.co/mhnakif/comfy/blob/main/comfy/ops.py
ComfyUI quant op reference: https://huggingface.co/mhnakif/comfy/blob/main/comfy/quant_ops.py
Comfy Kitchen hardware/backend reference: https://github.com/Comfy-Org/comfy-kitchen
Local ComfyUI source checkout used for verification: Comfy-Org/ComfyUI commit 5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e

Script provenance

convert_lightx2v_nvfp4_to_comfy.py is a local conversion script written for this directory. It is not copied from one upstream script. The implementation is derived from these upstream pages and ComfyUI source conventions:

Kijai model page: https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
- This page gives the actual LightX2V NVFP4 to Comfy NVFP4 conversion rule: nibble-swap packed U8 weights, keep weight_scale, set weight_scale_2 = alpha * input_global_scale, and set input_scale = 1 / input_global_scale.
ComfyUI quantized loader: https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/ops.py#L1058-L1091
- This loader reads {layer}.comfy_quant, branches on format == "nvfp4", then requires {layer}.weight_scale_2 and {layer}.weight_scale.
ComfyUI quant algorithm registry: https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/quant_ops.py#L190-L205
- This defines the nvfp4 storage dtype as torch.uint8 and the parameter set as weight_scale, weight_scale_2, and input_scale.
ComfyUI quantization metadata handling: https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/utils.py#L1360-L1421
- This shows that _quantization_metadata.layers is converted into {layer}.comfy_quant JSON byte tensors and that the presence of .comfy_quant enables mixed quantized ops.
ComfyUI native NVFP4 hardware gate: https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/model_management.py#L1877-L1885
- This returns true only for NVIDIA GPUs with compute capability major version >= 10. H100 is compute capability 9.0, so it can validate and load the converted files but is not expected to use native Blackwell NVFP4 tensor-core compute.

Format findings

The original files in this directory are LightX2V NVFP4 Sparse safetensors. Each file has 400 quantized Linear layers with these LightX2V-side tensors:

{layer}.weight: packed NVFP4 values in torch.uint8
{layer}.weight_scale: FP8 E4M3 block scale tensor
{layer}.alpha: scalar post-matmul rescaler
{layer}.input_global_scale: scalar input scale convention

According to Kijai's model card, the ComfyUI-converted weights are still NVFP4 and use the same datatype; only the storage conventions change:

Swap the high/low nibbles in each packed uint8 weight byte.
Keep {layer}.weight_scale as-is.
Set {layer}.weight_scale_2 to {layer}.alpha * {layer}.input_global_scale.
Set {layer}.input_scale to 1 / {layer}.input_global_scale.
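
The four rules above can be sketched in plain Python. This is an illustrative sketch only, not the contents of convert_lightx2v_nvfp4_to_comfy.py: the function names swap_nibbles and convert_scales are hypothetical, and bytes and float values stand in for the torch.uint8 and scalar tensors found in the real files.

```python
def swap_nibbles(packed: bytes) -> bytes:
    """Swap the high and low 4-bit halves of every packed byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in packed)


def convert_scales(alpha: float, input_global_scale: float) -> dict:
    """Map the LightX2V scalars onto the ComfyUI-side parameter names."""
    return {
        "weight_scale_2": alpha * input_global_scale,
        "input_scale": 1.0 / input_global_scale,
    }


# Toy values chosen for illustration; they are not taken from the model files.
packed = bytes([0x12, 0xAB, 0xF0])
swapped = swap_nibbles(packed)  # bytes 0x21, 0xBA, 0x0F
scales = convert_scales(alpha=0.5, input_global_scale=4.0)
```

Swapping nibbles twice returns the original bytes, which gives a cheap round-trip check for the weight transform.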

ComfyUI's mixed precision loader expects a {layer}.comfy_quant tensor containing JSON bytes. For NVFP4 it then loads:

{layer}.weight
{layer}.weight_scale_2
{layer}.weight_scale
optional registered parameters such as {layer}.input_scale

The converted file metadata also includes _quantization_metadata with one nvfp4 layer entry per quantized layer so ComfyUI can select mixed precision operations for the model.
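
As a rough sketch of the two metadata pieces described above, the following builds the JSON bytes for a {layer}.comfy_quant tensor and a _quantization_metadata header value. The helper names are hypothetical, and the per-layer entry shape {"format": "nvfp4"} is an assumption based on what comfy_quant decodes to; the real header may carry additional fields. Safetensors header metadata values are strings, hence the json.dumps call.

```python
import json


def comfy_quant_bytes(fmt: str = "nvfp4") -> bytes:
    """JSON payload stored (as uint8 values) in {layer}.comfy_quant."""
    return json.dumps({"format": fmt}).encode("utf-8")


def build_quantization_metadata(layer_names) -> dict:
    """Header metadata with one nvfp4 entry per quantized layer.

    Assumption: each entry mirrors the comfy_quant JSON for that layer.
    """
    layers = {name: {"format": "nvfp4"} for name in layer_names}
    return {"_quantization_metadata": json.dumps({"layers": layers})}


payload = comfy_quant_bytes()
metadata = build_quantization_metadata(["blocks.0.cross_attn.k"])
```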

H100 note

The conversion itself does not require a Blackwell GPU; it is a safetensors layout conversion. However, Comfy Kitchen documents TensorCoreNVFP4Layout as requiring SM >= 10.0 / Blackwell for native NVFP4 tensor-core acceleration. H100 is Hopper, so ComfyUI may disable native NVFP4 compute and run a fallback path.
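
The hardware gate reduces to a comparison on the compute capability major version. The sketch below is a simplified stand-in for that check, not ComfyUI's code: supports_native_nvfp4 is a hypothetical name, and it ignores the NVIDIA-vendor test. With PyTorch installed, the capability tuple could come from torch.cuda.get_device_capability().

```python
def supports_native_nvfp4(capability) -> bool:
    """Return True when the (major, minor) capability is SM 10.0 or newer."""
    major, _minor = capability
    return major >= 10


H100_CAPABILITY = (9, 0)   # Hopper
B200_CAPABILITY = (10, 0)  # Blackwell

h100_native = supports_native_nvfp4(H100_CAPABILITY)  # False
b200_native = supports_native_nvfp4(B200_CAPABILITY)  # True
```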

Script

Run the conversion script with:

python convert_lightx2v_nvfp4_to_comfy.py

Useful options:

python convert_lightx2v_nvfp4_to_comfy.py --dry-run
python convert_lightx2v_nvfp4_to_comfy.py --overwrite
python convert_lightx2v_nvfp4_to_comfy.py input.safetensors --output-dir /path/to/out

The script writes each output as <original_stem>_comfy.safetensors. It writes to a temporary file first and then renames that file into place.
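
The naming and temp-file behavior can be sketched as follows. This is one common way to implement the pattern, shown under assumptions; the actual script may differ, and output_path_for and write_atomically are hypothetical names. The temporary file is created in the destination directory so that os.replace stays on one filesystem.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def output_path_for(src: Path, output_dir: Optional[Path] = None) -> Path:
    """Build <original_stem>_comfy.safetensors next to src or in output_dir."""
    out_dir = output_dir if output_dir is not None else src.parent
    return out_dir / f"{src.stem}_comfy.safetensors"


def write_atomically(dest: Path, payload: bytes) -> None:
    """Write payload to a temp file in dest's directory, then rename it."""
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        os.replace(tmp_name, dest)
    except BaseException:
        # Remove the leftover temp file if the write or rename failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Writing under a temporary name means a run that stops partway does not leave a truncated file under the final output name.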

Converted outputs

Wan2.2-I2V-A14B_NVFP4_Sparse_high_comfy.safetensors
Wan2.2-I2V-A14B_NVFP4_Sparse_low_comfy.safetensors
Wan2.2-T2V-A14B_NVFP4_Sparse_high_comfy.safetensors
Wan2.2-T2V-A14B_NVFP4_Sparse_low_comfy.safetensors

For ComfyUI native workflows, place these diffusion model files under:

ComfyUI/models/diffusion_models/

The Wan2.2 14B workflows still need the normal text encoder and VAE files in their ComfyUI locations.

Verification performed

For each converted file:

Tensor count is 2695.
_quantization_metadata contains 400 quantized layers.
alpha count is 0.
input_global_scale count is 0.
input_scale count is 400.
weight_scale count is 400.
weight_scale_2 count is 400.
comfy_quant count is 400.
{layer}.comfy_quant decodes to {"format": "nvfp4"}.
A sampled blocks.0.cross_attn.k.weight block equals the expected nibble swap from the original.
The sampled weight_scale_2 equals alpha * input_global_scale.
The sampled input_scale equals 1 / input_global_scale.
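
The per-suffix counts above can be reproduced with a small helper over the tensor key names. This is a sketch with a hypothetical function name, count_suffixes, and it uses one toy layer rather than the real files. In practice the keys could be read with safetensors' safe_open(...).keys().

```python
SUFFIXES = (
    "alpha",
    "input_global_scale",
    "input_scale",
    "weight_scale",
    "weight_scale_2",
    "comfy_quant",
)


def count_suffixes(keys) -> dict:
    """Count tensor keys by their final dotted component."""
    counts = {suffix: 0 for suffix in SUFFIXES}
    for key in keys:
        suffix = key.rsplit(".", 1)[-1]
        if suffix in counts:
            counts[suffix] += 1
    return counts


# Toy key list for a single converted layer.
layer = "blocks.0.cross_attn.k"
keys = [f"{layer}.{name}" for name in
        ("weight", "weight_scale", "weight_scale_2", "input_scale", "comfy_quant")]
counts = count_suffixes(keys)
```

Matching on the exact final component keeps weight_scale and weight_scale_2 in separate buckets, which a plain endswith("scale") test would not.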