# Wan2.2 NVFP4 Sparse to ComfyUI Conversion Analysis

## Sources checked

- Kijai Hugging Face repo: https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
- ComfyUI Wan2.2 workflow docs: https://docs.comfy.org/tutorials/video/wan/wan2_2
- ComfyUI Wan2.2 examples: https://comfyanonymous.github.io/ComfyUI_examples/wan22/
- ComfyUI mixed precision loader reference: https://huggingface.co/mhnakif/comfy/blob/main/comfy/ops.py
- ComfyUI quant op reference: https://huggingface.co/mhnakif/comfy/blob/main/comfy/quant_ops.py
- Comfy Kitchen hardware/backend reference: https://github.com/Comfy-Org/comfy-kitchen
- Local ComfyUI source checkout used for verification: `Comfy-Org/ComfyUI` commit `5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e`

## Script provenance

`convert_lightx2v_nvfp4_to_comfy.py` is a local conversion script written for this directory. It is not copied from one upstream script. The implementation is derived from these upstream pages and ComfyUI source conventions:

- Kijai model page: https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
  - This page gives the actual LightX2V NVFP4 to Comfy NVFP4 conversion rule: nibble-swap packed U8 weights, keep `weight_scale`, set `weight_scale_2 = alpha * input_global_scale`, and set `input_scale = 1 / input_global_scale`.
- ComfyUI quantized loader: https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/ops.py#L1058-L1091
  - This loader reads `{layer}.comfy_quant`, branches on `format == "nvfp4"`, then requires `{layer}.weight_scale_2` and `{layer}.weight_scale`.
- ComfyUI quant algorithm registry: https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/quant_ops.py#L190-L205
  - This defines the `nvfp4` storage dtype as `torch.uint8` and the parameter set as `weight_scale`, `weight_scale_2`, and `input_scale`.
- ComfyUI quantization metadata handling: https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/utils.py#L1360-L1421
  - This shows that `_quantization_metadata.layers` is converted into `{layer}.comfy_quant` JSON byte tensors and that the presence of `.comfy_quant` enables mixed quantized ops.
- ComfyUI native NVFP4 hardware gate: https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/model_management.py#L1877-L1885
  - This returns true only for NVIDIA GPUs with compute capability major version >= 10, which is why H100 can validate/load files but is not expected to use native Blackwell NVFP4 tensor-core compute.

## Format findings

The original files in this directory are LightX2V NVFP4 Sparse safetensors. Each file has 400 quantized Linear layers with these LightX2V-side tensors:

- `{layer}.weight`: packed NVFP4 values in `torch.uint8`
- `{layer}.weight_scale`: FP8 E4M3 block scale tensor
- `{layer}.alpha`: scalar post-matmul rescaler
- `{layer}.input_global_scale`: scalar input scale convention

Kijai's model card says the ComfyUI conversion is still NVFP4 and uses the same datatype, but changes conventions:

- Swap the high/low nibbles in each packed `uint8` weight byte.
- Keep `{layer}.weight_scale` as-is.
- Convert `{layer}.alpha * {layer}.input_global_scale` into `{layer}.weight_scale_2`.
- Convert `1 / {layer}.input_global_scale` into `{layer}.input_scale`.

ComfyUI's mixed precision loader expects a `{layer}.comfy_quant` tensor containing JSON bytes. For NVFP4 it then loads:

- `{layer}.weight`
- `{layer}.weight_scale_2`
- `{layer}.weight_scale`
- optional registered parameters such as `{layer}.input_scale`

The converted file metadata also includes `_quantization_metadata` with one `nvfp4` layer entry per quantized layer so ComfyUI can select mixed precision operations for the model.

## H100 note

The conversion itself does not require a Blackwell GPU; it is a safetensors layout conversion. However, Comfy Kitchen documents `TensorCoreNVFP4Layout` as requiring SM >= 10.0 / Blackwell for native NVFP4 tensor-core acceleration. H100 is Hopper, so ComfyUI may disable native NVFP4 compute and run a fallback path.

## Script

The conversion script is:

```bash
python convert_lightx2v_nvfp4_to_comfy.py
```

Useful options:

```bash
python convert_lightx2v_nvfp4_to_comfy.py --dry-run
python convert_lightx2v_nvfp4_to_comfy.py --overwrite
python convert_lightx2v_nvfp4_to_comfy.py input.safetensors --output-dir /path/to/out
```

The script names each output with a `_comfy.safetensors` suffix and writes to a temporary file before renaming it into place.

## Converted outputs

- `Wan2.2-I2V-A14B_NVFP4_Sparse_high_comfy.safetensors`
- `Wan2.2-I2V-A14B_NVFP4_Sparse_low_comfy.safetensors`
- `Wan2.2-T2V-A14B_NVFP4_Sparse_high_comfy.safetensors`
- `Wan2.2-T2V-A14B_NVFP4_Sparse_low_comfy.safetensors`

For ComfyUI native workflows, place these diffusion model files under:

```text
ComfyUI/models/diffusion_models/
```

The Wan2.2 14B workflows still need the normal text encoder and VAE files in their ComfyUI locations.

## Verification performed

For each converted file:

- Tensor count is 2695.
- `_quantization_metadata` contains 400 quantized layers.
- `alpha` count is 0.
- `input_global_scale` count is 0.
- `input_scale` count is 400.
- `weight_scale` count is 400.
- `weight_scale_2` count is 400.
- `comfy_quant` count is 400.
- `{layer}.comfy_quant` decodes to `{"format": "nvfp4"}`.
- A sampled `blocks.0.cross_attn.k.weight` block equals the expected nibble swap from the original.
- The sampled `weight_scale_2` equals `alpha * input_global_scale`.
- The sampled `input_scale` equals `1 / input_global_scale`.
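
The per-layer conversion rule and the sampled checks in this document can be illustrated with a small plain-Python sketch. It is not the code from `convert_lightx2v_nvfp4_to_comfy.py`: the real script operates on safetensors tensors, while this sketch uses `bytes` and floats so it runs with the standard library only. The helper names `swap_nibbles` and `convert_layer`, and the toy values, are assumptions made for illustration.

```python
import json


def swap_nibbles(packed: bytes) -> bytes:
    """Swap the high and low 4-bit halves of every packed byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in packed)


def convert_layer(layer: dict) -> dict:
    """Map one layer's LightX2V-side values to the ComfyUI-side names.

    Hypothetical helper: `layer` holds toy stand-ins for the real tensors.
    """
    alpha = layer["alpha"]
    input_global_scale = layer["input_global_scale"]
    return {
        # Packed FP4 pairs: same data, opposite nibble order.
        "weight": swap_nibbles(layer["weight"]),
        # Block scales are carried over unchanged.
        "weight_scale": layer["weight_scale"],
        "weight_scale_2": alpha * input_global_scale,
        "input_scale": 1.0 / input_global_scale,
        # Per-layer marker stored as JSON bytes.
        "comfy_quant": json.dumps({"format": "nvfp4"}).encode("utf-8"),
    }


# Toy values, not taken from any real checkpoint.
src = {
    "weight": bytes([0x12, 0xAB, 0xF0]),
    "weight_scale": bytes([0x38, 0x40]),
    "alpha": 0.5,
    "input_global_scale": 4.0,
}
dst = convert_layer(src)

print(dst["weight"].hex())                 # 21ba0f
print(dst["weight_scale_2"])               # 2.0
print(dst["input_scale"])                  # 0.25
print(json.loads(dst["comfy_quant"]))      # {'format': 'nvfp4'}
```

Because the nibble swap is its own inverse, applying `swap_nibbles` to a converted weight should reproduce the original bytes, which is a convenient way to spot-check a sampled block. Note that `alpha` and `input_global_scale` do not appear in the converted layer, matching the zero counts reported in the verification list.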