# Wan2.2 NVFP4 Sparse to ComfyUI Conversion Analysis

## Sources checked

- Kijai Hugging Face repo: https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
- ComfyUI Wan2.2 workflow docs: https://docs.comfy.org/tutorials/video/wan/wan2_2
- ComfyUI Wan2.2 examples: https://comfyanonymous.github.io/ComfyUI_examples/wan22/
- ComfyUI mixed precision loader reference:
  https://huggingface.co/mhnakif/comfy/blob/main/comfy/ops.py
- ComfyUI quant op reference:
  https://huggingface.co/mhnakif/comfy/blob/main/comfy/quant_ops.py
- Comfy Kitchen hardware/backend reference:
  https://github.com/Comfy-Org/comfy-kitchen
- Local ComfyUI source checkout used for verification:
  `Comfy-Org/ComfyUI` commit
  `5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e`

## Script provenance

`convert_lightx2v_nvfp4_to_comfy.py` is a local conversion script written for
this directory. It is not copied from any single upstream script; the
implementation is derived from these upstream pages and ComfyUI source
conventions:

- Kijai model page:
  https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
  - This page gives the actual LightX2V NVFP4 to Comfy NVFP4 conversion rule:
    nibble-swap packed U8 weights, keep `weight_scale`, set
    `weight_scale_2 = alpha * input_global_scale`, and set
    `input_scale = 1 / input_global_scale`.
- ComfyUI quantized loader:
  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/ops.py#L1058-L1091
  - This loader reads `{layer}.comfy_quant`, branches on `format == "nvfp4"`,
    then requires `{layer}.weight_scale_2` and `{layer}.weight_scale`.
- ComfyUI quant algorithm registry:
  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/quant_ops.py#L190-L205
  - This defines the `nvfp4` storage dtype as `torch.uint8` and the parameter
    set as `weight_scale`, `weight_scale_2`, and `input_scale`.
- ComfyUI quantization metadata handling:
  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/utils.py#L1360-L1421
  - This shows that `_quantization_metadata.layers` is converted into
    `{layer}.comfy_quant` JSON byte tensors and that the presence of
    `.comfy_quant` enables mixed quantized ops.
- ComfyUI native NVFP4 hardware gate:
  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/model_management.py#L1877-L1885
  - This returns true only for NVIDIA GPUs with compute capability major
    version >= 10. H100 is compute capability 9.0, so it can validate and load
    the files but is not expected to use native Blackwell NVFP4 tensor-core
    compute.

## Format findings

The original files in this directory are LightX2V NVFP4 Sparse safetensors. Each
file has 400 quantized Linear layers with these LightX2V-side tensors:

- `{layer}.weight`: packed NVFP4 values in `torch.uint8`
- `{layer}.weight_scale`: FP8 E4M3 block scale tensor
- `{layer}.alpha`: scalar post-matmul rescaler
- `{layer}.input_global_scale`: scalar global input scale (LightX2V convention)

Kijai's model card says the converted ComfyUI file is still NVFP4 and uses the
same datatype; only the storage conventions change:

- Swap the high/low nibbles in each packed `uint8` weight byte.
- Keep `{layer}.weight_scale` as-is.
- Set `{layer}.weight_scale_2` to
  `{layer}.alpha * {layer}.input_global_scale`.
- Set `{layer}.input_scale` to `1 / {layer}.input_global_scale`.
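
The conversion rule above can be sketched in a few lines. This is a minimal illustration, not the code in `convert_lightx2v_nvfp4_to_comfy.py`: the helper names `swap_nibbles` and `convert_scales` are hypothetical, the sample values are made up, and plain `bytes` and `float` stand in for the `torch.uint8` weight tensor and the scalar tensors.

```python
def swap_nibbles(packed: bytes) -> bytes:
    """Swap the high and low 4-bit halves of every packed byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in packed)


def convert_scales(alpha: float, input_global_scale: float):
    """Return (weight_scale_2, input_scale) from the LightX2V scalars."""
    weight_scale_2 = alpha * input_global_scale
    input_scale = 1.0 / input_global_scale
    return weight_scale_2, input_scale


# Illustrative values only, not taken from a real checkpoint.
packed = bytes([0x1F, 0xA3, 0x00])
swapped = swap_nibbles(packed)
print(swapped.hex())               # f13a00
print(convert_scales(0.5, 4.0))    # (2.0, 0.25)

# Swapping twice restores the original bytes, which makes the
# transformation easy to spot-check against the source file.
assert swap_nibbles(swapped) == packed
```

`{layer}.weight_scale` does not appear in the sketch because the rule leaves it unchanged.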

ComfyUI's mixed precision loader expects a `{layer}.comfy_quant` tensor
containing JSON bytes. For NVFP4 it then loads:

- `{layer}.weight`
- `{layer}.weight_scale_2`
- `{layer}.weight_scale`
- optional registered parameters such as `{layer}.input_scale`

The converted file metadata also includes `_quantization_metadata` with one
`nvfp4` layer entry per quantized layer so ComfyUI can select mixed precision
operations for the model.
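
As a rough sketch of the two metadata pieces described above, the snippet below builds the JSON bytes for a `{layer}.comfy_quant` entry and a `_quantization_metadata` string. The helper names are hypothetical, the exact `{"layers": {...}}` shape is an assumption inferred from the `_quantization_metadata.layers` reference, and the second layer name is made up. Plain lists of byte values stand in for the `torch.uint8` tensors.

```python
import json


def encode_comfy_quant(layer_config: dict) -> list:
    """Encode a per-layer config as the byte values of its JSON text."""
    return list(json.dumps(layer_config).encode("utf-8"))


def decode_comfy_quant(byte_values) -> dict:
    """Decode the byte values of a comfy_quant entry back into a dict."""
    return json.loads(bytes(byte_values).decode("utf-8"))


def build_quantization_metadata(layer_names) -> str:
    """Build a metadata string with one nvfp4 entry per quantized layer.

    Assumption: the metadata is a JSON object with a "layers" mapping.
    """
    layers = {name: {"format": "nvfp4"} for name in layer_names}
    return json.dumps({"layers": layers})


# "blocks.0.cross_attn.k" appears in this document; the second name is
# hypothetical and only illustrates a multi-layer file.
layer_names = ["blocks.0.cross_attn.k", "blocks.0.cross_attn.v"]

encoded = encode_comfy_quant({"format": "nvfp4"})
print(decode_comfy_quant(encoded))  # {'format': 'nvfp4'}

metadata = {"_quantization_metadata": build_quantization_metadata(layer_names)}
print(len(json.loads(metadata["_quantization_metadata"])["layers"]))  # 2
```

The metadata is kept as a JSON string because safetensors header metadata stores string values only.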

## H100 note

The conversion itself does not require a Blackwell GPU; it is a safetensors
layout conversion. However, Comfy Kitchen documents `TensorCoreNVFP4Layout` as
requiring SM >= 10.0 / Blackwell for native NVFP4 tensor-core acceleration. H100
is Hopper, so ComfyUI may disable native NVFP4 compute and run a fallback path.
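
The hardware condition can be restated as a small predicate. This is a simplified sketch of the rule described in this document, not ComfyUI's or Comfy Kitchen's actual function; the name `supports_native_nvfp4` is hypothetical. On a machine with PyTorch and CUDA, the `(major, minor)` tuple would typically come from `torch.cuda.get_device_capability()`.

```python
def supports_native_nvfp4(compute_capability) -> bool:
    """Return True when the major compute capability is at least 10."""
    major, _minor = compute_capability
    return major >= 10


print(supports_native_nvfp4((9, 0)))   # H100 (Hopper): False
print(supports_native_nvfp4((10, 0)))  # SM 10.0 (Blackwell): True
```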

## Script

The conversion script is:

```bash
python convert_lightx2v_nvfp4_to_comfy.py
```

Useful options:

```bash
python convert_lightx2v_nvfp4_to_comfy.py --dry-run
python convert_lightx2v_nvfp4_to_comfy.py --overwrite
python convert_lightx2v_nvfp4_to_comfy.py input.safetensors --output-dir /path/to/out
```

The script writes each output as `<original_stem>_comfy.safetensors`. It writes
to a temporary file first and renames it into place once the write completes.
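
The naming and write-then-rename behaviour can be sketched as follows. This is an assumption-laden illustration rather than the script's actual code: `output_path_for` and `write_atomically` are hypothetical names, the input file name is inferred from the output names listed in this document, and a short byte string stands in for the serialized safetensors payload.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def output_path_for(input_path: Path, output_dir: Optional[Path] = None) -> Path:
    """Return <output_dir>/<original_stem>_comfy.safetensors."""
    target_dir = output_dir if output_dir is not None else input_path.parent
    return target_dir / f"{input_path.stem}_comfy.safetensors"


def write_atomically(final_path: Path, payload: bytes) -> None:
    """Write to a temp file in the same directory, then rename into place."""
    fd, tmp_name = tempfile.mkstemp(dir=final_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        # os.replace overwrites an existing destination in a single step.
        os.replace(tmp_name, final_path)
    except BaseException:
        # Remove the partial temp file so failed runs leave nothing behind.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as work_dir:
    source = Path(work_dir) / "Wan2.2-I2V-A14B_NVFP4_Sparse_high.safetensors"
    target = output_path_for(source)
    write_atomically(target, b"placeholder payload")
    print(target.name)  # Wan2.2-I2V-A14B_NVFP4_Sparse_high_comfy.safetensors
```

Creating the temporary file in the destination directory keeps the rename on one filesystem, so a reader never sees a half-written file under the final name.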

## Converted outputs

- `Wan2.2-I2V-A14B_NVFP4_Sparse_high_comfy.safetensors`
- `Wan2.2-I2V-A14B_NVFP4_Sparse_low_comfy.safetensors`
- `Wan2.2-T2V-A14B_NVFP4_Sparse_high_comfy.safetensors`
- `Wan2.2-T2V-A14B_NVFP4_Sparse_low_comfy.safetensors`

For ComfyUI native workflows, place these diffusion model files under:

```text
ComfyUI/models/diffusion_models/
```

The Wan2.2 14B workflows still need the normal text encoder and VAE files in
their ComfyUI locations.

## Verification performed

For each converted file:

- Tensor count is 2695.
- `_quantization_metadata` contains 400 quantized layers.
- `alpha` count is 0.
- `input_global_scale` count is 0.
- `input_scale` count is 400.
- `weight_scale` count is 400.
- `weight_scale_2` count is 400.
- `comfy_quant` count is 400.
- `{layer}.comfy_quant` decodes to `{"format": "nvfp4"}`.
- A sampled block of `blocks.0.cross_attn.k.weight` equals the nibble-swapped
  bytes of the same block in the original file.
- The sampled `weight_scale_2` equals `alpha * input_global_scale`.
- The sampled `input_scale` equals `1 / input_global_scale`.
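
The count checks above could be reproduced with a sketch like the one below. It works on a plain list of tensor names so that it runs without any model file; with the `safetensors` package, the names would typically come from `safe_open(path, framework="pt").keys()`. The helper names and the toy layer names are hypothetical, and this is not necessarily how the checks were originally run.

```python
TRACKED = ("alpha", "input_global_scale", "input_scale",
           "weight_scale", "weight_scale_2", "comfy_quant")


def count_by_suffix(keys):
    """Count tensor names by their final dotted component."""
    counts = {name: 0 for name in TRACKED}
    for key in keys:
        suffix = key.rsplit(".", 1)[-1]
        if suffix in counts:
            counts[suffix] += 1
    return counts


def find_problems(keys, expected_layers):
    """Return a list of human-readable mismatches (empty when all pass)."""
    counts = count_by_suffix(keys)
    problems = []
    # LightX2V-only tensors must be gone after conversion.
    for name in ("alpha", "input_global_scale"):
        if counts[name] != 0:
            problems.append(f"{name}: expected 0, found {counts[name]}")
    # Comfy-side tensors must appear once per quantized layer.
    for name in ("input_scale", "weight_scale", "weight_scale_2", "comfy_quant"):
        if counts[name] != expected_layers:
            problems.append(
                f"{name}: expected {expected_layers}, found {counts[name]}")
    return problems


def toy_keys(layers):
    """Build the tensor names a converted layer is expected to carry."""
    parts = ("weight", "weight_scale", "weight_scale_2",
             "input_scale", "comfy_quant")
    return [f"{layer}.{part}" for layer in layers for part in parts]


keys = toy_keys(["blocks.0.cross_attn.k", "blocks.0.cross_attn.v"])
print(find_problems(keys, expected_layers=2))  # []
print(find_problems(keys + ["blocks.0.cross_attn.k.alpha"], expected_layers=2))
# ['alpha: expected 0, found 1']
```

Matching on the final dotted component keeps `weight_scale` distinct from `weight_scale_2`, and `input_scale` distinct from `input_global_scale`. For the files in this directory, `expected_layers` would be 400.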