	# Wan2.2 NVFP4 Sparse to ComfyUI Conversion Analysis

	## Sources checked

	- Kijai Hugging Face repo: https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
	- ComfyUI Wan2.2 workflow docs: https://docs.comfy.org/tutorials/video/wan/wan2_2
	- ComfyUI Wan2.2 examples: https://comfyanonymous.github.io/ComfyUI_examples/wan22/
	- ComfyUI mixed precision loader reference:
	https://huggingface.co/mhnakif/comfy/blob/main/comfy/ops.py
	- ComfyUI quant op reference:
	https://huggingface.co/mhnakif/comfy/blob/main/comfy/quant_ops.py
	- Comfy Kitchen hardware/backend reference:
	https://github.com/Comfy-Org/comfy-kitchen
	- Local ComfyUI source checkout used for verification:
	`Comfy-Org/ComfyUI` commit
	`5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e`

	## Script provenance

	`convert_lightx2v_nvfp4_to_comfy.py` is a local conversion script written for
	this directory. It is not copied from one upstream script. The implementation is
	derived from these upstream pages and ComfyUI source conventions:

	- Kijai model page:
	  https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
	  - This page gives the actual LightX2V NVFP4 to Comfy NVFP4 conversion rule:
	    nibble-swap packed U8 weights, keep `weight_scale`, set
	    `weight_scale_2 = alpha * input_global_scale`, and set
	    `input_scale = 1 / input_global_scale`.
	- ComfyUI quantized loader:
	  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/ops.py#L1058-L1091
	  - This loader reads `{layer}.comfy_quant`, branches on `format == "nvfp4"`,
	    then requires `{layer}.weight_scale_2` and `{layer}.weight_scale`.
	- ComfyUI quant algorithm registry:
	  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/quant_ops.py#L190-L205
	  - This defines the `nvfp4` storage dtype as `torch.uint8` and the parameter
	    set as `weight_scale`, `weight_scale_2`, and `input_scale`.
	- ComfyUI quantization metadata handling:
	  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/utils.py#L1360-L1421
	  - This shows that `_quantization_metadata.layers` is converted into
	    `{layer}.comfy_quant` JSON byte tensors and that the presence of
	    `.comfy_quant` enables mixed quantized ops.
	- ComfyUI native NVFP4 hardware gate:
	  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/model_management.py#L1877-L1885
	  - This returns true only for NVIDIA GPUs with compute capability major
	    version >= 10, which is why H100 can validate/load files but is not expected
	    to use native Blackwell NVFP4 tensor-core compute.

	## Format findings

	The original files in this directory are LightX2V NVFP4 Sparse safetensors. Each
	file has 400 quantized Linear layers with these LightX2V-side tensors:

	- `{layer}.weight`: packed NVFP4 values in `torch.uint8`
	- `{layer}.weight_scale`: FP8 E4M3 block scale tensor
	- `{layer}.alpha`: scalar post-matmul rescaler
	- `{layer}.input_global_scale`: scalar global input scale in the LightX2V
	convention (its reciprocal becomes the ComfyUI `input_scale`)

	Kijai's model card says the ComfyUI version is still NVFP4 and uses the same
	datatype; only the conventions change:

	- Swap the high and low nibbles in each packed `uint8` weight byte.
	- Keep `{layer}.weight_scale` as-is.
	- Set `{layer}.weight_scale_2` to
	`{layer}.alpha * {layer}.input_global_scale`.
	- Set `{layer}.input_scale` to `1 / {layer}.input_global_scale`.
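
The rule above can be sketched in plain Python. This is a minimal illustration, not the code in `convert_lightx2v_nvfp4_to_comfy.py`: the helper names `swap_nibbles` and `convert_layer` are hypothetical, packed weights are modeled as `bytes` rather than `torch.uint8` tensors, and the two scalars are treated as Python floats.

```python
def swap_nibbles(packed: bytes) -> bytes:
    """Swap the high and low 4-bit halves of every packed byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in packed)


def convert_layer(src: dict, layer: str) -> dict:
    """Map one layer's LightX2V-side entries to the ComfyUI-side names.

    Assumption: `src` maps tensor names to raw bytes (weights, block scales)
    or floats (scalars). The real script works on safetensors tensors.
    """
    alpha = float(src[f"{layer}.alpha"])
    input_global_scale = float(src[f"{layer}.input_global_scale"])
    return {
        f"{layer}.weight": swap_nibbles(src[f"{layer}.weight"]),
        # The FP8 block scales are passed through untouched.
        f"{layer}.weight_scale": src[f"{layer}.weight_scale"],
        f"{layer}.weight_scale_2": alpha * input_global_scale,
        f"{layer}.input_scale": 1.0 / input_global_scale,
    }
```

Swapping nibbles twice returns the original bytes, so a round trip is a cheap sanity check. `alpha` and `input_global_scale` do not appear in the returned mapping.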

	ComfyUI's mixed precision loader expects a `{layer}.comfy_quant` tensor
	containing JSON bytes. For NVFP4 it then loads:

	- `{layer}.weight`
	- `{layer}.weight_scale_2`
	- `{layer}.weight_scale`
	- optional registered parameters such as `{layer}.input_scale`

	The converted file metadata also includes `_quantization_metadata` with one
	`nvfp4` layer entry per quantized layer so ComfyUI can select mixed precision
	operations for the model.
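
As a rough illustration of the two metadata pieces described above, the sketch below builds the JSON payload for a `{layer}.comfy_quant` tensor and a header-level `_quantization_metadata` entry. The function names are hypothetical, and the exact metadata schema is an assumption: only the `layers` key and the `{"format": "nvfp4"}` entry are taken from this document, and the real schema may carry additional fields.

```python
import json


def encode_comfy_quant(fmt: str = "nvfp4") -> bytes:
    """Return the JSON bytes that a `{layer}.comfy_quant` tensor would hold."""
    return json.dumps({"format": fmt}).encode("utf-8")


def build_quantization_metadata(layers: list) -> dict:
    """Build a safetensors-style metadata dict (string keys, string values).

    Assumption: one `{"format": "nvfp4"}` entry per layer under `layers`.
    """
    payload = {"layers": {name: {"format": "nvfp4"} for name in layers}}
    # safetensors header metadata only stores strings, so the JSON is dumped.
    return {"_quantization_metadata": json.dumps(payload)}
```

In a real file the bytes from `encode_comfy_quant` would be stored as a `uint8` tensor; decoding them with `json.loads` should give back `{"format": "nvfp4"}`.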

	## H100 note

	The conversion itself does not require a Blackwell GPU; it only rewrites the
	safetensors layout. However, Comfy Kitchen documents `TensorCoreNVFP4Layout` as
	requiring SM >= 10.0 (Blackwell) for native NVFP4 tensor-core acceleration. H100
	is Hopper (SM 9.0), so ComfyUI may disable native NVFP4 compute and run a
	fallback path.
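
The gate described above reduces to a comparison on the major compute capability. The sketch below is a hypothetical stand-in for that check, not ComfyUI's actual function; the name `supports_native_nvfp4` is made up for illustration.

```python
def supports_native_nvfp4(capability: tuple) -> bool:
    """Return True when the (major, minor) compute capability is SM 10.0+.

    Assumption: only the major version matters, as described for the
    ComfyUI hardware gate in this document.
    """
    major, _minor = capability
    return major >= 10


# H100 reports compute capability 9.0, so the gate is closed.
h100_ok = supports_native_nvfp4((9, 0))
# An SM 10.0 Blackwell part passes.
blackwell_ok = supports_native_nvfp4((10, 0))
```

With PyTorch installed, `torch.cuda.get_device_capability()` returns a `(major, minor)` tuple that could be passed to this helper.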

	## Script

	Run the conversion script with:

	```bash
	python convert_lightx2v_nvfp4_to_comfy.py
	```

	Useful options:

	```bash
	python convert_lightx2v_nvfp4_to_comfy.py --dry-run
	python convert_lightx2v_nvfp4_to_comfy.py --overwrite
	python convert_lightx2v_nvfp4_to_comfy.py input.safetensors --output-dir /path/to/out
	```

	The script writes `<original_stem>_comfy.safetensors`. It writes to a temporary
	file first and then renames it into place.
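
The naming and write-then-rename behavior can be sketched as follows. This is an assumed shape, not the script's actual code: `output_path_for` and `atomic_write` are hypothetical names, and defaulting to the source file's directory when no output directory is given is an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def output_path_for(src: Path, output_dir: Optional[Path] = None) -> Path:
    """Derive `<original_stem>_comfy.safetensors` for a source file."""
    out_dir = output_dir if output_dir is not None else src.parent
    return out_dir / f"{src.stem}_comfy.safetensors"


def atomic_write(dest: Path, payload: bytes) -> None:
    """Write to a temp file in the destination directory, then rename."""
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        # os.replace overwrites an existing destination in one step.
        os.replace(tmp_name, dest)
    except BaseException:
        # Clean up the partial temp file so a failed run leaves no debris.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Creating the temporary file in the same directory as the destination keeps the rename on one filesystem, so a reader never sees a half-written output file.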

	## Converted outputs

	- `Wan2.2-I2V-A14B_NVFP4_Sparse_high_comfy.safetensors`
	- `Wan2.2-I2V-A14B_NVFP4_Sparse_low_comfy.safetensors`
	- `Wan2.2-T2V-A14B_NVFP4_Sparse_high_comfy.safetensors`
	- `Wan2.2-T2V-A14B_NVFP4_Sparse_low_comfy.safetensors`

	For ComfyUI native workflows, place these diffusion model files under:

	```text
	ComfyUI/models/diffusion_models/
	```

	The Wan2.2 14B workflows still need the normal text encoder and VAE files in
	their ComfyUI locations.

	## Verification performed

	For each converted file:

	- Tensor count is 2695.
	- `_quantization_metadata` contains 400 quantized layers.
	- `alpha` count is 0.
	- `input_global_scale` count is 0.
	- `input_scale` count is 400.
	- `weight_scale` count is 400.
	- `weight_scale_2` count is 400.
	- `comfy_quant` count is 400.
	- `{layer}.comfy_quant` decodes to `{"format": "nvfp4"}`.
	- A sampled block of `blocks.0.cross_attn.k.weight` matches the nibble-swapped
	bytes of the original tensor.
	- The sampled `weight_scale_2` equals `alpha * input_global_scale`.
	- The sampled `input_scale` equals `1 / input_global_scale`.
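
The per-suffix counts listed above could be reproduced with a small helper over the tensor names. This is a sketch, not the verification code that was actually run; `count_suffixes` is a hypothetical name, and it assumes the key names have already been read from the file (for example with the `safetensors` library's `safe_open(...).keys()`).

```python
SUFFIXES = (
    "alpha",
    "input_global_scale",
    "input_scale",
    "weight_scale",
    "weight_scale_2",
    "comfy_quant",
)


def count_suffixes(keys) -> dict:
    """Count tensor names by their final dot-separated component."""
    counts = {suffix: 0 for suffix in SUFFIXES}
    for key in keys:
        # Exact match on the last component keeps `weight_scale` and
        # `weight_scale_2` separate.
        suffix = key.rsplit(".", 1)[-1]
        if suffix in counts:
            counts[suffix] += 1
    return counts
```

For a converted file, the expectation from the list above is 400 for `input_scale`, `weight_scale`, `weight_scale_2`, and `comfy_quant`, and 0 for `alpha` and `input_global_scale`.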