# Wan2.2 NVFP4 Sparse to ComfyUI Conversion Analysis

## Sources checked

- Kijai Hugging Face repo: https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
- ComfyUI Wan2.2 workflow docs: https://docs.comfy.org/tutorials/video/wan/wan2_2
- ComfyUI Wan2.2 examples: https://comfyanonymous.github.io/ComfyUI_examples/wan22/
- ComfyUI mixed precision loader reference:
  https://huggingface.co/mhnakif/comfy/blob/main/comfy/ops.py
- ComfyUI quant op reference:
  https://huggingface.co/mhnakif/comfy/blob/main/comfy/quant_ops.py
- Comfy Kitchen hardware/backend reference:
  https://github.com/Comfy-Org/comfy-kitchen
- Local ComfyUI source checkout used for verification:
  `Comfy-Org/ComfyUI` commit
  `5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e`

## Script provenance

`convert_lightx2v_nvfp4_to_comfy.py` is a local conversion script written for
this directory. It is not copied from any single upstream script. The
implementation is derived from these upstream pages and ComfyUI source
conventions:

- Kijai model page:
  https://huggingface.co/Kijai/WanVideo_comfy_nvfp4
  - This page gives the actual LightX2V NVFP4 to Comfy NVFP4 conversion rule:
    nibble-swap packed U8 weights, keep `weight_scale`, set
    `weight_scale_2 = alpha * input_global_scale`, and set
    `input_scale = 1 / input_global_scale`.
- ComfyUI quantized loader:
  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/ops.py#L1058-L1091
  - This loader reads `{layer}.comfy_quant`, branches on `format == "nvfp4"`,
    then requires `{layer}.weight_scale_2` and `{layer}.weight_scale`.
- ComfyUI quant algorithm registry:
  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/quant_ops.py#L190-L205
  - This defines the `nvfp4` storage dtype as `torch.uint8` and the parameter
    set as `weight_scale`, `weight_scale_2`, and `input_scale`.
- ComfyUI quantization metadata handling:
  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/utils.py#L1360-L1421
  - This shows that `_quantization_metadata.layers` is converted into
    `{layer}.comfy_quant` JSON byte tensors and that the presence of
    `.comfy_quant` enables mixed quantized ops.
- ComfyUI native NVFP4 hardware gate:
  https://github.com/Comfy-Org/ComfyUI/blob/5aa71b9bc28809a16596bb9fa3d0a6300d8e3f0e/comfy/model_management.py#L1877-L1885
  - This returns true only for NVIDIA GPUs with compute capability major
    version >= 10. That is why an H100 (Hopper, compute capability 9.0) can
    validate and load the files but is not expected to use native Blackwell
    NVFP4 tensor-core compute.

## Format findings

The original files in this directory are LightX2V NVFP4 Sparse safetensors. Each
file has 400 quantized Linear layers with these LightX2V-side tensors:

- `{layer}.weight`: packed NVFP4 values in `torch.uint8`
- `{layer}.weight_scale`: FP8 E4M3 block scale tensor
- `{layer}.alpha`: scalar post-matmul rescaler
- `{layer}.input_global_scale`: scalar global input scale (LightX2V convention)

Kijai's model card says the ComfyUI version is still NVFP4 and uses the same
datatype; only the storage conventions change:

- Swap the high/low nibbles in each packed `uint8` weight byte.
- Keep `{layer}.weight_scale` as-is.
- Set `{layer}.weight_scale_2` to
  `{layer}.alpha * {layer}.input_global_scale`.
- Set `{layer}.input_scale` to `1 / {layer}.input_global_scale`.
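
The per-layer rule above can be sketched in plain Python. This is only an illustration of the arithmetic, not the code in `convert_lightx2v_nvfp4_to_comfy.py`: `swap_nibbles`, `convert_layer`, and the dict-of-values layer representation are hypothetical stand-ins for the real tensors.

```python
def swap_nibbles(packed: bytes) -> bytes:
    """Swap the high and low 4-bit halves of every packed byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in packed)


def convert_layer(layer: dict) -> dict:
    """Map one layer's LightX2V values to the ComfyUI names and conventions.

    `layer` is a hypothetical stand-in for one layer's tensors: bytes for the
    packed weight and block scales, Python floats for the two scalars.
    """
    alpha = layer["alpha"]
    input_global_scale = layer["input_global_scale"]
    return {
        "weight": swap_nibbles(layer["weight"]),
        "weight_scale": layer["weight_scale"],  # kept as-is
        "weight_scale_2": alpha * input_global_scale,
        "input_scale": 1.0 / input_global_scale,
        # `alpha` and `input_global_scale` are intentionally not carried over.
    }


example = {
    "weight": bytes([0x12, 0xAB, 0xF0]),
    "weight_scale": bytes([0x38]),
    "alpha": 0.5,
    "input_global_scale": 4.0,
}
converted = convert_layer(example)
print(converted["weight"].hex())    # prints 21ba0f
print(converted["weight_scale_2"])  # prints 2.0
print(converted["input_scale"])     # prints 0.25
```

The nibble swap is its own inverse, so applying it twice returns the original bytes. On real data the same bit expression would presumably be applied elementwise to the `torch.uint8` weight tensor rather than to a `bytes` object.
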

ComfyUI's mixed precision loader expects a `{layer}.comfy_quant` tensor
containing JSON bytes. For NVFP4 it then loads:

- `{layer}.weight`
- `{layer}.weight_scale_2`
- `{layer}.weight_scale`
- optional registered parameters such as `{layer}.input_scale`

The converted file metadata also includes `_quantization_metadata` with one
`nvfp4` layer entry per quantized layer so ComfyUI can select mixed precision
operations for the model.
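
As a rough illustration of the two markers described above, the sketch below builds the JSON bytes for a `{layer}.comfy_quant` entry and a `_quantization_metadata` string with one `nvfp4` entry per layer. The helper names are hypothetical, and the exact metadata schema is an assumption: only the `layers` key and the `{"format": "nvfp4"}` payload come from the findings in this document, so any extra fields ComfyUI may expect are not shown.

```python
import json


def comfy_quant_bytes(fmt: str = "nvfp4") -> bytes:
    """JSON payload stored (as a uint8 tensor) under `{layer}.comfy_quant`."""
    return json.dumps({"format": fmt}).encode("utf-8")


def build_quant_markers(layer_names):
    """Return (file metadata dict, per-layer marker dict) for the given layers.

    Assumption: the metadata value is a JSON string with a top-level "layers"
    mapping; safetensors metadata values must be strings.
    """
    layers = {name: {"format": "nvfp4"} for name in layer_names}
    metadata = {"_quantization_metadata": json.dumps({"layers": layers})}
    markers = {f"{name}.comfy_quant": comfy_quant_bytes() for name in layer_names}
    return metadata, markers


metadata, markers = build_quant_markers(["blocks.0.cross_attn.k"])
print(markers["blocks.0.cross_attn.k.comfy_quant"].decode("utf-8"))
# prints {"format": "nvfp4"}
```
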

## H100 note

The conversion itself does not require a Blackwell GPU; it is a safetensors
layout conversion. However, Comfy Kitchen documents `TensorCoreNVFP4Layout` as
requiring SM >= 10.0 / Blackwell for native NVFP4 tensor-core acceleration. H100
is Hopper (compute capability 9.0), so ComfyUI may disable native NVFP4 compute
and run a fallback path.
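
The gate described above reduces to a comparison on the major compute capability. The function below is a simplified, hypothetical stand-in for that check (the real ComfyUI function also verifies that the device is an NVIDIA GPU). It takes a `(major, minor)` tuple, which is the shape returned by `torch.cuda.get_device_capability()`.

```python
def supports_native_nvfp4(capability: tuple) -> bool:
    """Return True when the major compute capability is at least 10."""
    major, _minor = capability
    return major >= 10


print(supports_native_nvfp4((9, 0)))   # H100 (Hopper): prints False
print(supports_native_nvfp4((10, 0)))  # SM 10.0 (Blackwell): prints True
```
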

## Script

The conversion script is:

```bash
python convert_lightx2v_nvfp4_to_comfy.py
```

Useful options:

```bash
python convert_lightx2v_nvfp4_to_comfy.py --dry-run
python convert_lightx2v_nvfp4_to_comfy.py --overwrite
python convert_lightx2v_nvfp4_to_comfy.py input.safetensors --output-dir /path/to/out
```

The script writes `<original_stem>_comfy.safetensors` and uses a temporary file
before renaming into place.
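
The naming and write-then-rename pattern can be sketched with the standard library. This is not the script's actual code: `output_path_for` and `write_atomically` are hypothetical helpers, and both the default output directory (next to the input) and the overwrite check are assumptions made for the example.

```python
import os
import tempfile
from pathlib import Path


def output_path_for(input_path, output_dir=None) -> Path:
    """Derive `<original_stem>_comfy.safetensors` for an input file."""
    src = Path(input_path)
    # Assumption: default to the input's own directory.
    out_dir = Path(output_dir) if output_dir is not None else src.parent
    return out_dir / f"{src.stem}_comfy.safetensors"


def write_atomically(dest: Path, payload: bytes, overwrite: bool = False) -> None:
    """Write to a temporary file in the target directory, then rename it."""
    if dest.exists() and not overwrite:
        raise FileExistsError(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.replace(tmp_name, dest)  # atomic when source and target share a filesystem
    except BaseException:
        # Remove the partial temporary file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

Creating the temporary file in the destination directory keeps the final rename on one filesystem, so an interrupted run should leave either the old file or no file, never a truncated output.
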

## Converted outputs

- `Wan2.2-I2V-A14B_NVFP4_Sparse_high_comfy.safetensors`
- `Wan2.2-I2V-A14B_NVFP4_Sparse_low_comfy.safetensors`
- `Wan2.2-T2V-A14B_NVFP4_Sparse_high_comfy.safetensors`
- `Wan2.2-T2V-A14B_NVFP4_Sparse_low_comfy.safetensors`

For ComfyUI native workflows, place these diffusion model files under:

```text
ComfyUI/models/diffusion_models/
```

The Wan2.2 14B workflows still need the normal text encoder and VAE files in
their ComfyUI locations.

## Verification performed

For each converted file:

- Tensor count is 2695.
- `_quantization_metadata` contains 400 quantized layers.
- `alpha` count is 0.
- `input_global_scale` count is 0.
- `input_scale` count is 400.
- `weight_scale` count is 400.
- `weight_scale_2` count is 400.
- `comfy_quant` count is 400.
- `{layer}.comfy_quant` decodes to `{"format": "nvfp4"}`.
- A sampled `blocks.0.cross_attn.k.weight` block equals the expected nibble
  swap from the original.
- The sampled `weight_scale_2` equals `alpha * input_global_scale`.
- The sampled `input_scale` equals `1 / input_global_scale`.
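
The count-based checks above can be reproduced without loading any tensor data, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header that lists every tensor name plus an optional `__metadata__` entry. The sketch below is one possible way to do that with the standard library; the helper names are hypothetical, and the `layers` key inside `_quantization_metadata` is assumed to be a JSON mapping.

```python
import json
import struct

SUFFIXES = ("alpha", "input_global_scale", "input_scale",
            "weight_scale", "weight_scale_2", "comfy_quant")


def read_safetensors_header(path) -> dict:
    """Read only the JSON header of a safetensors file."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def count_suffixes(tensor_names) -> dict:
    """Count tensors by the last dot-separated component of their name."""
    counts = {suffix: 0 for suffix in SUFFIXES}
    for name in tensor_names:
        suffix = name.rsplit(".", 1)[-1]
        if suffix in counts:
            counts[suffix] += 1
    return counts


def summarize(path) -> dict:
    """Collect tensor count, suffix counts, and quantized-layer count."""
    header = read_safetensors_header(path)
    metadata = header.pop("__metadata__", None) or {}
    # Assumption: the metadata value is a JSON string with a "layers" mapping.
    quant = json.loads(metadata.get("_quantization_metadata", "{}"))
    return {
        "tensor_count": len(header),
        "quantized_layers": len(quant.get("layers", {})),
        "suffix_counts": count_suffixes(header),
    }
```

For a converted file, `summarize(path)` would be expected to report the counts listed above; the sampled value checks (nibble swap and scale equalities) still require reading the tensors themselves.
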