NVFP4 text encoder possible?

#10
by hidden2u - opened

The I2V models work surprisingly well, with speed improvements!

I noticed LTX-2 shipped with its text encoder at FP4; would that be possible with umt5_xxl as well?

Of course. I actually made an LTX-2 Gemma text encoder nvfp4 quant before it was even supported (support was added afterwards), and I don't expect any issues with umt5_xxl.
I should be able to quantize https://huggingface.co/Kijai/WanVideo_comfy/blob/main/umt5-xxl-enc-bf16.safetensors if I make a new model def.

I think a config like

```
".weight"
  ".q." | ".k." | ".v." | ".fc1." = 0
  ".o." | ".fc2." = 1
```
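As a rough illustration of how substring rules like these could map onto umt5_xxl weight names, here is a minimal sketch (the layer names and the `quant_group` helper are illustrative assumptions, not the actual quantizer's API or config parser):

```python
# Illustrative only: apply the substring patterns from the config above
# to decide which quantization group a ".weight" tensor falls into.
RULES = [
    ([".q.", ".k.", ".v.", ".fc1."], 0),  # group 0
    ([".o.", ".fc2."], 1),                # group 1
]

def quant_group(name: str):
    """Return the quant group for a '.weight' tensor, or None to skip it."""
    if not name.endswith(".weight"):
        return None
    for patterns, group in RULES:
        if any(p in name for p in patterns):
            return group
    return None  # unmatched weights (norms, embeddings) stay unquantized

print(quant_group("encoder.block.0.layer.0.SelfAttention.q.weight"))  # -> 0
```

The point is just that attention and MLP projections get quantized (possibly with different schemes), while everything else, such as layer norms and the embedding table, is left alone.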

will work. I'll upload the model once it's tested and confirmed working.

Sorry, it turns out this isn't currently supported in ComfyUI and leads to a size mismatch error when the embeddings are used. I could upload the model if you want, but it won't be usable until Comfy is updated to support it. Right now it ignores the quantization metadata, and since nvfp4 is stored in uint8 (two 4-bit values per byte), loading the data without following the quant rules yields a tensor with half the expected number of elements, which causes a size mismatch error sooner or later.
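The arithmetic behind that mismatch can be sketched in a few lines (the weight shape is a made-up example, and real nvfp4 storage also carries scale tensors, which are omitted here):

```python
import numpy as np

# nvfp4 packs two 4-bit values per uint8 byte, so a tensor of N logical
# elements occupies N/2 bytes on disk (scale metadata omitted).
logical_elems = 4096 * 4096            # e.g. one large projection weight
packed = np.zeros(logical_elems // 2, dtype=np.uint8)

# A loader that honors the quant metadata unpacks two nibbles per byte:
unpacked_elems = packed.size * 2
assert unpacked_elems == logical_elems

# A loader that ignores the metadata treats each byte as one element,
# producing a tensor with half the expected entries -> shape mismatch
# as soon as it is used in a matmul or embedding lookup.
naive_elems = packed.size
print(naive_elems, logical_elems)
```

This is why the file can load "successfully" and only blow up later: the raw bytes are valid, but the reconstructed tensor is half the size the model definition expects.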
