NVFP4 text encoder possible?

#10
by hidden2u - opened

The I2V models work surprisingly well, with speed improvements!

I noticed LTX-2 shipped with its text encoder at FP4; would that be possible with umt5_xxl as well?

Of course. I actually made an LTX-2 Gemma text encoder nvfp4 quant before it was even supported (support was added afterwards), and I don't expect any issues with umt5_xxl.
I should be able to quantize https://huggingface.co/Kijai/WanVideo_comfy/blob/main/umt5-xxl-enc-bf16.safetensors if I make a new model def.

I think a config like

```
".weight"
  ".q." | ".k." | ".v." | ".fc1." = 0
  ".o." | ".fc2." = 1
```
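As a rough illustration of how substring rules like these could map onto umt5_xxl weight names, here is a minimal sketch (the layer names and the `quant_group` helper are illustrative assumptions, not the actual quantizer's API or config parser):

```python
# Illustrative only: apply the substring patterns from the config above
# to decide which quantization group a ".weight" tensor falls into.
RULES = [
    ([".q.", ".k.", ".v.", ".fc1."], 0),  # group 0
    ([".o.", ".fc2."], 1),                # group 1
]

def quant_group(name: str):
    """Return the quant group for a '.weight' tensor, or None to skip it."""
    if not name.endswith(".weight"):
        return None
    for patterns, group in RULES:
        if any(p in name for p in patterns):
            return group
    return None  # unmatched weights (norms, embeddings) stay unquantized

print(quant_group("encoder.block.0.layer.0.SelfAttention.q.weight"))  # -> 0
```

The point is just that attention and MLP projections get quantized (possibly with different schemes), while everything else, such as layer norms and the embedding table, is left alone.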

will work. I'll upload the model once it's tested and confirmed working.

Sorry, it turns out this isn't currently supported in ComfyUI and leads to a size mismatch error when the embeddings are used. I could upload the model if you want, but it won't be usable until Comfy is updated to support it. Right now it ignores the quantization metadata, and since nvfp4 is stored in uint8 (two 4-bit values per byte), loading the data without following the quant rules yields a tensor with half the expected number of elements, which causes a size mismatch error sooner or later.
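The arithmetic behind that mismatch can be sketched in a few lines (the weight shape is a made-up example, and real nvfp4 storage also carries scale tensors, which are omitted here):

```python
import numpy as np

# nvfp4 packs two 4-bit values per uint8 byte, so a tensor of N logical
# elements occupies N/2 bytes on disk (scale metadata omitted).
logical_elems = 4096 * 4096            # e.g. one large projection weight
packed = np.zeros(logical_elems // 2, dtype=np.uint8)

# A loader that honors the quant metadata unpacks two nibbles per byte:
unpacked_elems = packed.size * 2
assert unpacked_elems == logical_elems

# A loader that ignores the metadata treats each byte as one element,
# producing a tensor with half the expected entries -> shape mismatch
# as soon as it is used in a matmul or embedding lookup.
naive_elems = packed.size
print(naive_elems, logical_elems)
```

This is why the file can load "successfully" and only blow up later: the raw bytes are valid, but the reconstructed tensor is half the size the model definition expects.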
