Please support INT8 Model loader for x2 Speed boost

#62

by AndyDev92 - opened 6 days ago

I get a 2x speedup with an INT8 LTX 2.3 model: generation time is halved on an RTX 3060. Could you please support an INT8 model loader with LoRA support? That would be a dream. There are already several INT8 models here: https://huggingface.co/bertbobson/ComfyUI-INT8_ConvRot/tree/main and I am using the custom node from https://github.com/overpresentme/ComfyUI-ltx-int8-loader/ to load the model; you just replace the model loader. The only problem: LoRAs do not work. There is another loader that does work with LoRAs, https://github.com/BobJohnson24/ComfyUI-INT8-Fast, but its model initialization is very slow, longer than the generation itself, and I don't know why. I hope you can do it. Thank you from my heart.

RuneXX

6 days ago

I've got an old, rusty 3xxx card myself, so I actually tried that INT8 loader. It is definitely faster at the actual inference step.
I didn't try any LoRAs, though.

Something more robust and native to ComfyUI would be great, but judging from the scattered comments here and there, I think it may not be as easy as it sounds, and FP8 is generally seen as the better alternative.
But Kijai will know better ;-)

anr2me

5 days ago

For GPUs that don't support FP8 natively, INT8 will certainly be faster 🤔 since FP8 gets upcast to BF16/FP16 for the actual compute.
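
To make that concrete, here is a rough sketch of how a loader could pick a weight format from the GPU's CUDA compute capability. The function name `preferred_weight_format` and the returned labels are made up for illustration and are not part of ComfyUI or any of the nodes linked above. The thresholds are assumptions based on NVIDIA's published hardware generations: native FP8 tensor cores from compute capability 8.9 (Ada) upward, INT8 tensor cores from 7.5 (Turing) upward.

```python
def preferred_weight_format(major: int, minor: int) -> str:
    """Pick a weight format for a GPU given its CUDA compute capability.

    Hypothetical helper, not an API of ComfyUI or the linked custom nodes.
    Thresholds are assumptions taken from NVIDIA hardware generations.
    """
    capability = (major, minor)
    if capability >= (8, 9):
        # Ada (8.9), Hopper (9.0) and newer have native FP8 tensor cores.
        return "fp8"
    if capability >= (7, 5):
        # Turing (7.5) and Ampere (8.x) have INT8 tensor cores but no FP8,
        # so FP8 weights would be upcast to BF16/FP16 before the matmul.
        return "int8"
    # Older cards: fall back to plain half precision.
    return "fp16"


# An RTX 3060 is Ampere, compute capability 8.6.
print(preferred_weight_format(8, 6))
```

With PyTorch installed, the `(major, minor)` pair can be read from `torch.cuda.get_device_capability()`. Whether INT8 is actually faster in practice still depends on the kernels the loader uses, so treat this as a starting heuristic only.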
