Thank You For The Amazing Work!
Your work has breathed fresh life into my GPU. I am benefiting from both the extension and the checkpoints you are making.
I made some quick comparisons of how the newest int8 variant performs, and I am pleased with the results.
If you are open to suggestions, though, I would recommend making the dynamic lora behavior the default, with an option to disable it. In my experience the lora effect is just too weak without it, and the >10% speed hit seems to be a necessary evil.
Finally, I would like to ask whether you have the full quantization pipeline script used to make these checkpoints available anywhere, if you are willing to share it. I am interested in trying to make some myself.
Making your own is exceedingly simple:
https://github.com/BobJohnson24/ComfyUI-INT8-Fast/blob/main/example_workflows/int8_save_convrot_model.json
The issue with non-dynamic lora is somewhat known, though your example appears more severely affected than anything I've seen in my own testing: the SNR (dB) for the loras I tested was still in the ~25 range, which means the changes are mostly subtle. From a quick glance at your result, it looks like it would score around 8 dB at best. I'd be interested in trying the lora out, if it is public, to see whether there are additional bugs that need fixing.
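For context, here is a minimal sketch of the kind of SNR measurement I mean (a generic helper for comparing two same-seed outputs as tensors, not code from the extension):

```python
import torch

def snr_db(reference: torch.Tensor, test: torch.Tensor) -> float:
    # Treat `reference` as the signal and the difference to `test` as noise:
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    ref = reference.float()
    noise = ref - test.float()
    signal_power = ref.pow(2).mean()
    noise_power = noise.pow(2).mean().clamp_min(1e-12)  # avoid log10(0)
    return (10.0 * torch.log10(signal_power / noise_power)).item()

# Example: compare a BF16 reference run against an int8 run with the same
# seed. ~25 dB means mostly subtle deviations; ~8 dB means heavy divergence.
```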
Dynamic lora of course has its own fair share of issues, from only supporting standard loras to slowing down inference.
My preferred approach is to use the pre-lora node/input to merge the lora into a BF16 model before converting, with on-the-fly quantization/ConvRot enabled.
Of course that means you now have to keep a BF16 checkpoint around, but it supports more lora varieties and does not slow down inference beyond the initial conversion time (~25 seconds, for example, with Chroma BF16).
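For anyone curious what that merge step amounts to, here is a minimal sketch of the usual lora fold with alpha/rank scaling (a generic illustration, not the node's actual implementation):

```python
import torch

def merge_lora_weight(base_w: torch.Tensor,
                      lora_down: torch.Tensor,  # "A" matrix, shape [rank, in_features]
                      lora_up: torch.Tensor,    # "B" matrix, shape [out_features, rank]
                      alpha: float,
                      strength: float = 1.0) -> torch.Tensor:
    # Standard lora folding: W' = W + strength * (alpha / rank) * (up @ down).
    # Computed in float32 and cast back, so the BF16 base loses no extra precision.
    rank = lora_down.shape[0]
    scale = strength * alpha / rank
    delta = (lora_up.float() @ lora_down.float()) * scale
    return (base_w.float() + delta).to(base_w.dtype)
```

Once every affected weight is folded this way, the result is an ordinary BF16 checkpoint, which is why the merged model behaves like any other at inference time.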
Just as an example, I downloaded the first random chroma lora I could find, and it works just fine with non-dynamic lora.
Oh, I didn't expect making a checkpoint to be so simple. Thanks!
As for the dynamic lora, I grabbed an anima lora from civit, tried it with the dynamic lora option off, and it indeed worked well. (I was getting an "aimdo: /project/src/model-vbar.c:74:WARNING:VBAR 0x7f063895a200: Page 273 pin_count=1" message at each step, though; not sure whether it is relevant.)
Then I tried another lora I made and it worked fine with the dynamic lora off.
This particular lora seems to be the problem for some reason. I wasn't originally intending to release it publicly, but in the hope that it helps solve the problem, here is a temporary link:
"https://litter.catbox.moe/12piwnf7qn2d4fjl.safetensors"
TW info is in the metadata. I don't think there is anything highly unusual about it. It was trained with sd-scripts (but so were my other loras that work fine with dynamic lora off).
I will also share the workflow that the lora didn't work with:
"https://litter.catbox.moe/sln10yhliq0pi4jf.png"
Thanks for the response.


