What was the process to quantize?
I tried to load previous chroma I had quantized with https://github.com/silveroxides/convert_to_quant and got a big fat error.
There is a big difference between int8 tensorwise (fairly new; not previously implemented in convert_to_quant) and int8 blockwise, so blockwise is not supported in my node. I have not been able to get any speed-up with blockwise.
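For context, the two schemes differ in where the quantization scale lives: tensorwise uses one scale for the whole weight tensor, blockwise one scale per small block. A minimal illustrative sketch (my own toy code, not what convert_to_quant actually does):

```python
import numpy as np

def quantize_tensorwise(w):
    # One scale for the entire tensor -> a single multiply at dequant time,
    # which is easy to fuse into a fast int8 matmul.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def quantize_blockwise(w, block=64):
    # One scale per `block` contiguous elements -> better accuracy on
    # outlier-heavy weights, but dequant needs a per-block rescale,
    # which is harder to accelerate.
    flat = w.reshape(-1, block)
    scales = np.abs(flat).max(axis=1, keepdims=True) / 127.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid div-by-zero
    q = np.clip(np.round(flat / scales), -127, 127).astype(np.int8)
    return q.reshape(w.shape), scales
```

The stored scale layout is why a model quantized one way cannot simply be loaded by a node expecting the other.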
This is the branch needed for tensorwise quantization: https://github.com/silveroxides/convert_to_quant/tree/feature/int8-refactor
I am currently requantizing the klein models for better quality and will upload a good chroma int8 model soon.
Here is the improved command that was given to me by silver:
`convert_to_quant -i ./Chroma1-HD.safetensors -o ./Chroma1-HD-int8tensormixed.safetensors --int8 --scaling_mode tensor --distillation_large --comfy_quant --save-quant-metadata --optimizer radam --calib_samples 8192 --num_iter 3000 --lr "7.1267000000029e-4" --top_p 0.2 --min_k 256 --max_k 1024 --lr_schedule plateau --lr_patience 5 --lr_factor 0.92 --lr_min "1e-9" --lr_cooldown 1 --lr_threshold 1e-8 --early-stop-lr 8e-9 --lr-shape-influence 1.5 --low-memory`
I will give it a try. I think he has merged tensorwise by now. No wonder I got NaNs trying the node on my old model. I still have problems with torch compile, though.
edit: it needs his kitchen fork too. With the converted model it seems to edge out descaled fp8 on my Turing card, and it compiles. I think quality is slightly better.