8bpw KL div

#2 by UnstableLlama

I noticed that the KL div of the 8bpw quant is higher than that of the 5bpw. Is that correct? It shouldn't be.

I noticed this anomaly as well while compiling the measurement table. I've verified the numbers against my model_diff.py script logs across all quantization levels, and they are accurate as reported.

I haven't determined the root cause yet. Running the measurements on exllama v0.0.21 or later might yield different results; I'll try to investigate tomorrow if I can find the time.

P.S. It's great to see someone actually reviewing the measurement table! 😊

Thank you. I've passed this along to the Discord.

P.S. It's great to have measurement tables to read! I can't believe so many people still release unmeasured quants.

This is definitely not getting the right scores on v0.0.20. It looks badly broken, in fact. Testing on v0.0.22 gives more reasonable numbers:

| Quant | Size (GB) | KL-div (quant, orig) | KL-div (orig, quant) | Perplexity | Top-K (K=1) | Top-K (K=2) | Top-K (K=3) | Top-K (K=4) | Top-K (K=5) |
|---|---|---|---|---|---|---|---|---|---|
| 5.0bpw | 47 | 0.02707323 | 0.02719177 | 7.67866605 | 0.9371 | 0.8012 | 0.6333 | 0.4678 | 0.3284 |
| 8.0bpw | 75 | 0.00895185 | 0.00893667 | 7.70824983 | 0.9655 | 0.8826 | 0.7671 | 0.6404 | 0.5158 |
| original | 148 | - | - | 7.70801559 | - | - | - | - | - |
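For readers unfamiliar with the columns: below is a minimal sketch (not the actual model_diff.py implementation) of how metrics like these could be computed from per-token logits produced by the original and quantized models on the same evaluation text. The tensor names, the exact top-K criterion (exact match of the two models' top-k token sets), and the use of PyTorch are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def compare_models(orig_logits, quant_logits, targets, top_k=5):
    """Hypothetical comparison of two models' next-token distributions.

    orig_logits, quant_logits: [num_tokens, vocab_size] logits over the
    same evaluation tokens; targets: [num_tokens] reference token ids.
    """
    orig_logprobs = F.log_softmax(orig_logits.float(), dim=-1)
    quant_logprobs = F.log_softmax(quant_logits.float(), dim=-1)

    # Mean per-token KL(quant || orig) and KL(orig || quant).
    # F.kl_div(input, target) computes KL(target || input).
    kl_quant_orig = F.kl_div(orig_logprobs, quant_logprobs,
                             log_target=True, reduction="batchmean").item()
    kl_orig_quant = F.kl_div(quant_logprobs, orig_logprobs,
                             log_target=True, reduction="batchmean").item()

    # Perplexity of the quantized model on the reference tokens.
    perplexity = torch.exp(F.nll_loss(quant_logprobs, targets)).item()

    # Top-K agreement (assumed definition): fraction of positions where
    # the two models' top-k token sets are identical.
    agreement = {}
    for k in range(1, top_k + 1):
        orig_topk = orig_logprobs.topk(k, dim=-1).indices.sort(dim=-1).values
        quant_topk = quant_logprobs.topk(k, dim=-1).indices.sort(dim=-1).values
        agreement[k] = (orig_topk == quant_topk).all(dim=-1).float().mean().item()

    return kl_quant_orig, kl_orig_quant, perplexity, agreement
```

Under this reading, the table above is consistent: the 8bpw quant has lower KL divergence in both directions, a perplexity closer to the original model's, and higher top-K agreement than the 5bpw quant, which is the expected ordering.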
