Feature (kinda) - A couple of Quant Comparisons

#581
by GeoMaciolek - opened

Not a new column or anything like that. Just pick one or two stand-out or commonly used models and add a few rows that score each model against itself at different quant levels, similar to how you already have "(thinking=true)" or "(THINK prefill)" rows. Something like:

| Model | UGI | NatInt | Writing | Imaginary | Metrics |
|---|---|---|---|---|---|
| RossPerot-v1.0-70B | 23.2 | 34 | 42.0 | Donuts | Orange |
| Clerks-v2.0-9B:IQ2_XS | 9.6 | 14.4 | 28.8 | Pizza | Taupe |
| Clerks-v2.0-9B:Q3_K_M | 33.6 | 57.6 | 64 | Pizza | Puce |
| Clerks-v2.0-9B:Q4_K | 23 | 42 | 13 | Pizza | Puce |
| Rushmore-v1.1-123B:Q2_K | 43 | 43 | 43 | Bacon | Magenta |
| Rushmore-v1.1-123B:Q4_K_M | 47 | 46 | 48 | Bacon | Magenta |

Even if you only did this for one or two models, and only with one or two different quants, it could provide a lot of potential insight for folks using these models regularly.

I think it would be helpful to have at least some examples at both quite small and quite large sizes. The common wisdom is that larger models tolerate quantization better, and it would be interesting to see whether that's reflected in the numbers. Categories? IDK: 12B Mistral, 24B Mistral, 70B L33, etc.

Interesting idea.

Static vs imatrix

Q5_K_S vs IQ4_XS

Just do, like, the most popular model from each lab, as a one-time feature kind of thing.

I'd even kinda like to see Unsloth vs mradermacher vs bartowski quants. (If I had to bet, there wouldn't be a winner; it would vary by model or time period. But who knows? And that's why I'm so excited about it!) People ask all the time about quant choice, and the whole "I've got X GB of VRAM, should I use a Q2 70B or a Q5 32B?" question is still open for a lot of us.
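For the "X GB of VRAM" part of that question, the size side at least is simple arithmetic: parameter count times bits per weight, divided by 8. Here's a rough sketch of that. The function names, the bits-per-weight figures, and the flat 2 GB overhead allowance are all my own illustrative assumptions, not measured values for any real quant file, and real usage also depends on context length and KV cache. It says nothing about quality, which is exactly what the leaderboard rows would tell us.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    # params (in billions) * bits per weight / 8 bits per byte
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus a flat overhead allowance fit in VRAM.

    overhead_gb is a hypothetical placeholder for KV cache, buffers, etc.
    """
    return weights_gb(params_billions, bits_per_weight) + overhead_gb <= vram_gb


# Illustrative bits-per-weight figures (assumed, not measured).
candidates = {
    "70B @ ~3.0 bpw (Q2-ish)": (70, 3.0),
    "32B @ ~5.5 bpw (Q5-ish)": (32, 5.5),
}

for name, (params, bpw) in candidates.items():
    size = weights_gb(params, bpw)
    print(f"{name}: ~{size:.1f} GB, fits in 24 GB: {fits(params, bpw, 24)}")
```

Swap in the actual file sizes of the quants you're considering if you have them; they're more reliable than a nominal bits-per-weight guess.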

Edit: That's what I get for replying on my mobile device. I accidentally closed the thread!

GeoMaciolek changed discussion status to closed
GeoMaciolek changed discussion status to open
