Feature (kinda) - A couple of Quant Comparisons
Not a new column or anything like that. Just pick one or two stand-out or commonly used models and add a few rows for them, scoring each against itself at different quant levels, similar to how you already have "(thinking=true)" or "(THINK prefill)" rows. Something like:
| Model | UGI | NatInt | Writing | Imaginary | Metrics |
|---|---|---|---|---|---|
| RossPerot-v1.0-70B | 23.2 | 34 | 42.0 | Donuts | Orange |
| Clerks-v2.0-9B:IQ2_XS | 9.6 | 14.4 | 28.8 | Pizza | Taupe |
| Clerks-v2.0-9B:Q3_K_M | 33.6 | 57.6 | 64 | Pizza | Puce |
| Clerks-v2.0-9B:Q4_K | 23 | 42 | 13 | Pizza | Puce |
| Rushmore-v1.1-123B:Q2_K | 43 | 43 | 43 | Bacon | Magenta |
| Rushmore-v1.1-123B:Q4_K_M | 47 | 46 | 48 | Bacon | Magenta |
Even if you only did this for one or two models, and only with one or two different quants, it could provide a lot of potential insight for folks using these models regularly.
I think it would be helpful to have at least some examples at both quite small and quite large sizes, since the common wisdom is that larger models tolerate quantization better, and it would be interesting to see whether that's reflected in the numbers. Categories? IDK - 12B Mistral, 24B Mistral, 70B L33, etc.
Interesting idea.
Static vs imatrix
Q5_K_S vs IQ4_XS
Just do, like, the most popular model from each lab as a one-time feature type thing.
I'd even kinda like to see Unsloth vs mradermacher vs bartowski quants. (Which, if I had to bet, there wouldn't be a winner; it would vary by model or time period, but who knows? And that's why I'm so excited about it!) People ask all the time about quant choice, and the whole "I've got X GB of VRAM, should I use a Q2 70B or a Q5 32B?" question is still open for a lot of us.
Edit: That's what I get for replying on my mobile device - accidentally closed the thread!
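For the "X GB of VRAM" question, the size half of the trade-off is at least easy to ballpark. Here's a minimal sketch, assuming weight size is roughly parameters times bits-per-weight divided by 8. The helper name `approx_weight_gb` and the bits-per-weight figures are my own rough assumptions, not measured values, so check the real file sizes of whatever quant you download.

```python
def approx_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of the quantized weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


# Assumed, approximate bits-per-weight values for illustration only.
# Real quants vary by model and quantizer; compare against actual file sizes.
ASSUMED_BPW = {"Q2_K": 3.0, "Q5_K_M": 5.7}

for label, params_b, quant in [("70B", 70, "Q2_K"), ("32B", 32, "Q5_K_M")]:
    size = approx_weight_gb(params_b, ASSUMED_BPW[quant])
    print(f"{label} at {quant}: ~{size:.1f} GB of weights")
```

This only covers the weights; KV cache and context length add more on top. It also says nothing about which option scores better, which is exactly what the proposed quant rows would show.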