NeuroSenko committed · Commit a6e91ea · verified · 1 Parent(s): cd11968

Update README.md

  1. README.md +13 -8
README.md CHANGED
@@ -10,15 +10,20 @@ tags:
10
  - exl3
11
  ---
12
 
13
- Quantization was performed using [exllama3 v0.0.29](https://github.com/turboderp-org/exllamav3).
14
 
15
- [2.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/2.0bpw)
16
- [3.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/3.0bpw)
17
- [4.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/4.0bpw)
18
- [5.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/5.0bpw)
19
- [6.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/6.0bpw)
20
- [7.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/7.0bpw)
21
- [8.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/8.0bpw)
 
 
 
 
 
22
 
23
  <details>
24
  <summary>Quantization Notes</summary>
 
10
  - exl3
11
  ---
12
 
13
+ Quantization was performed using [exllama3 v0.0.29](https://github.com/turboderp-org/exllamav3) (commit `cb1a436`).
14
 
15
+ | Quant | Size (GB) | Actual bpw | Top-1 | Top-2 | Top-3 | Top-4 | Top-5 |
16
+ |---|---|---|---|---|---|---|---|
17
+ | [2.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/2.0bpw) | 55.14 | 2.00 | 76.0% | 41.8% | 18.5% | 7.1% | 2.5% |
18
+ | [3.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/3.0bpw) | 81.61 | 3.00 | 85.6% | 59.3% | 35.1% | 18.5% | 8.9% |
19
+ | [4.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/4.0bpw) | 108.09 | 4.00 | 90.3% | 70.5% | 49.0% | 31.2% | 18.5% |
20
+ | [5.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/5.0bpw) | 134.56 | 5.00 | 92.9% | 77.5% | 59.1% | 41.7% | 27.7% |
21
+ | [6.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/6.0bpw) | 161.18 | 6.00 | 94.4% | 81.5% | 65.2% | 49.1% | 35.0% |
22
+ | [7.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/7.0bpw) | 187.65 | 7.00 | 94.9% | 83.2% | 68.0% | 52.5% | 38.6% |
23
+ | [8.0bpw](https://huggingface.co/NeuroSenko/MiniMax-M2.7-exl3/tree/8.0bpw) | 214.13 | 8.00 | 95.2% | 84.0% | 69.5% | 54.4% | 40.7% |
24
+ | original | 214.36 | 8.00 | — | — | — | — | — |
25
+
26
+ \* Original model produces inf/NaN in layer 61, making PPL and KL divergence non-computable.
27
 
28
  <details>
29
  <summary>Quantization Notes</summary>