NeuroSenko committed · Commit 39336e9 (verified) · 1 Parent(s): eab11fb

Update README.md

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -15,12 +15,14 @@ tags:
  ---
  Quantization was performed using [exllamav3 v0.0.28](https://github.com/turboderp-org/exllamav3) (commit `ea87af6`).

- | Quant | Size (GB) | KL-div (quant, orig) | KL-div (orig, quant) | Perplexity | Top-K K=1 | Top-K K=2 | Top-K K=3 | Top-K K=4 | Top-K K=5 |
- |---|---|---|---|---|---|---|---|---|---|
- | [4.0bpw](https://huggingface.co/NeuroSenko/ToriiGate-0.5-exl3/tree/4.0bpw) | 4 | 0.01497847 | 0.01498618 | 29.09997983 | 0.9314 | 0.8033 | 0.6481 | 0.4939 | 0.3580 |
- | [6.0bpw](https://huggingface.co/NeuroSenko/ToriiGate-0.5-exl3/tree/6.0bpw) | 5 | 0.00126097 | 0.00126228 | 28.66642918 | 0.9794 | 0.9368 | 0.8761 | 0.8010 | 0.7184 |
- | [8.0bpw](https://huggingface.co/NeuroSenko/ToriiGate-0.5-exl3/tree/8.0bpw) | 6 | 0.00020025 | 0.00020012 | 28.62085845 | 0.9910 | 0.9724 | 0.9443 | 0.9079 | 0.8637 |
- | original | 10 | - | - | 28.59648210 | - | - | - | - | - |
+ | Quant | Size (GB) | Actual bpw | PPL | KL-div (q→o) | KL-div (o→q) | Top-1 | Top-2 | Top-3 | Top-4 | Top-5 |
+ |---|---|---|---|---|---|---|---|---|---|---|
+ | [4.0bpw](https://huggingface.co/NeuroSenko/ToriiGate-0.5-exl3/tree/4.0bpw) | 3.94 | 4.00 | 29.100 | 0.0150 | 0.0150 | 93.1% | 80.3% | 64.8% | 49.4% | 35.8% |
+ | [5.0bpw](https://huggingface.co/NeuroSenko/ToriiGate-0.5-exl3/tree/5.0bpw) | 4.51 | 5.00 | 28.854 | 0.0042 | 0.0042 | 96.2% | 88.6% | 78.6% | 67.2% | 55.8% |
+ | [6.0bpw](https://huggingface.co/NeuroSenko/ToriiGate-0.5-exl3/tree/6.0bpw) | 4.92 | 6.00 | 28.666 | 0.0013 | 0.0013 | 97.9% | 93.7% | 87.6% | 80.1% | 71.8% |
+ | [7.0bpw](https://huggingface.co/NeuroSenko/ToriiGate-0.5-exl3/tree/7.0bpw) | 5.34 | 7.00 | 28.610 | 0.0004 | 0.0004 | 98.7% | 96.0% | 92.2% | 87.2% | 81.4% |
+ | [8.0bpw](https://huggingface.co/NeuroSenko/ToriiGate-0.5-exl3/tree/8.0bpw) | 5.75 | 8.00 | 28.621 | 0.0002 | 0.0002 | 99.1% | 97.2% | 94.4% | 90.8% | 86.4% |
+ | original | 9.66 | 16.00 | 28.596 | — | — | — | — | — | — | — |

  ### Metrics
  - **PPL** (Perplexity) — how well the model predicts the next token. Lower is better. The original model's PPL is the baseline.
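
As an illustration of the PPL metric described above, here is a minimal Python sketch. It assumes perplexity is computed as the exponential of the mean negative log-likelihood over the evaluated tokens (natural logs), which is the usual definition but is not confirmed by this README. The helper names `perplexity` and `relative_increase` are hypothetical and are not part of exllamav3. The two PPL values passed at the end are taken from the updated table (4.0bpw and original).

```python
import math


def perplexity(token_logprobs):
    """Perplexity as exp(mean negative log-likelihood).

    token_logprobs: natural-log probabilities the model assigned to the
    tokens that actually occurred (assumed input format, hypothetical helper).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def relative_increase(quant_ppl, baseline_ppl):
    """Fractional PPL increase of a quant over the original-model baseline."""
    return (quant_ppl - baseline_ppl) / baseline_ppl


# Sanity check: a model that spreads probability uniformly over 4 tokens
# has a perplexity of 4.
ppl_uniform = perplexity([math.log(0.25)] * 8)

# PPL values from the table: 4.0bpw quant versus the original model.
delta_4bpw = relative_increase(29.100, 28.596)
print(f"uniform PPL: {ppl_uniform:.3f}, 4.0bpw increase: {delta_4bpw:.2%}")
```

Comparing each quant against the baseline as a ratio, rather than reading raw PPL values, makes small differences such as 28.610 versus 28.621 easier to judge.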