Update README.md
README.md CHANGED

```diff
@@ -54,12 +54,12 @@ Q6_K_S : Q6_K
 
 Comparison:
 
-Quant
+Quant | size | PPL | Comment
 ---------|---------|------|-----------
 Q4_K_M | 18.1e9 | 11.0 | Q4 embed Q6 out
 Q4_K_H | 18.0e9 | 11.0 | Hybrid quant with Q6 embed Q6 out
-Q6_K | 24.6e9
-Q6_K_H | 22.5e9
+Q6_K | 24.6e9 | 10.7 | -
+Q6_K_H | 22.5e9 | 10.7 | Hybrid quant with Q6_K embedding Q6_K output
 
 The quant was evaluated for good reasoning performance across a curated set of test prompts. Experimenting with different
 quant levels determined this model to be quite sensitive to losing coherent reasoning ability when quantization to
```
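The size/quality tradeoff recorded in the updated table can be sanity-checked with a quick calculation: at the same measured perplexity (10.7), the hybrid Q6_K_H quant is roughly 8.5% smaller than plain Q6_K. A minimal sketch, using only the figures from the table above (variable and function names are illustrative, not part of the README):

```python
# Quant comparison figures from the README table: name -> (size in bytes, perplexity)
quants = {
    "Q4_K_M": (18.1e9, 11.0),
    "Q4_K_H": (18.0e9, 11.0),
    "Q6_K":   (24.6e9, 10.7),
    "Q6_K_H": (22.5e9, 10.7),
}

def size_savings(base: str, hybrid: str) -> float:
    """Relative file-size reduction of `hybrid` versus `base`."""
    base_size = quants[base][0]
    return (base_size - quants[hybrid][0]) / base_size

# At equal perplexity, the hybrid Q6 quant saves ~8.5% of the file size.
print(f"Q6_K_H vs Q6_K: {size_savings('Q6_K', 'Q6_K_H'):.1%} smaller")
```

This is the point of the hybrid quants in the table: matching the perplexity of the uniform quant while trimming the file size.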