Update README.md
README.md CHANGED

```diff
@@ -54,12 +54,12 @@ Q6_K_S : Q6_K
 
 Comparison:
 
-Quant
+Quant | size | PPL | Comment
 ---------|---------|------|-----------
 Q4_K_M | 18.1e9 | 11.0 | Q4 embed Q6 out
 Q4_K_H | 18.0e9 | 11.0 | Hybrid quant with Q6 embed Q6 out
-Q6_K | 24.6e9
-Q6_K_H | 22.5e9
+Q6_K | 24.6e9 | 10.7 | -
+Q6_K_H | 22.5e9 | 10.7 | Hybrid quant with Q6_K embedding Q6_K output
 
 The quant was evaluated for good reasoning performance across a curated set of test prompts. Experimenting with different
 quant levels determined this model to be quite sensitive to losing coherent reasoning ability when quantization to
```
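The size/quality tradeoff recorded in the updated table can be sanity-checked with a quick calculation: at the same measured perplexity (10.7), the hybrid Q6_K_H quant is roughly 8.5% smaller than plain Q6_K. A minimal sketch, using only the figures from the table above (variable and function names are illustrative, not part of the README):

```python
# Quant comparison figures from the README table: name -> (size in bytes, perplexity)
quants = {
    "Q4_K_M": (18.1e9, 11.0),
    "Q4_K_H": (18.0e9, 11.0),
    "Q6_K":   (24.6e9, 10.7),
    "Q6_K_H": (22.5e9, 10.7),
}

def size_savings(base: str, hybrid: str) -> float:
    """Relative file-size reduction of `hybrid` versus `base`."""
    base_size = quants[base][0]
    return (base_size - quants[hybrid][0]) / base_size

# At equal perplexity, the hybrid Q6 quant saves ~8.5% of the file size.
print(f"Q6_K_H vs Q6_K: {size_savings('Q6_K', 'Q6_K_H'):.1%} smaller")
```

This is the point of the hybrid quants in the table: matching the perplexity of the uniform quant while trimming the file size.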