Update README.md
README.md
CHANGED
@@ -42,13 +42,13 @@ The only fix seems to be to delete the repo, which unfortunately also deletes th
 The quant types I currently do regularly are:
 
 - static: (f16) Q8_0 Q4_K_S Q2_K Q6_K Q3_K_M Q3_K_S Q3_K_L Q4_K_M Q5_K_S Q5_K_M IQ4_XS (Q4_0_4)
-- imatrix: Q2_K Q4_K_S IQ3_XXS Q3_K_M Q4_K_M IQ2_M Q6_K IQ4_XS Q3_K_S Q3_K_L Q5_K_S Q5_K_M Q4_0 IQ3_XS IQ3_S IQ3_M IQ2_XXS IQ2_XS IQ2_S IQ1_M IQ1_S (Q4_0_4_4 Q4_0_4_8 Q4_0_8_8)
+- imatrix: Q2_K Q4_K_S IQ3_XXS Q3_K_M (IQ4_NL) Q4_K_M IQ2_M Q6_K IQ4_XS Q3_K_S Q3_K_L Q5_K_S Q5_K_M Q4_0 IQ3_XS IQ3_S IQ3_M IQ2_XXS IQ2_XS IQ2_S IQ1_M IQ1_S (Q4_0_4_4 Q4_0_4_8 Q4_0_8_8)
 
 And they are generally (but not always) generated in the order above, for which there are deep reasons.
 
 For models less than 11B size, I experimentally generate f16 versions at the moment (in the static repository).
 
-For models less than
+For models less than 19B size, imatrix IQ4_NL quants will be generated, mostly for the benefit of arm.
 
 The (static) IQ3 quants are no longer generated, as they consistently seem to result in *much* lower quality
 quants than even static Q2_K, so it would be a disservice to offer them.
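The size rules in the README text can be read as a small selection function. The following is only an illustrative sketch, not the author's actual tooling: the function name `planned_quants`, the constant names, and the choice to measure size in billions of parameters are all assumptions. The parenthesized types whose conditions are not stated in this section (Q4_0_4, Q4_0_4_4, Q4_0_4_8, Q4_0_8_8) are left out rather than guessed at.

```python
# Hypothetical sketch of the README's size rules; all names are assumptions.

# Unconditional types, in the order listed in the README.
STATIC_QUANTS = [
    "Q8_0", "Q4_K_S", "Q2_K", "Q6_K", "Q3_K_M", "Q3_K_S",
    "Q3_K_L", "Q4_K_M", "Q5_K_S", "Q5_K_M", "IQ4_XS",
]
IMATRIX_QUANTS = [
    "Q2_K", "Q4_K_S", "IQ3_XXS", "Q3_K_M", "IQ4_NL", "Q4_K_M", "IQ2_M",
    "Q6_K", "IQ4_XS", "Q3_K_S", "Q3_K_L", "Q5_K_S", "Q5_K_M", "Q4_0",
    "IQ3_XS", "IQ3_S", "IQ3_M", "IQ2_XXS", "IQ2_XS", "IQ2_S", "IQ1_M", "IQ1_S",
]

F16_MAX_B = 11.0      # f16 only for models smaller than 11B
IQ4_NL_MAX_B = 19.0   # imatrix IQ4_NL only for models smaller than 19B


def planned_quants(params_b):
    """Return the quant types planned for a model of `params_b` billion parameters."""
    static = list(STATIC_QUANTS)
    if params_b < F16_MAX_B:
        # f16 is listed first in the README's static list.
        static.insert(0, "f16")
    # Drop IQ4_NL for models at or above the 19B threshold, keep the order otherwise.
    imatrix = [q for q in IMATRIX_QUANTS
               if q != "IQ4_NL" or params_b < IQ4_NL_MAX_B]
    return {"static": static, "imatrix": imatrix}
```

For example, `planned_quants(7)` would include both f16 (static) and IQ4_NL (imatrix), while `planned_quants(70)` would include neither. The README notes that the listed order is followed "generally (but not always)", so the ordering here is indicative only.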