updated readme
README.md CHANGED
@@ -62,7 +62,7 @@ source repo: [BSC-LT/salamandra-2b-instruct](https://huggingface.co/BSC-LT/salam
 **Non-recommended Quantizations:**
 - **IQ3_M:** Represents the best of the I quantization types below Q4, achieving good size efficiency while maintaining low perplexity.
 - **Q3_K_L:** Provides a slightly larger file size (1.8G) with an acceptable PPL (16.5067). While it meets the log PPL difference criterion, it is not as balanced as the recommended quantizations.
-An attempt was made to get a model below **IQ3_M** size, but perplexity was unacceptable even with **IQ2_M** (more than the 0.3 selection criterion; see next section).
+An attempt was made to get a model below **IQ3_M** size, but perplexity was unacceptable even with **IQ2_M** (more than the 0.3 selection criterion; see next section). If you need a model below 1.7GB, you may be better served by Richard Erkhov's [quantizations](https://huggingface.co/RichardErkhov/BSC-LT_-_salamandra-2b-instruct-gguf), which appear to be static quantizations rather than importance-matrix ones, so they are smaller.
 
 ---
 
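
The 0.3 log PPL difference criterion mentioned above can be sketched as follows. This is a minimal illustration, not the README author's actual evaluation script; the reference (full-precision) PPL value used here is a hypothetical placeholder, since the diff does not state it — only the Q3_K_L PPL of 16.5067 appears in the text.

```python
import math

def log_ppl_difference(quant_ppl: float, base_ppl: float) -> float:
    """Absolute difference in natural-log perplexity between a quantized
    model and its full-precision reference."""
    return abs(math.log(quant_ppl) - math.log(base_ppl))

# Hypothetical reference PPL, for illustration only (not stated in the README).
BASE_PPL = 15.56

# Q3_K_L's measured PPL (16.5067) against the 0.3 selection criterion.
passes = log_ppl_difference(16.5067, BASE_PPL) < 0.3
print(passes)
```

Because perplexity differences are compared on a log scale, the criterion is relative: a fixed absolute PPL gap matters more for a low-perplexity model than for a high-perplexity one.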