updated readme
README.md CHANGED
@@ -62,7 +62,7 @@ source repo: [BSC-LT/salamandra-2b-instruct](https://huggingface.co/BSC-LT/salam
 **Non-recommended Quantizations:**
 - **IQ3_M:** Represents the best of the I quantization types below Q4, achieving good size efficiency while maintaining low perplexity.
 - **Q3_K_L:** Provides a slightly larger file size (1.8G) with an acceptable PPL (16.5067). While it meets the log PPL difference criterion, it is not as balanced as the recommended quantizations.
-An attempt was made to get a model below **IQ3_M** size, but perplexity was unacceptable even with **IQ2_M** (more than the 0.3 selection criterion; see next section).
+An attempt was made to get a model below **IQ3_M** size, but perplexity was unacceptable even with **IQ2_M** (more than the 0.3 selection criterion; see next section). If you need a model below 1.7GB, you may be better served by Richard Erkhov's [quantizations](https://huggingface.co/RichardErkhov/BSC-LT_-_salamandra-2b-instruct-gguf), which appear to be static quantizations rather than importance-matrix ones, so they are smaller.
 
 ---
 
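
The 0.3 log PPL difference criterion mentioned above can be sketched as follows. This is a minimal illustration, not the README author's actual evaluation script; the reference (full-precision) PPL value used here is a hypothetical placeholder, since the diff does not state it — only the Q3_K_L PPL of 16.5067 appears in the text.

```python
import math

def log_ppl_difference(quant_ppl: float, base_ppl: float) -> float:
    """Absolute difference in natural-log perplexity between a quantized
    model and its full-precision reference."""
    return abs(math.log(quant_ppl) - math.log(base_ppl))

# Hypothetical reference PPL, for illustration only (not stated in the README).
BASE_PPL = 15.56

# Q3_K_L's measured PPL (16.5067) against the 0.3 selection criterion.
passes = log_ppl_difference(16.5067, BASE_PPL) < 0.3
print(passes)
```

Because perplexity differences are compared on a log scale, the criterion is relative: a fixed absolute PPL gap matters more for a low-perplexity model than for a high-perplexity one.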