This page provides various quantisations of the [base model](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5), in GGUF format.
- nomic-ai/nomic-embed-text-v1.5

# Model Description

For a full model description, please refer to the [base model's](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) card.

## How are the GGUF files created?
After cloning the author's original base model repository, `llama.cpp` is used to convert the model to a GGML-compatible file, using `f32` as the output type to preserve the original fidelity. The model is converted *unaltered*, unless otherwise stated.
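As a rough sketch, the conversion step looks like the following. The local paths and output filename are illustrative, and the converter script's name and location vary between `llama.cpp` versions:

```shell
# Clone the author's original base model repository (hypothetical local paths).
git clone https://huggingface.co/nomic-ai/nomic-embed-text-v1.5

# Convert to a GGUF file with f32 as the output type, preserving fidelity.
# The converter script name may differ depending on the llama.cpp version.
python llama.cpp/convert_hf_to_gguf.py nomic-embed-text-v1.5 \
    --outtype f32 \
    --outfile nomic-embed-text-v1.5.f32.gguf
```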

Finally, for each respective quantisation level, `llama.cpp`'s `llama-quantize` executable is called using the F32 GGUF file as the source file.
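The quantisation step can be sketched as below, with `Q8_0` standing in for whichever target level is being produced; the binary path and filenames are assumptions, not the exact commands used here:

```shell
# Quantise the f32 source file to a target level (Q8_0 shown as an example);
# the same command is repeated for each quantisation level.
llama.cpp/build/bin/llama-quantize \
    nomic-embed-text-v1.5.f32.gguf \
    nomic-embed-text-v1.5.Q8_0.gguf \
    Q8_0
```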

## Quantisations

To help visualise the difference in model quantisation (i.e. the level of retained fidelity), the image below shows the cosine similarity scores for each quantisation, baselined against the 32-bit base model. It can be observed that lower fidelity yields a wider scatter in scores, relative to the 32-bit model.
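For reference, a cosine similarity score of this kind can be computed as below. The vectors shown are made-up, low-dimensional stand-ins for the model's actual embeddings, which are much longer:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings of the same input text from the 32-bit baseline
# and a lower-bit quantisation (real embeddings are far higher-dimensional).
baseline = [0.10, 0.30, -0.20, 0.70]
quantised = [0.11, 0.29, -0.21, 0.68]

score = cosine_similarity(baseline, quantised)  # close to 1.0 when fidelity is high
```

A score near 1.0 means the quantised model embeds the text almost identically to the baseline; lower-fidelity quantisations drift further from 1.0 and scatter more widely across inputs.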