s3dev-ai committed · verified · Commit 6aa46f6 · Parent(s): 94bcb27

Update README.md

Files changed (1): README.md +3 -3
README.md CHANGED
@@ -16,16 +16,16 @@ tags:
 This page provides various quantisations of the [base model](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5), in GGUF format.
 - nomic-ai/nomic-embed-text-v1.5
 
-## Model Description
+# Model Description
 
 For a full model description, please refer to the [base model's](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) card.
 
-### How are the GGUF files created?
+## How are the GGUF files created?
 After cloning the author's original base model repository, `llama.cpp` is used to convert the model to a GGML-compatible file, using `f32` as the output type to preserve the original fidelity. The model is converted *unaltered*, unless otherwise stated.
 
 Finally, for each quantisation level, `llama.cpp`'s `llama-quantize` executable is called with the F32 GGUF file as the source.
 
-### Quantisations
+## Quantisations
 
 To help visualise the difference between quantisation levels (i.e. the degree of retained fidelity), the image below shows the cosine similarity scores for each quantisation, baselined against the 32-bit base model. Lower fidelity yields a wider scatter in scores relative to the 32-bit model.
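
For readers who want to reproduce the two-step pipeline the README describes, here is a minimal sketch. The `convert_hf_to_gguf.py` script and the `llama-quantize` binary ship with recent `llama.cpp` checkouts, but the local paths and the set of quantisation levels shown are illustrative assumptions, not a record of the author's exact invocation.

```python
# A minimal sketch of the conversion-and-quantisation pipeline described in the
# README above. convert_hf_to_gguf.py and llama-quantize exist in recent
# llama.cpp checkouts; all paths and quantisation levels here are assumptions.
import subprocess

SRC_REPO = "nomic-embed-text-v1.5"           # local clone of the base model repo
F32_GGUF = "nomic-embed-text-v1.5.f32.gguf"  # full-fidelity intermediate file

# Step 1: convert the Hugging Face model to GGUF with f32 output,
# preserving the original fidelity.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", SRC_REPO,
     "--outtype", "f32", "--outfile", F32_GGUF],
    check=True,
)

# Step 2: produce each quantisation level from the f32 source file.
for qtype in ["Q8_0", "Q5_K_M", "Q4_K_M"]:
    subprocess.run(
        ["llama.cpp/build/bin/llama-quantize",
         F32_GGUF, f"nomic-embed-text-v1.5.{qtype}.gguf", qtype],
        check=True,
    )
```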
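
Similarly, a sketch of how the cosine-similarity comparison could be computed, assuming embeddings for the same input texts have already been produced by the f32 and quantised models; the arrays below are placeholders, not real measurements.

```python
# Illustrative only: row-wise cosine similarity between embeddings from a
# quantised model and the f32 baseline. emb_f32 and emb_quant stand in for
# (n_texts, n_dims) arrays produced from identical input texts.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity of corresponding rows in two equally-shaped arrays."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

# Placeholder data standing in for real embeddings (768 dims, as in the base model).
rng = np.random.default_rng(0)
emb_f32 = rng.normal(size=(8, 768))
emb_quant = emb_f32 + rng.normal(scale=0.01, size=emb_f32.shape)

scores = cosine_similarity(emb_f32, emb_quant)
print(f"mean={scores.mean():.4f}  std={scores.std():.4f}")  # tighter spread = higher fidelity
```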