Upload folder using huggingface_hub
- .gitattributes +8 -0
- README.md +42 -0
- embeddinggemma-300M-BF16.gguf +3 -0
- embeddinggemma-300M-F16.gguf +3 -0
- embeddinggemma-300M-F32.gguf +3 -0
- embeddinggemma-300M-Q2_K.gguf +3 -0
- embeddinggemma-300M-Q4_K_M.gguf +3 -0
- embeddinggemma-300M-Q5_K_M.gguf +3 -0
- embeddinggemma-300M-Q6_K.gguf +3 -0
- embeddinggemma-300M-Q8_0.gguf +3 -0
- imgs/embgemma.png +0 -0
- sha256/embeddinggemma-300M-BF16.sha256 +1 -0
- sha256/embeddinggemma-300M-F16.sha256 +1 -0
- sha256/embeddinggemma-300M-F32.sha256 +1 -0
- sha256/embeddinggemma-300M-Q2_K.sha256 +1 -0
- sha256/embeddinggemma-300M-Q4_K_M.sha256 +1 -0
- sha256/embeddinggemma-300M-Q5_K_M.sha256 +1 -0
- sha256/embeddinggemma-300M-Q6_K.sha256 +1 -0
- sha256/embeddinggemma-300M-Q8_0.sha256 +1 -0
.gitattributes CHANGED

@@ -33,3 +33,11 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+embeddinggemma-300M-BF16.gguf filter=lfs diff=lfs merge=lfs -text
+embeddinggemma-300M-F16.gguf filter=lfs diff=lfs merge=lfs -text
+embeddinggemma-300M-F32.gguf filter=lfs diff=lfs merge=lfs -text
+embeddinggemma-300M-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+embeddinggemma-300M-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+embeddinggemma-300M-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+embeddinggemma-300M-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+embeddinggemma-300M-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED

---
base_model:
- google/embeddinggemma-300m
language:
- en
model_creator: Google
model_name: embeddinggemma-300m
model_type: gemma-embedding
quantized_by: s3dev-ai
tags:
- sentence-similarity
---

# Overview

This page provides various quantisations of the [base model](https://huggingface.co/google/embeddinggemma-300m), in GGUF format.

- google/embeddinggemma-300m

# Model Description

For a full model description, please refer to the [base model's](https://huggingface.co/google/embeddinggemma-300m) card.

## How are the GGUF files created?

After cloning the author's original base model repository, `llama.cpp` is used to convert the model to a GGML-compatible file, using `f32` as the output type to preserve the original fidelity. The model is converted *unaltered*, unless otherwise stated.

Finally, for each respective quantisation level, `llama.cpp`'s `llama-quantize` executable is called, using the F32 GGUF file as the source file.
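The two-step pipeline above (convert once to F32, then quantise per level) can be sketched as below. This is a minimal illustration, not the exact commands used for this repository; the model directory and output paths are assumptions.

```python
# Sketch of the convert-then-quantize pipeline, assuming a local llama.cpp
# checkout providing convert_hf_to_gguf.py and the llama-quantize binary.
# Paths and the output stem are illustrative.

def build_commands(model_dir: str, out_stem: str, quant_types: list[str]) -> list[list[str]]:
    """Build the command lines: one F32 conversion, then one quantise per level."""
    f32_gguf = f"{out_stem}-F32.gguf"
    cmds = [
        # 1) Convert the HF checkpoint to a full-fidelity F32 GGUF.
        ["python", "convert_hf_to_gguf.py", model_dir,
         "--outfile", f32_gguf, "--outtype", "f32"],
    ]
    # 2) Quantise the F32 GGUF once per target level (Q2_K, Q4_K_M, ...).
    for qt in quant_types:
        cmds.append(["llama-quantize", f32_gguf, f"{out_stem}-{qt}.gguf", qt])
    return cmds

if __name__ == "__main__":
    for cmd in build_commands("./embeddinggemma-300m", "embeddinggemma-300M",
                              ["Q2_K", "Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"]):
        print(" ".join(cmd))  # pass each cmd to subprocess.run(cmd, check=True) to execute
```

Each quantisation reads from the same F32 source, so no level inherits rounding error from a previous, lower-precision pass.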

## Quantisations

To help visualise the difference in model quantisation (i.e. the level of retained fidelity), the image below shows the cosine similarity scores for each quantisation, baselined against the 32-bit base model. It can be observed that lower fidelity yields a wider scatter in scores, relative to the 32-bit model.

The underlying [base dataset](https://huggingface.co/datasets/sentence-transformers/stsb) was sampled to 1000 records with an unbiased similarity score distribution. Using the various quantisation levels of this model, embeddings were created for `sentence1` and `sentence2`. Finally, a cosine similarity score was calculated across the two embeddings and plotted on the graph.

> [!NOTE]
> This graph currently only features a single trend, which was created against the un-quantised 32-bit model. Although the quantised GGUF files are available, neither `sentence-transformers` nor `llama-cpp-python` has been updated to support the `gemma-embedding` format, so we can't use them (yet).
>
> As soon as support is available, we'll update this graph to display the fidelity for the quantisations.

<!-- Image alignment -->
<div align="center">
  <img src="imgs/embgemma.png" alt="Quantisation Levels" width="90%">
</div>
embeddinggemma-300M-BF16.gguf ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:54bdbd8516756d819f2ac1c0b39b2ace8b2c30c25f10545d2891cfde8c31ba53
size 612429792

embeddinggemma-300M-F16.gguf ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:a025bd8fd2720415446420ada454a5187dd5d96ab93201627c0a7924baa6f14d
size 612429792

embeddinggemma-300M-F32.gguf ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:a3125072128fc76d1c1d8d19f7b095c7e3bfbf00594dcf8a8bd3bcb334935d57
size 1217982432

embeddinggemma-300M-Q2_K.gguf ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:5f71b5de27e76e0e290f8d0fb8a954f940b9797599d5aa97e39e00719e2d701b
size 212209632

embeddinggemma-300M-Q4_K_M.gguf ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:676216d6d8abbd717688905a77230a213f9db095793ee8c1afb1aa5bf11eb531
size 236337120

embeddinggemma-300M-Q5_K_M.gguf ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:31f806ff63e12b5eb74d57146de00f41ac536f112a2d0615e645a99d5fc9acb6
size 246732768

embeddinggemma-300M-Q6_K.gguf ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:6cb77011e65793a6126ea6d317fc7ee31a20f4f5b59902fcc8a528e5fd57fa53
size 260390880

embeddinggemma-300M-Q8_0.gguf ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:6145ab14054cf8420e8e11cc4680455566a7e865600ece8e06d75adaaf39032a
size 328576992
imgs/embgemma.png ADDED

sha256/embeddinggemma-300M-BF16.sha256 ADDED

54bdbd8516756d819f2ac1c0b39b2ace8b2c30c25f10545d2891cfde8c31ba53  embeddinggemma-300M-BF16.gguf

sha256/embeddinggemma-300M-F16.sha256 ADDED

a025bd8fd2720415446420ada454a5187dd5d96ab93201627c0a7924baa6f14d  embeddinggemma-300M-F16.gguf

sha256/embeddinggemma-300M-F32.sha256 ADDED

a3125072128fc76d1c1d8d19f7b095c7e3bfbf00594dcf8a8bd3bcb334935d57  embeddinggemma-300M-F32.gguf

sha256/embeddinggemma-300M-Q2_K.sha256 ADDED

5f71b5de27e76e0e290f8d0fb8a954f940b9797599d5aa97e39e00719e2d701b  embeddinggemma-300M-Q2_K.gguf

sha256/embeddinggemma-300M-Q4_K_M.sha256 ADDED

676216d6d8abbd717688905a77230a213f9db095793ee8c1afb1aa5bf11eb531  embeddinggemma-300M-Q4_K_M.gguf

sha256/embeddinggemma-300M-Q5_K_M.sha256 ADDED

31f806ff63e12b5eb74d57146de00f41ac536f112a2d0615e645a99d5fc9acb6  embeddinggemma-300M-Q5_K_M.gguf

sha256/embeddinggemma-300M-Q6_K.sha256 ADDED

6cb77011e65793a6126ea6d317fc7ee31a20f4f5b59902fcc8a528e5fd57fa53  embeddinggemma-300M-Q6_K.gguf

sha256/embeddinggemma-300M-Q8_0.sha256 ADDED

6145ab14054cf8420e8e11cc4680455566a7e865600ece8e06d75adaaf39032a  embeddinggemma-300M-Q8_0.gguf
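The `.sha256` files above use the `sha256sum`-style `<hash>  <filename>` layout, so a downloaded GGUF can be verified against its recorded digest. A minimal sketch (file paths are illustrative):

```python
import hashlib
from pathlib import Path

def verify(gguf_path: str, sha256_path: str) -> bool:
    """Return True if the file's SHA-256 digest matches the recorded one."""
    # First whitespace-separated token of the .sha256 file is the hex digest.
    expected = Path(sha256_path).read_text().split()[0]
    h = hashlib.sha256()
    with open(gguf_path, "rb") as fh:
        # Hash in 1 MiB chunks so multi-GB models don't load into memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected
```

Equivalently, `sha256sum -c` run from the repository root performs the same check against these files.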