Add precision notes
README.md CHANGED
@@ -34,3 +34,19 @@ git push
 git tag v1.0.0 -m 'Model release description'
 git push origin tag v1.0.0
 ```
+
+## Precision
+
+For static embeddings and cosine similarity, precision isn't as important. In an
+end-to-end test in Firefox on some vectors, this was the cosine similarity for the
+same mean-pooled result. Note that the vector math happens in f32 space, but the
+embeddings are stored at a lower precision.
+
+f32 vs f16: cosine similarity = 1.00000000<br/>
+→ They are essentially identical in direction.
+
+f32 vs f8: cosine similarity = 0.99956375<br/>
+→ Very close, only tiny quantization effects.
+
+Note that this was done with `torch.float8_e4m3fn`, while `torch.float8_e5m2` generally
+has more loss.
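
As a reference for these numbers, below is a minimal sketch of how such a comparison could be reproduced in PyTorch. It is illustrative rather than the repository's actual test: the random 384-dimensional vector is a stand-in for a real mean-pooled embedding, and it assumes a PyTorch build that ships the float8 dtypes (2.1+).

```python
# Minimal sketch: cast an f32 embedding to lower-precision storage, then
# compare cosine similarity back in f32. The vector below is a random
# stand-in for a real mean-pooled model output (assumed 384 dims).
import torch
import torch.nn.functional as F

def cosine_sim(a: torch.Tensor, b: torch.Tensor) -> float:
    # The math happens in f32 regardless of the storage dtype, as noted above.
    return F.cosine_similarity(a.float(), b.float(), dim=0).item()

emb_f32 = torch.randn(384)                 # stand-in for a mean-pooled embedding
emb_f16 = emb_f32.to(torch.float16)        # half-precision storage
emb_f8 = emb_f32.to(torch.float8_e4m3fn)   # f8 storage; e5m2 generally loses more

print(f"f32 vs f16: cosine similarity = {cosine_sim(emb_f32, emb_f16):.8f}")
print(f"f32 vs f8:  cosine similarity = {cosine_sim(emb_f32, emb_f8):.8f}")
```

The point mirrored here is that both tensors are cast back to f32 before the cosine similarity is computed, so only the storage precision differs between the comparisons.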