---
tags:
  - static-embeddings
---

# Static Embeddings

This project contains multilingual static embeddings suitable for generating quick embeddings on edge devices. They are re-packaged from other projects into production-ready assets.

## Models

* [minishlab/potion-retrieval-32M/](models/minishlab/potion-retrieval-32M/README.md)
* [minishlab/potion-multilingual-128M/](models/minishlab/potion-multilingual-128M/README.md)
* [sentence-transformers/static-retrieval-mrl-en-v1/](models/sentence-transformers/static-retrieval-mrl-en-v1/README.md)
* [sentence-transformers/static-similarity-mrl-multilingual-v1](models/sentence-transformers/static-similarity-mrl-multilingual-v1/README.md)

## Updating

Add models to `scripts/build_models.py`.

```sh
# Install dependencies and log in to Hugging Face:
pipx install huggingface_hub
huggingface-cli login

# Re-build the models:
uv run scripts/build_models.py

# Version control:
git add .
git commit -m 'Updated the models'
git push
git tag v1.0.0 -m 'Model release description'
git push origin tag v1.0.0

# Upload the models:
uv run scripts/upload_models.py --tag v1.0.0
```

## Precision

For static embeddings and cosine similarity, high precision isn't critical. In an end-to-end test in Firefox on some vectors, these were the cosine similarities for the same mean-pooled result across precisions. Note that the vector math happens in f32 space; only the stored embeddings are in a lower precision.

> f32 vs f16: cosine similarity = 1.00000000
> → They are essentially identical in direction.
>
> f32 vs f8: cosine similarity = 0.99956375
> → Very close, only tiny quantization effects.

Note that this test used `torch.float8_e4m3fn`; `torch.float8_e5m2` generally loses more precision.

Precision also affects download size. For instance, with the larger [minishlab/potion-multilingual-128M/](models/minishlab/potion-multilingual-128M/README.md) model, the `fp32` build is 228M compressed, while the `fp8_e4m3` build is only 51M and still has competitive quantization quality.

| precision    | dimensions | size    |
| ------------ | ---------- | ------- |
| fp32         | 128        | 228M    |
| fp16         | 128        | 114M    |
| **fp8_e4m3** | 128        | **51M** |
| fp8_e5m2     | 128        | 44M     |
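The store-low / compute-high pattern above can be sketched with NumPy. This is an illustrative example, not the project's test code: NumPy has no `float8` dtype, so `float16` stands in for the lower-precision storage format, and the random vector stands in for a real mean-pooled embedding.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity computed in f32 space, regardless of storage dtype."""
    a = a.astype(np.float32)
    b = b.astype(np.float32)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
# Stand-in for a 128-dimensional mean-pooled embedding.
embedding_f32 = rng.standard_normal(128).astype(np.float32)

# Store at lower precision; the vector math upcasts back to f32.
stored_f16 = embedding_f32.astype(np.float16)

similarity = cosine_similarity(embedding_f32, stored_f16)
print(f"f32 vs f16: cosine similarity = {similarity:.8f}")
```

The direction of the vector survives the round-trip almost exactly, which is why the similarity stays near 1.0 even though each component loses mantissa bits.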