Instructions to use intfloat/e5-small-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use intfloat/e5-small-v2 with sentence-transformers:
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("intfloat/e5-small-v2")

    sentences = [
        "That is a happy person",
        "That is a happy dog",
        "That is a very happy person",
        "Today is a sunny day"
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [4, 4]

- Inference
- Notebooks
- Google Colab
- Kaggle
High scores
#8
by drmeir - opened
The model always produces high similarity scores. For example, the similarity score between the words "King" and "Dog" is 0.796. I was not able to get a score below 0.7 for any pair of words. This seems wrong... What am I missing? How do I get scores that make intuitive sense?
Please refer to the related discussions at https://huggingface.co/intfloat/multilingual-e5-large/discussions/10 and https://github.com/microsoft/unilm/issues/1216
I will update the model card to add some clarification to avoid future confusion.
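As far as I understand, the short version of those discussions is that E5 cosine scores tend to sit in a narrow high band, so the relative order of the scores is what carries meaning, not the absolute value. The sketch below shows two ways to work with that, assuming you already have the scores as plain numbers: rank candidates by score, or stretch the band to [0, 1] for display. The helper names `rank_candidates` and `rescale` are hypothetical, and the 0.7 floor is only an assumption taken from the observation in the question. Apart from 0.796, the example scores are made up for illustration.

```python
import numpy as np


def rank_candidates(scores, labels):
    """Return labels ordered from most to least similar."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    return [labels[i] for i in order]


def rescale(scores, floor=0.7, ceiling=1.0):
    """Linearly map [floor, ceiling] onto [0, 1], clipping anything outside.

    floor=0.7 is an assumed lower bound (from the question above),
    not a value guaranteed by the model.
    """
    s = np.asarray(scores, dtype=float)
    return np.clip((s - floor) / (ceiling - floor), 0.0, 1.0)


# Illustrative similarities to "King": only 0.796 (Dog) comes from this thread,
# the other two values are invented for the example.
labels = ["Dog", "Monarch", "Queen"]
scores = [0.796, 0.85, 0.91]

print(rank_candidates(scores, labels))
print(rescale(scores))
```

Rescaling is purely cosmetic: it does not change the ranking, and any threshold should be chosen on your own data instead of relying on the assumed floor.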
Thank you. I commented there: https://github.com/microsoft/unilm/issues/1216#issuecomment-1668474246