How to use BAAI/bge-m3 with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```
Reranker
#30, opened by Totole
Hi, thanks a lot for your work!
Two questions:
- Does model.compute_score(sentence_pairs, max_passage_length, weights_for_different_modes) just compute a score (e.g. cosine) from the embeddings (dense, sparse, colbert) produced by the model? In other words, is it cross-encoding or bi-encoding?
- Why does the max_length_token of this model seem to be 514 and not 8000?
Thanks for your interest in our work!
- bge-m3 is a bi-encoder model. Its compute_score function combines the scores from the different embedding modes (dense, sparse, colbert).
- The max length is 8192. You can see the config: https://huggingface.co/BAAI/bge-m3/blob/main/tokenizer_config.json
Besides, we have released some new rerankers (cross-encoders): https://huggingface.co/BAAI/bge-reranker-v2-m3#model-list . Feel free to use them and share your feedback.
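To illustrate what "bi-encoding" means for scoring, here is a rough sketch in plain Python. It is not the FlagEmbedding implementation: the function names, the toy vectors, and the weighted-average combination are all assumptions made for illustration. The point is only that the query and the passage are embedded separately, and every score is computed afterwards from those embeddings.

```python
import math

def dense_score(q, p):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(q, p))
    norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in p))
    return dot / norm

def sparse_score(q_weights, p_weights):
    """Lexical match: sum of weight products over tokens present in both texts."""
    return sum(w * p_weights[tok] for tok, w in q_weights.items() if tok in p_weights)

def colbert_score(q_vecs, p_vecs):
    """Late interaction: best passage-token match per query token, averaged."""
    best = [max(dense_score(q, p) for p in p_vecs) for q in q_vecs]
    return sum(best) / len(best)

def combined_score(dense, sparse, colbert, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three mode scores (hypothetical combination rule)."""
    w_dense, w_sparse, w_colbert = weights
    total = w_dense + w_sparse + w_colbert
    return (w_dense * dense + w_sparse * sparse + w_colbert * colbert) / total

# Toy embeddings (made up), standing in for the outputs of two separate encoder passes.
query = {
    "dense": [1.0, 0.0],
    "sparse": {"happy": 0.5, "dog": 0.2},
    "colbert": [[1.0, 0.0], [0.0, 1.0]],
}
passage = {
    "dense": [3.0, 4.0],
    "sparse": {"happy": 0.4, "person": 0.3},
    "colbert": [[1.0, 0.0], [3.0, 4.0]],
}

dense = dense_score(query["dense"], passage["dense"])
sparse = sparse_score(query["sparse"], passage["sparse"])
colbert = colbert_score(query["colbert"], passage["colbert"])
combined = combined_score(dense, sparse, colbert)
print(dense, sparse, colbert, combined)
```

A cross-encoder reranker, by contrast, takes the query and passage together as one input and outputs a score directly, so there are no per-text embeddings to reuse.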
Hello, I need more detailed information about the error.
- Can you run the code here successfully?
- Maybe you can paste your full code here, and then I will test it to see if this error can be reproduced.
For a very weird reason, it works on Colab but not on Azure ML...


