Reduce dimension of colbert vector
I'm trying out the ColBERT vectors at the moment and I'm quite happy with the performance. If possible, though, I need to reduce the dimension of the embeddings and reshape them into 2-D matrices (that is, one array/row per embedded chunk). I'm aware that this will most likely hurt performance, but I'd still like to see whether a solution like this is better than the one I've used before.
Anyone who's tried something similar?
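To make the request concrete, here is a minimal sketch of one possible approach, not a tested recipe for bge-m3: mean-pool each chunk's token matrix into a single row, then fit a plain PCA on those rows to shrink the width. It assumes the ColBERT output is a list of `(num_tokens, 1024)` arrays, one per chunk; random data stands in for real model output, and the helper names `pool_chunks`, `fit_pca`, and `project` are made up for illustration.

```python
import numpy as np


def pool_chunks(colbert_vecs):
    """Mean-pool each (num_tokens, dim) matrix into one L2-normalised row."""
    pooled = np.stack([v.mean(axis=0) for v in colbert_vecs])
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)


def fit_pca(rows, out_dim):
    """Fit a PCA projection with an SVD; returns (mean, components).

    At most min(n_rows, dim) components exist, so fit on many more rows
    than out_dim in practice.
    """
    mean = rows.mean(axis=0)
    _, _, vt = np.linalg.svd(rows - mean, full_matrices=False)
    return mean, vt[:out_dim]


def project(rows, mean, components):
    """Project rows onto the fitted components."""
    return (rows - mean) @ components.T


# Stand-in for real ColBERT output: six chunks with different token counts.
rng = np.random.default_rng(0)
colbert_vecs = [
    rng.normal(size=(n, 1024)).astype(np.float32) for n in (12, 7, 30, 18, 9, 21)
]

pooled = pool_chunks(colbert_vecs)        # one row per chunk: (6, 1024)
mean, components = fit_pca(pooled, out_dim=4)
reduced = project(pooled, mean, components)  # narrower rows: (6, 4)
```

Pooling throws away the token-level matching that makes ColBERT scoring work, so this is only worth comparing against the plain dense embedding, which already gives one vector per chunk.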
Another way to approach the question: does anyone know a good way to store and retrieve multi-dimensional vectors?
@feliciaAI you can use a vector database like DataStax. MongoDB has also released support for storing and indexing vectors.
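For small collections where a database is not needed, a rough sketch of storing the multi-vectors yourself might look like the following. It packs all token vectors into one flat 2-D array plus an offsets array, and scores with a ColBERT-style MaxSim (sum over query tokens of the best-matching document token). The function names and the random stand-in data are assumptions for illustration, and this is brute-force search with no index.

```python
import numpy as np


def pack(multi_vecs):
    """Concatenate per-chunk (num_tokens, dim) arrays into one 2-D array.

    offsets[i]:offsets[i + 1] is the row range belonging to chunk i.
    """
    lengths = np.array([len(v) for v in multi_vecs])
    offsets = np.concatenate([[0], np.cumsum(lengths)])
    flat = np.concatenate(multi_vecs, axis=0)
    return flat, offsets


def unpack(flat, offsets, i):
    """Return the token matrix of chunk i."""
    return flat[offsets[i]:offsets[i + 1]]


def maxsim(query_vecs, doc_vecs):
    """Sum over query tokens of the highest dot product with any doc token."""
    sims = query_vecs @ doc_vecs.T
    return float(sims.max(axis=1).sum())


def search(query_vecs, flat, offsets, top_k=3):
    """Brute-force MaxSim over every stored chunk; returns (index, score) pairs."""
    scores = [
        maxsim(query_vecs, unpack(flat, offsets, i))
        for i in range(len(offsets) - 1)
    ]
    order = np.argsort(scores)[::-1][:top_k]
    return [(int(i), scores[i]) for i in order]


def unit_rows(x):
    """L2-normalise each row so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


# Stand-in data: four chunks of unit-length token vectors (dim 16 for brevity).
rng = np.random.default_rng(0)
docs = [unit_rows(rng.normal(size=(n, 16))) for n in (5, 8, 6, 4)]
flat, offsets = pack(docs)

# Querying with chunk 2's own tokens should rank chunk 2 first.
hits = search(docs[2], flat, offsets)
```

Both `flat` and `offsets` are ordinary arrays, so they can be written to disk with `np.save` and loaded back later. Dividing the score by the number of query tokens would not change the ranking for a given query.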