How to use BAAI/bge-m3 with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```
How many GPUs are required to fine-tune bge-m3 on 1 million triplets?
Congratulations to the whole BAAI team on the excellent work!
I am currently collecting 1 million triplets (query, list[pos], list[neg]). How many GPUs would be required for fine-tuning?
Any suggestions are welcome, friends.
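For readers collecting similar data, here is a minimal sketch of how triplets in the (query, list[pos], list[neg]) shape mentioned above could be validated and serialized as JSON Lines. The key names `query`, `pos`, and `neg`, the helper names, and the example sentences are assumptions made for illustration; check the documentation of whichever training script you use for the exact format it expects.

```python
import json

# Hypothetical example triplet: one query, a list of positives, a list of negatives.
triplets = [
    {
        "query": "What is the capital of France?",
        "pos": ["Paris is the capital of France."],
        "neg": ["Berlin is the capital of Germany.", "Today is a sunny day."],
    },
]


def validate_triplet(record):
    """Return True if the record has a non-empty query and non-empty pos/neg lists of strings."""
    query = record.get("query")
    if not isinstance(query, str) or not query.strip():
        return False
    for key in ("pos", "neg"):
        items = record.get(key)
        if not isinstance(items, list) or not items:
            return False
        if not all(isinstance(text, str) and text.strip() for text in items):
            return False
    return True


def to_jsonl(records):
    """Serialize valid records as one JSON object per line, skipping invalid ones."""
    return "\n".join(
        json.dumps(record, ensure_ascii=False)
        for record in records
        if validate_triplet(record)
    )


jsonl_text = to_jsonl(triplets)
print(jsonl_text)
```

Validating before training is cheap at the scale of 1 million records and catches empty positive or negative lists early.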
Thanks for your interest in our work! I think 8*A100 is enough.
@wilfoderek were you able to fine-tune the model? I fine-tuned the model and it is now giving me a .9995 similarity score for everything, no matter what the string is. I must have goofed up the training process, I guess.
Still working on collecting data! But from what you describe, your problem might be related to overfitting.
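A symptom like "the same high similarity for every pair" can be checked numerically. The sketch below is only a rough diagnostic, not a fix: it computes the mean pairwise cosine similarity over embeddings of deliberately unrelated texts and flags values close to 1. The function names, the toy vectors, and the 0.99 threshold are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_offdiag_similarity(embeddings):
    """Average cosine similarity over all distinct pairs of embeddings."""
    n = len(embeddings)
    sims = [
        cosine(embeddings[i], embeddings[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sum(sims) / len(sims)


def looks_collapsed(embeddings, threshold=0.99):
    """Flag embeddings whose unrelated pairs are nearly identical (threshold is an assumption)."""
    return mean_offdiag_similarity(embeddings) > threshold


# Toy vectors standing in for real model outputs.
healthy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
collapsed = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.01], [1.01, 1.0, 1.0]]

print(looks_collapsed(healthy))
print(looks_collapsed(collapsed))
```

With sentence-transformers, the input could be `model.encode(unrelated_sentences).tolist()`. Running the same check on the base model and on the fine-tuned checkpoint shows whether training introduced the problem.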
@dlitoria how much GPU VRAM did it take to fine-tune it? While evaluating it, it took 20GB of my GPU, so I wonder if I'm doing anything wrong?