Memory is fully exhausted while generating embeddings, leading to a complete server crash.
Hi all
I am trying to create embeddings for 15 lakh (1.5 million) rows of data using sentence-transformers/all-MiniLM-L6-v2 for an application, and then upload the embeddings to a pgvector database.
While the embeddings are being created, the server's memory gets completely exhausted and the server crashes.
Please help me here.
Hello!
I'm aware of this issue. The gist is that as more of the texts get turned into embeddings, the already processed embeddings all remain in memory until all texts have been processed. This can lead to high memory usage. My recommendation at this time is to chunk your texts and only process e.g. 1 lakh sentences at a time, upload those embeddings, and then do the next chunk.
Hope this helps.
- Tom Aarsen
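The chunking approach suggested above could be sketched roughly as follows. This is only a minimal illustration, not the library's own API: `iter_chunks` and `embed_in_chunks` are hypothetical helper names, `encode_fn` stands in for something like `model.encode`, and `upload_fn` stands in for whatever code writes a chunk of embeddings to the database.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List


def iter_chunks(texts: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `chunk_size` texts."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(texts)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def embed_in_chunks(
    texts: Iterable[str],
    encode_fn: Callable[[List[str]], Any],        # assumption: e.g. model.encode
    upload_fn: Callable[[List[str], Any], None],  # assumption: your database writer
    chunk_size: int = 100_000,                    # 1 lakh texts per chunk
) -> int:
    """Encode and upload one chunk at a time; return the number of texts processed."""
    total = 0
    for chunk in iter_chunks(texts, chunk_size):
        embeddings = encode_fn(chunk)
        upload_fn(chunk, embeddings)
        total += len(chunk)
        # Drop the reference so this chunk's embeddings can be freed
        # before the next chunk is encoded.
        del embeddings
    return total
```

With a loaded `SentenceTransformer` model, this might be called as `embed_in_chunks(rows, model.encode, save_to_pgvector)`, where `save_to_pgvector` is a hypothetical function you would write yourself. Because `texts` can be any iterable, the rows could also be streamed from a database cursor instead of being held in a list.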
Hey @tomaarsen, thank you for your reply.
For now I am just doing a POC. If it is successful, I will scale the same approach up to 5 crore+ (50 million+) rows of data. At that scale, this way of implementing it does not seem advisable.
Is there any way to parallelize the creation of embeddings?
Yes, you can use https://sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode_multi_process for encoding on multiple processes or multiple GPUs, but the memory issue might still persist then. Chunking remains a good option I think.
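One way to combine the two suggestions, multi-process encoding and chunking, might look like the sketch below. It assumes `model` is a loaded `SentenceTransformer` exposing `start_multi_process_pool`, `encode_multi_process`, and `stop_multi_process_pool`, as described in the linked documentation. `encode_chunks_with_pool` and `handle_chunk` are hypothetical names introduced here for illustration, and the exact behaviour may differ between library versions.

```python
from itertools import islice
from typing import Any, Callable, Iterable, List


def encode_chunks_with_pool(
    model: Any,                                      # assumption: a SentenceTransformer
    texts: Iterable[str],
    handle_chunk: Callable[[List[str], Any], None],  # assumption: e.g. a database upload
    chunk_size: int = 100_000,
    batch_size: int = 32,
) -> int:
    """Encode texts chunk by chunk on a multi-process pool; return the count processed."""
    # Start the worker processes once and reuse them for every chunk.
    pool = model.start_multi_process_pool()
    total = 0
    try:
        iterator = iter(texts)
        while True:
            chunk = list(islice(iterator, chunk_size))
            if not chunk:
                break
            embeddings = model.encode_multi_process(chunk, pool, batch_size=batch_size)
            # Hand the embeddings off right away so that only one chunk
            # is held in memory at a time.
            handle_chunk(chunk, embeddings)
            total += len(chunk)
    finally:
        # Always shut the workers down, even if encoding or uploading fails.
        model.stop_multi_process_pool(pool)
    return total
```

When used with a real model, code that starts a process pool should normally be called from under an `if __name__ == "__main__":` guard. The pool speeds up encoding, but it is the chunking that keeps memory bounded.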
I will check the above link and get back to you ASAP.
Thanks!