Has all-MiniLM-L6-v2 ever changed its architecture? I have two different models
I have two models of all-MiniLM-L6-v2 in ONNX format that I have converted to TensorRT format and have been using for a while.
One model has the inputs:
- input_ids
- attention_mask
Outputs:
- token_embeddings
- sentence_embedding
The other model has inputs:
- input_ids
- token_type_ids
- attention_mask
Outputs:
- last_hidden_state
Can someone explain this, or has anyone run into this?
Hello!
No, it hasn't changed its architecture. However, you can load this model in two ways: via Sentence Transformers, where the inputs are input_ids & attention_mask and the outputs are token_embeddings & sentence_embedding, or via transformers, where the inputs are input_ids, token_type_ids & attention_mask and the output is last_hidden_state. See also the model card for a bit more info on this.
- Tom Aarsen
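To illustrate how the two sets of outputs relate: the model card describes getting a sentence embedding from the transformers output by mean pooling the token vectors over the attention mask and then L2-normalizing. The following is a minimal sketch of that step, assuming the export that only returns last_hidden_state hands back NumPy-compatible arrays. The function name mean_pool and the tiny synthetic inputs are illustrative only and are not part of either library.

```python
import numpy as np


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors over non-padding positions, then L2-normalize.

    last_hidden_state: array of shape (batch, seq_len, hidden_dim)
    attention_mask:    array of shape (batch, seq_len), 1 = real token, 0 = padding
    """
    hidden = np.asarray(last_hidden_state, dtype=np.float32)
    # Add a trailing axis so the mask broadcasts across the hidden dimension.
    mask = np.asarray(attention_mask, dtype=np.float32)[..., None]

    summed = (hidden * mask).sum(axis=1)            # sum of real-token vectors
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    pooled = summed / counts

    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)


# Synthetic stand-in for a model output: one sentence, three token slots,
# hidden size 2. The third slot is padding and must be ignored.
hidden = np.array([[[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]])
mask = np.array([[1, 1, 0]])

sentence_embedding = mean_pool(hidden, mask)
print(sentence_embedding)  # [[0.6 0.8]]
```

With a real export, hidden would be the last_hidden_state output and mask the attention_mask fed to the model; the result should then correspond to the sentence_embedding output of the other export, up to numerical precision.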
I'm loading the model via transformers using the AutoModel.from_pretrained() method and providing input_ids & attention_mask from the tokenizer (similarly loaded with AutoTokenizer); however, the model is outputting last_hidden_state & pooler_output. Previously, when I loaded the model this way, I was using the inputs & outputs described by @tomaarsen with AutoModel without issue. Loading the model via the SentenceTransformer class, on the other hand, only outputs the pooled embeddings. Has something changed with the API or the model, or am I doing something wrong?