How to use sentence-transformers/all-mpnet-base-v2 with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```

How to use sentence-transformers/all-mpnet-base-v2 with Transformers:

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-mpnet-base-v2")
model = AutoModelForMaskedLM.from_pretrained("sentence-transformers/all-mpnet-base-v2")
```
Change max_position_embeddings to 512
#21
by vkehfdl1 - opened
When embedding in a CUDA-enabled environment, a device-side assertion error occurs when I pass inputs longer than 512 tokens. It looks like config.json is wrong: the real max_position_embeddings is 512.
I think you might be right; 512 seems more reasonable. However, you should limit the sequence length to 384, as defined here: https://huggingface.co/sentence-transformers/all-mpnet-base-v2/blob/main/sentence_bert_config.json#L2
This is what the model was trained for, and what it should perform best with.
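For anyone landing here with the same error, below is a minimal sketch of how the advice above could be applied. It assumes the 384 limit quoted from sentence_bert_config.json and uses the `max_seq_length` attribute of `SentenceTransformer` to control truncation. The helper names `clamp_max_seq_length` and `load_model_with_limit` are made up for illustration and are not part of either library.

```python
# Sequence length the thread reports in sentence_bert_config.json
# (assumed here, not read from the file).
TRAINED_MAX_SEQ_LENGTH = 384


def clamp_max_seq_length(requested: int,
                         trained: int = TRAINED_MAX_SEQ_LENGTH) -> int:
    """Return the requested length, capped at the length the model was trained with."""
    if requested < 1:
        raise ValueError("requested sequence length must be positive")
    return min(requested, trained)


def load_model_with_limit(requested: int = TRAINED_MAX_SEQ_LENGTH):
    """Load the model and cap its truncation length.

    Hypothetical helper: downloads the weights on first call, so the import
    is kept inside the function.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
    # Inputs longer than max_seq_length tokens are truncated by encode().
    model.max_seq_length = clamp_max_seq_length(requested)
    return model
```

With this in place, `load_model_with_limit(1000)` would still encode with a 384-token window, so longer inputs are truncated before they reach the position embeddings.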