Latency observed in Embedding computation
#4
by RajaRamKankipati - opened
Hi Team,
I am running MPNet on long documents (more than 512 tokens) using the following approach:
- Get all the tokens from the tokenizer without truncation
- Split the tokens into chunks of 512
- Pass the chunks to the model as a single batch
encoded_input = tokenizer(
    document,
    max_length=None,
    padding=True,
    truncation=False,
    return_tensors="pt",
).to(device)
encoded_input = pre_processing_encoded_input(encoded_input, size=512)
# Compute token embeddings
with torch.no_grad():
    model_output = self.model(**encoded_input)
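The `pre_processing_encoded_input` helper is not shown in the post. For readers following along, here is a minimal sketch of what such a chunking step might look like, written in plain Python on a list of token ids. The name `chunk_token_ids` and its parameters are hypothetical, and `pad_id` is assumed to come from the tokenizer (for example `tokenizer.pad_token_id`).

```python
from typing import Dict, List


def chunk_token_ids(
    input_ids: List[int], size: int, pad_id: int
) -> Dict[str, List[List[int]]]:
    """Split one long token-id sequence into fixed-size rows.

    Hypothetical helper, not the poster's code. The last row is padded
    with pad_id, and the padded positions are set to 0 in attention_mask.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    rows: List[List[int]] = []
    masks: List[List[int]] = []
    for start in range(0, len(input_ids), size):
        chunk = input_ids[start:start + size]
        n_pad = size - len(chunk)
        rows.append(chunk + [pad_id] * n_pad)
        masks.append([1] * len(chunk) + [0] * n_pad)
    return {"input_ids": rows, "attention_mask": masks}
```

Each list in the result has shape (number of chunks, size), so it could be wrapped in `torch.tensor(...)` before being passed to the model. Note that this sketch does not re-insert special tokens at chunk boundaries.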
With a single encoded_input of 512 tokens, the model takes around 230 ms to compute the embedding. With an input of shape (2, 512) it takes about 2000 ms, and the latency appears to increase exponentially as more chunks are added. Is there any way I can achieve low latency with this model on long documents?
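To pin down how latency actually scales with the number of chunks, it may help to time each batch size several times after a warm-up call and compare medians. The sketch below is one way to do that. `median_latency_ms` and `latency_by_batch_size` are hypothetical names, and `run_batch` stands in for whatever function runs the model on `n` chunks.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def median_latency_ms(
    fn: Callable[[], object], repeats: int = 5, warmup: int = 1
) -> float:
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(samples)


def latency_by_batch_size(
    run_batch: Callable[[int], object], batch_sizes: Iterable[int]
) -> Dict[int, float]:
    """Time run_batch(n) for each n and return {n: median ms}."""
    return {n: median_latency_ms(lambda n=n: run_batch(n)) for n in batch_sizes}
```

When timing on a GPU, `run_batch` should synchronize the device (for example with `torch.cuda.synchronize()`) before returning, because CUDA kernels are launched asynchronously and the timer would otherwise stop too early.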