Typo in Model card
#1
by quaeast - opened
On the 9th line of the Python usage snippet, the separator between the two sentences is </s></s>. Should it be </s><s>?
inputs = tokenizer(["</s></s>".join(input_pair) for input_pair in input_pairs], return_tensors="pt")
Hi! Sorry we forgot to answer 🙏🏻
Actually, for NLI tasks that doesn't seem to be the case. You can verify this by passing the list of (premise, hypothesis) tuples directly to the tokenizer and then checking the resulting input_ids:
input_pairs = [("I like this pizza.", "The sentence is positive."), ("I like this pizza.", "The sentence is negative.")]
inputs = tokenizer(input_pairs, return_tensors="pt")
# Output
#{'input_ids': tensor([[ 0, 1049, 2070, 2027, 10737, 1016, 2, 2, 2000, 6255,
# 2007, 3897, 1016, 2],
# [ 0, 1049, 2070, 2027, 10737, 1016, 2, 2, 2000, 6255,
# 2007, 5001, 1016, 2]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
# [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
If you check the model's vocab.txt file, you'll see that token id 2 is </s>.
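To make the layout explicit, here is a minimal pure-Python sketch of how the pair appears to be assembled, judging from the output above: <s> premise </s></s> hypothesis </s>. The helper name build_pair_ids is hypothetical (it is not part of Transformers), the id 0 for <s> is an assumption read off the first position of the output, and the premise and hypothesis ids are simply copied from the first row of the tensor.

```python
# Hypothetical helper, not a Transformers API: rebuilds the special-token
# layout observed in the tokenizer output above.
BOS_ID = 0  # assumed to be <s>, since every row of input_ids starts with 0
EOS_ID = 2  # </s>, as listed in vocab.txt


def build_pair_ids(premise_ids, hypothesis_ids):
    """Return <s> premise </s></s> hypothesis </s> as a list of token ids."""
    return (
        [BOS_ID]
        + list(premise_ids)
        + [EOS_ID, EOS_ID]  # two </s> tokens separate the two sentences
        + list(hypothesis_ids)
        + [EOS_ID]
    )


# Token ids copied from the first row of the output above.
premise_ids = [1049, 2070, 2027, 10737, 1016]      # "I like this pizza."
hypothesis_ids = [2000, 6255, 2007, 3897, 1016]    # "The sentence is positive."

pair_ids = build_pair_ids(premise_ids, hypothesis_ids)
print(pair_ids)
```

The printed list should match the first row of input_ids above, which is why joining the two sentences with </s></s> in the model card snippet gives the same separator as passing the tuples directly.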
Cheers!