This model is an extractive question-answering (QA) model. It is a fine-tuned version of all-MiniLM-L12-v2, trained on the following datasets: squad_v2, LLukas22/nq-simplified, newsqa, LLukas22/NLQuAD, deepset/germanquad.
You can use the model like this:
from transformers import pipeline
# Build a question-answering pipeline from the model and its tokenizer
model_name = "LLukas22/all-MiniLM-L12-v2-qa-all"
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    "question": "What's my name?",
    "context": "My name is Clara and I live in Berkeley."
}
# Make predictions
result = nlp(QA_input)
print(result)
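The pipeline returns a dictionary with the keys `score`, `start`, `end`, and `answer`, where `start` and `end` are character offsets into the context. The following is a minimal sketch of how such a result could be post-processed. The helper name `describe_answer` and the `min_score` cut-off are hypothetical, and the result dictionary in the example is hand-written for illustration, not actual model output.

```python
def describe_answer(result: dict, context: str, min_score: float = 0.5) -> str:
    """Format a question-answering pipeline result for display.

    `result` is expected to carry the keys "score", "start", "end" and "answer".
    `min_score` is a hypothetical threshold, not a value from the model card.
    """
    answer = result["answer"]
    # Reject empty answers and answers below the (assumed) confidence threshold.
    if not answer or result["score"] < min_score:
        return "no confident answer"
    # "start" and "end" are character offsets, so slicing the context
    # recovers the answer text exactly as it appears in the source passage.
    span = context[result["start"]:result["end"]]
    return f"{span!r} (score={result['score']:.2f}, chars {result['start']}-{result['end']})"


context = "My name is Clara and I live in Berkeley."
# Hand-written placeholder result, NOT real model output.
placeholder = {"score": 0.9, "start": 11, "end": 16, "answer": "Clara"}
print(describe_answer(placeholder, context))
```

Because squad_v2 contains unanswerable questions, it may be worth passing `handle_impossible_answer=True` when calling the pipeline, which allows it to return an empty answer. The helper above treats an empty answer as "no confident answer".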
Alternatively you can load the model and tokenizer on their own:
from transformers import AutoModelForQuestionAnswering, AutoTokenizer
# Load the model and tokenizer
model_name = "LLukas22/all-MiniLM-L12-v2-qa-all"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
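When the model and tokenizer are loaded on their own, the answer span has to be extracted from the start and end logits by hand. The sketch below shows one simplified way to do that. The function names `best_span` and `answer_question` and the `max_answer_len` default are assumptions for illustration. It does not mask question tokens or handle unanswerable questions, which the pipeline does internally.

```python
from typing import List, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 30) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined logit.

    Only spans with start <= end and at most `max_answer_len` tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for s, s_logit in enumerate(start_logits):
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


def answer_question(question: str, context: str,
                    model_name: str = "LLukas22/all-MiniLM-L12-v2-qa-all") -> str:
    """Run the model on one question/context pair and decode the best span."""
    # Imported lazily so best_span can be used without torch/transformers installed.
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    inputs = tokenizer(question, context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    start, end = best_span(outputs.start_logits[0].tolist(),
                           outputs.end_logits[0].tolist())
    # Decode the selected token range back into text.
    answer_ids = inputs["input_ids"][0][start:end + 1]
    return tokenizer.decode(answer_ids, skip_special_tokens=True)
```

Calling `answer_question("What's my name?", "My name is Clara and I live in Berkeley.")` downloads the model on first use. Searching over valid spans, instead of taking the argmax of each logit vector separately, avoids spans whose end lies before their start.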
Training and validation loss per epoch:
| Epoch | Train Loss | Validation Loss |
|---|---|---|
| 0 | 3.76 | 3.02 |
| 1 | 2.57 | 2.23 |
| 2 | 2.2 | 2.08 |
| 3 | 2.07 | 2.03 |
| 4 | 1.96 | 1.97 |
| 5 | 1.87 | 1.93 |
| 6 | 1.81 | 1.91 |
| 7 | 1.77 | 1.89 |
| 8 | 1.73 | 1.89 |
| 9 | 1.7 | 1.9 |
| 10 | 1.68 | 1.9 |
| 11 | 1.67 | 1.9 |
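As a small worked example, the loss values from the table above can be used to find the epoch with the lowest validation loss and the point where validation loss starts to exceed training loss. This is only an illustrative reading of the table. The source does not state which checkpoint was selected or whether early stopping was used.

```python
# Loss values copied from the table above, indexed by epoch (0 to 11).
train_loss = [3.76, 2.57, 2.2, 2.07, 1.96, 1.87, 1.81, 1.77, 1.73, 1.7, 1.68, 1.67]
val_loss = [3.02, 2.23, 2.08, 2.03, 1.97, 1.93, 1.91, 1.89, 1.89, 1.9, 1.9, 1.9]

# First epoch that reaches the minimum validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Gap between validation and training loss; a growing positive gap
# suggests the model is starting to fit the training set more closely.
gap = [round(v - t, 2) for t, v in zip(train_loss, val_loss)]
first_positive_gap = next(i for i, g in enumerate(gap) if g > 0)

print(f"lowest validation loss: {val_loss[best_epoch]} at epoch {best_epoch}")
print(f"validation loss first exceeds training loss at epoch {first_positive_gap}")
```

Validation loss bottoms out at 1.89 in epoch 7 and stays flat afterwards, while training loss keeps decreasing.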

F1 and exact match per epoch:

| Epoch | F1 | Exact match |
|---|---|---|
| 0 | 0.29 | 0.228 |
| 1 | 0.371 | 0.329 |
| 2 | 0.413 | 0.369 |
| 3 | 0.437 | 0.376 |
| 4 | 0.454 | 0.388 |
| 5 | 0.468 | 0.4 |
| 6 | 0.479 | 0.408 |
| 7 | 0.487 | 0.415 |
| 8 | 0.495 | 0.421 |
| 9 | 0.501 | 0.416 |
| 10 | 0.506 | 0.42 |
| 11 | 0.51 | 0.421 |
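The model card does not say how F1 and exact match were computed. Assuming the common SQuAD-style definitions (token-overlap F1 and normalized string equality), a minimal sketch for a single prediction could look like this. The function names are illustrative and this is not necessarily the evaluation code used for the table above.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    # If either side is empty (e.g. an unanswerable question),
    # the score is 1.0 only when both are empty.
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Clara", "clara."))
print(f1_score("Clara Smith", "Clara"))
```

Under these definitions, F1 gives partial credit for overlapping tokens, which is consistent with the F1 column being higher than the exact match column at every epoch.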
This model was trained as part of my master's thesis, 'Evaluation of transformer based language models for use in service information systems'. The source code is available on GitHub.