deberta-v3-large for Extractive QA
This is the deberta-v3-large model, fine-tuned using the SQuAD2.0 dataset. It's been trained on question-answer pairs, including unanswerable questions, for the task of Extractive Question Answering.
This model was trained using LoRA available through the PEFT library.
Overview
Language model: deberta-v3-large
Language: English
Downstream-task: Extractive QA
Training data: SQuAD 2.0
Eval data: SQuAD 2.0
Infrastructure: 1x NVIDIA 3070
Model Usage
Using Transformers
This uses the merged weights (base model weights + LoRA weights) to allow for simple use in Transformers pipelines. It has the same performance as using the weights separately when using the PEFT library.
import torch
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    pipeline,
)
model_name = "sjrhuschlee/deberta-v3-large-squad2"
# a) Using pipelines
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
qa_input = {
    'question': 'Where do I live?',
    'context': 'My name is Sarah and I live in London'
}
res = nlp(qa_input)
# {'score': 0.984, 'start': 30, 'end': 37, 'answer': ' London'}
# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
question = 'Where do I live?'
context = 'My name is Sarah and I live in London'
encoding = tokenizer(question, context, return_tensors="pt")
start_scores, end_scores = model(
    encoding["input_ids"],
    attention_mask=encoding["attention_mask"],
    return_dict=False
)
# Map the highest-scoring start and end positions back to tokens
all_tokens = tokenizer.convert_ids_to_tokens(encoding["input_ids"][0].tolist())
answer_tokens = all_tokens[torch.argmax(start_scores):torch.argmax(end_scores) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
# 'London'
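Taking the argmax of the start and end scores independently, as in option b), can yield an end position that precedes the start position, and it does not say when the model considers a question unanswerable. The following is a minimal sketch of joint span selection in plain Python. It assumes the common SQuAD 2.0 convention that position 0 (the [CLS] token) stands for "no answer"; the names select_span, max_answer_len, and null_threshold are illustrative and not part of Transformers. A fuller implementation would also restrict candidates to context tokens, which is omitted here.

```python
from typing import List, Optional, Tuple


def select_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
    null_threshold: float = 0.0,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the best span, or None for 'no answer'."""
    # Score of the "no answer" prediction (assumed to live at index 0).
    null_score = start_logits[0] + end_logits[0]

    best_score = float("-inf")
    best_span = None
    for start in range(1, len(start_logits)):
        # Only consider spans with start <= end and at most max_answer_len tokens.
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_logits[start] + end_logits[end]
            if score > best_score:
                best_score, best_span = score, (start, end)

    # Predict "no answer" when the null score beats the best span by more than the threshold.
    if best_span is None or null_score - best_score > null_threshold:
        return None
    return best_span


# Toy logits for a 6-token input; index 0 plays the role of [CLS].
toy_start = [0.1, 0.2, 0.3, 5.0, 0.4, 0.1]
toy_end = [0.1, 0.3, 0.2, 0.5, 6.0, 0.2]
print(select_span(toy_start, toy_end))  # (3, 4)

# A confident "no answer": the null score dominates every candidate span.
print(select_span([9.0, 0.1, 0.2], [9.0, 0.3, 0.1]))  # None
```

With real model outputs, the two lists would come from start_scores[0].tolist() and end_scores[0].tolist(), and the returned indices would slice all_tokens in the same way as above.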
Metrics
# SQuAD v2
{
    "eval_HasAns_exact": 84.83468286099865,
    "eval_HasAns_f1": 90.48374860633226,
    "eval_HasAns_total": 5928,
    "eval_NoAns_exact": 91.0681244743482,
    "eval_NoAns_f1": 91.0681244743482,
    "eval_NoAns_total": 5945,
    "eval_best_exact": 87.95586625115808,
    "eval_best_exact_thresh": 0.0,
    "eval_best_f1": 90.77635490089573,
    "eval_best_f1_thresh": 0.0,
    "eval_exact": 87.95586625115808,
    "eval_f1": 90.77635490089592,
    "eval_runtime": 623.1333,
    "eval_samples": 11951,
    "eval_samples_per_second": 19.179,
    "eval_steps_per_second": 0.799,
    "eval_total": 11873
}
# SQuAD
{
    "eval_exact_match": 89.29044465468307,
    "eval_f1": 94.9846365606959,
    "eval_runtime": 553.7132,
    "eval_samples": 10618,
    "eval_samples_per_second": 19.176,
    "eval_steps_per_second": 0.8
}
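As a reading aid for the SQuAD v2 numbers, the overall eval_exact and eval_f1 are the example-weighted averages of the HasAns and NoAns subsets. The short check below recomputes them from the reported values; the variable names are illustrative. NoAns exact and F1 coincide because an unanswerable question scores either 1 (empty prediction) or 0 on both metrics.

```python
# Values copied from the SQuAD v2 metrics reported above.
has_ans_exact, has_ans_f1, has_ans_total = 84.83468286099865, 90.48374860633226, 5928
no_ans_exact, no_ans_f1, no_ans_total = 91.0681244743482, 91.0681244743482, 5945

total = has_ans_total + no_ans_total  # 11873, matches eval_total

# Weight each subset score by the number of questions in that subset.
overall_exact = (has_ans_exact * has_ans_total + no_ans_exact * no_ans_total) / total
overall_f1 = (has_ans_f1 * has_ans_total + no_ans_f1 * no_ans_total) / total

print(round(overall_exact, 3))  # 87.956
print(round(overall_f1, 3))     # 90.776
```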
Using with PEFT
NOTE: This requires code in the PR https://github.com/huggingface/peft/pull/473 for the PEFT library.
#!pip install peft
from peft import LoraConfig, PeftModelForQuestionAnswering
from transformers import AutoModelForQuestionAnswering, AutoTokenizer
model_name = "sjrhuschlee/deberta-v3-large-squad2"
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 24
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 1
- total_train_batch_size: 24
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4.0
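To illustrate how learning_rate, lr_scheduler_type, and lr_scheduler_warmup_ratio interact, here is a minimal plain-Python sketch of a linear schedule with warmup. It assumes the usual behaviour of such schedules (linear ramp from 0 to the peak rate over the warmup steps, then linear decay to 0); the function name linear_lr and the total step count are hypothetical, since the card does not state the number of optimizer steps.

```python
import math


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5,
              warmup_ratio: float = 0.1) -> float:
    """Learning rate at a given optimizer step for linear warmup plus linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at the final step.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


total_steps = 1000  # hypothetical; not the step count of the actual training run
for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, total_steps))
```

With these settings the rate peaks at 5e-05 once the first 10% of steps have passed and reaches zero at the last step.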
LoRA Config
{
    "base_model_name_or_path": "microsoft/deberta-v3-large",
    "bias": "none",
    "fan_in_fan_out": false,
    "inference_mode": true,
    "init_lora_weights": true,
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "modules_to_save": ["qa_outputs"],
    "peft_type": "LORA",
    "r": 8,
    "target_modules": [
        "query_proj",
        "key_proj",
        "value_proj",
        "dense"
    ],
    "task_type": "QUESTION_ANS"
}
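The r and lora_alpha values determine how strongly the adapter contributes: in LoRA the low-rank update is scaled by lora_alpha / r, which is 4 here. The sketch below uses toy matrices in plain Python to show why merged weights behave the same as base weights plus a separate adapter at inference time, when dropout is inactive. The matrix sizes and helper names are illustrative assumptions and not taken from the model.

```python
import random

r, lora_alpha = 8, 32        # from the LoRA config above
scaling = lora_alpha / r     # 4.0
d_in, d_out = 4, 3           # toy sizes, not the real hidden size

random.seed(0)


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


W = rand_matrix(d_out, d_in)  # frozen base weight
A = rand_matrix(r, d_in)      # LoRA down-projection
B = rand_matrix(d_out, r)     # LoRA up-projection (random here to mimic a trained adapter)
x = [random.uniform(-1, 1) for _ in range(d_in)]

# Separate path: base output plus the scaled adapter output.
separate = [b + scaling * a for b, a in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged path: fold the adapter into the weight, W' = W + scaling * (B @ A).
BA = matmul(B, A)
W_merged = [[W[i][j] + scaling * BA[i][j] for j in range(d_in)] for i in range(d_out)]
merged = matvec(W_merged, x)

max_diff = max(abs(s - m) for s, m in zip(separate, merged))
print(max_diff < 1e-9)  # True
```

The two paths agree up to floating-point rounding, which is why a single merged checkpoint can be loaded without the PEFT library.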
Framework versions
- Transformers 4.30.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3
Evaluation results
- Exact Match on squad_v2 (validation set, self-reported): 87.956
- F1 on squad_v2 (validation set, self-reported): 90.781
- Exact Match on squad (validation set, self-reported): 89.290
- F1 on squad (validation set, self-reported): 95.008
- Exact Match on adversarial_qa (validation set, self-reported): 41.400
- F1 on adversarial_qa (validation set, self-reported): 55.676
- Exact Match on squad_adversarial (validation set, self-reported): 83.660
- F1 on squad_adversarial (validation set, self-reported): 89.451