# DistilBERT Fine-Tuned on SQuAD for Extractive QA

## Model Description
`distilbert-base-uncased` fine-tuned on a 5,000-sample subset of SQuAD v1.1 for extractive question answering: given a question and a context passage, the model predicts the span of the context that answers the question.

## Training Details

- Base model: `distilbert-base-uncased`
- Dataset: SQuAD v1.1 (5,000 samples, 80/20 train/test split)
- Epochs: 3
- Learning rate: 2e-5
- Batch size: 16
- Final training loss: 2.303
- Device: Apple Silicon (MPS backend)
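
The sketch below shows how a comparable fine-tune can be reproduced with the Hugging Face `Trainer` API. The hyperparameters match this card; the preprocessing is the standard SQuAD span-labeling recipe and may differ in detail from the actual training script.

```python
# Sketch: reproduce a comparable fine-tune with the standard SQuAD recipe.
# Hyperparameters follow the card; the preprocessing is the generic
# offset-mapping approach, not necessarily the exact script used here.
from datasets import load_dataset
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")

# 5,000 SQuAD samples with an 80/20 train/test split, as described above
squad = load_dataset("squad", split="train[:5000]").train_test_split(test_size=0.2)

def preprocess(examples):
    inputs = tokenizer(
        examples["question"],
        examples["context"],
        max_length=384,
        truncation="only_second",  # truncate the context, never the question
        padding="max_length",
        return_offsets_mapping=True,
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(inputs["offset_mapping"]):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        seq_ids = inputs.sequence_ids(i)
        # Token range of the context within the packed question+context input
        ctx_start = seq_ids.index(1)
        ctx_end = len(seq_ids) - 1 - seq_ids[::-1].index(1)
        if offsets[ctx_start][0] > start_char or offsets[ctx_end][1] < end_char:
            # Answer was truncated away: point both labels at [CLS]
            start_positions.append(0)
            end_positions.append(0)
        else:
            # Map the answer's character offsets to token indices
            idx = ctx_start
            while idx <= ctx_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = ctx_end
            while idx >= ctx_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)
    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    inputs.pop("offset_mapping")
    return inputs

tokenized = squad.map(preprocess, batched=True, remove_columns=squad["train"].column_names)

trainer = Trainer(
    model=model,  # Trainer picks up MPS automatically on Apple Silicon
    args=TrainingArguments(
        output_dir="distilbert-squad-qa",
        learning_rate=2e-5,
        per_device_train_batch_size=16,
        num_train_epochs=3,
    ),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)
trainer.train()
```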

## Usage

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch

tokenizer = AutoTokenizer.from_pretrained("John-Machado/distilbert-squad-qa")
model = AutoModelForQuestionAnswering.from_pretrained("John-Machado/distilbert-squad-qa")

question = "How many official league titles has Juventus won?"
context = "The club has won 36 official league titles, 14 Coppa Italia titles and nine Supercoppa Italiana titles."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Decode the highest-scoring answer span from the start/end logits
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
answer = tokenizer.decode(inputs.input_ids[0, start : end + 1])
print(answer)  # "36"
```
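
For quick experiments, the `question-answering` pipeline wraps the same tokenization and span decoding in a single call:

```python
from transformers import pipeline

qa = pipeline("question-answering", model="John-Machado/distilbert-squad-qa")
result = qa(
    question="How many official league titles has Juventus won?",
    context="The club has won 36 official league titles, 14 Coppa Italia "
            "titles and nine Supercoppa Italiana titles.",
)
print(result["answer"])  # "36"; result["score"] holds the confidence
```

Unlike the raw argmax decoding above, the pipeline scores start/end pairs jointly and only considers valid spans (start before end), so it is the safer choice outside of quick demos.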

## Limitations

- Trained on only 5,000 of SQuAD's 87,599 training samples, as a learning exercise
- As an extractive model, it can only return answers that appear verbatim in the provided context
- It cannot synthesize an answer from information spread across multiple passages