DistilBERT Fine-Tuned on SQuAD for Extractive QA

Model Description

DistilBERT base uncased fine-tuned on a 5,000-sample subset of SQuAD for extractive question answering.

Training Details

  • Base model: distilbert-base-uncased
  • Dataset: SQuAD (5,000 samples, 80/20 train/test split)
  • Epochs: 3
  • Learning rate: 2e-5
  • Batch size: 16
  • Training loss: 2.303
  • Device: Apple Silicon (MPS)
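The hyperparameters above imply a concrete training schedule. A minimal sketch deriving it (assumes the stated 80/20 split and that the last partial batch is kept):

```python
# Derive the training schedule implied by the hyperparameters above.
total_samples = 5_000
train_samples = int(total_samples * 0.8)      # 4,000 train
test_samples = total_samples - train_samples  # 1,000 test
batch_size = 16
epochs = 3

steps_per_epoch = -(-train_samples // batch_size)  # ceil division
total_steps = steps_per_epoch * epochs

print(train_samples, test_samples, steps_per_epoch, total_steps)
# 4000 1000 250 750
```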

Usage

from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch

tokenizer = AutoTokenizer.from_pretrained("John-Machado/distilbert-squad-qa")
model = AutoModelForQuestionAnswering.from_pretrained("John-Machado/distilbert-squad-qa")

question = "How many official league titles has Juventus won?"
context = "The club has won 36 official league titles, 14 Coppa Italia titles and nine Supercoppa Italiana titles."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
answer = tokenizer.decode(inputs.input_ids[0, start : end + 1])
print(answer)  # "36"
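Note that taking the argmax of the start and end logits independently, as above, can occasionally yield an invalid span where the end position precedes the start. A small model-free sketch of constrained span decoding, using toy logits to show the difference:

```python
# Span decoding with the start <= end constraint that independent
# argmax can violate. Toy logits only; no model required.
start_logits = [0.1, 0.2, 5.0, 0.3, 0.1]  # naive argmax -> 2
end_logits   = [6.0, 0.1, 0.2, 4.0, 0.3]  # naive argmax -> 0 (before start!)

best_score, best_span = float("-inf"), (0, 0)
for s, s_logit in enumerate(start_logits):
    # Only consider ends at or after the start, capped at a max answer length.
    for e in range(s, min(s + 30, len(end_logits))):
        score = s_logit + end_logits[e]
        if score > best_score:
            best_score, best_span = score, (s, e)

print(best_span)  # (2, 3): best valid span, unlike the naive (2, 0)
```

Production pipelines (e.g. the Transformers question-answering pipeline) apply this kind of constraint internally.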

Limitations

  • Trained on only 5,000 SQuAD samples (out of 87,599) as a learning exercise
  • Answers must exist as a verbatim substring in the context
  • Cannot synthesize answers across multiple passages
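To illustrate the verbatim-substring constraint: SQuAD-style preprocessing locates each gold answer as a character span inside the context, so any answer that cannot be found this way is untrainable for this model. A minimal sketch:

```python
# The extractive constraint in practice: the gold answer must be
# locatable as a verbatim character span within the context.
context = "The club has won 36 official league titles."
answer = "36"

start_char = context.find(answer)   # -1 would mean "not extractable"
end_char = start_char + len(answer)

print(start_char, end_char)  # 17 19
```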