DistilBERT Fine-Tuned on SQuAD for Extractive QA

Model Description

DistilBERT base uncased fine-tuned on a 5,000-sample subset of SQuAD for extractive question answering.

Training Details

  • Base model: distilbert-base-uncased
  • Dataset: SQuAD (5,000 samples, 80/20 train/test split)
  • Epochs: 3
  • Learning rate: 2e-5
  • Batch size: 16
  • Training loss: 2.303
  • Device: Apple Silicon (MPS)
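The hyperparameters above imply a concrete training schedule. A minimal sketch deriving it (assumes the stated 80/20 split and that the last partial batch is kept):

```python
# Derive the training schedule implied by the hyperparameters above.
total_samples = 5_000
train_samples = int(total_samples * 0.8)      # 4,000 train
test_samples = total_samples - train_samples  # 1,000 test
batch_size = 16
epochs = 3

steps_per_epoch = -(-train_samples // batch_size)  # ceil division
total_steps = steps_per_epoch * epochs

print(train_samples, test_samples, steps_per_epoch, total_steps)
# 4000 1000 250 750
```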

Usage

from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch

tokenizer = AutoTokenizer.from_pretrained("John-Machado/distilbert-squad-qa")
model = AutoModelForQuestionAnswering.from_pretrained("John-Machado/distilbert-squad-qa")

question = "How many official league titles has Juventus won?"
context = "The club has won 36 official league titles, 14 Coppa Italia titles and nine Supercoppa Italiana titles."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
answer = tokenizer.decode(inputs.input_ids[0, start : end + 1])
print(answer)  # "36"
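Note that taking the argmax of the start and end logits independently, as above, can occasionally yield an invalid span where the end position precedes the start. A small model-free sketch of constrained span decoding, using toy logits to show the difference:

```python
# Span decoding with the start <= end constraint that independent
# argmax can violate. Toy logits only; no model required.
start_logits = [0.1, 0.2, 5.0, 0.3, 0.1]  # naive argmax -> 2
end_logits   = [6.0, 0.1, 0.2, 4.0, 0.3]  # naive argmax -> 0 (before start!)

best_score, best_span = float("-inf"), (0, 0)
for s, s_logit in enumerate(start_logits):
    # Only consider ends at or after the start, capped at a max answer length.
    for e in range(s, min(s + 30, len(end_logits))):
        score = s_logit + end_logits[e]
        if score > best_score:
            best_score, best_span = score, (s, e)

print(best_span)  # (2, 3): best valid span, unlike the naive (2, 0)
```

Production pipelines (e.g. the Transformers question-answering pipeline) apply this kind of constraint internally.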

Limitations

  • Trained on only 5,000 SQuAD samples (out of 87,599) as a learning exercise
  • Answers must exist as a verbatim substring in the context
  • Cannot synthesize answers across multiple passages
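To illustrate the verbatim-substring constraint: SQuAD-style preprocessing locates each gold answer as a character span inside the context, so any answer that cannot be found this way is untrainable for this model. A minimal sketch:

```python
# The extractive constraint in practice: the gold answer must be
# locatable as a verbatim character span within the context.
context = "The club has won 36 official league titles."
answer = "36"

start_char = context.find(answer)   # -1 would mean "not extractable"
end_char = start_char + len(answer)

print(start_char, end_char)  # 17 19
```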