Fine-tuned MiniLM on SQuAD v2 – Question Answering

This model is a fine-tuned version of deepset/minilm-uncased-squad2 on the SQuAD v2 dataset for extractive question answering.

Trained on a subset of 20,000 training examples and evaluated on 2,000 validation examples.

Model Details

  • Base model: deepset/minilm-uncased-squad2
  • Language: English
  • License: Apache 2.0
  • Developed by: qusai00
  • Finetuned from: deepset/minilm-uncased-squad2 (already fine-tuned on SQuAD v2)
  • Training data: SQuAD v2 (20k train subset)
  • Evaluation data: SQuAD v2 validation subset (2k examples)

Quick Usage

from transformers import pipeline

qa = pipeline("question-answering", model="qusai00/minilm-squad2-finetuned-v1")

context = "The Virgin Mary allegedly appeared to Saint Bernadette Soubirous in 1858 in Lourdes, France."
question = "To whom did the Virgin Mary appear in 1858?"

result = qa(question=question, context=context)
print(result)
# Example output (score illustrative):
# {'score': 0.9919, 'start': 38, 'end': 64, 'answer': 'Saint Bernadette Soubirous'}
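Under the hood, extractive QA models assign every token a start logit and an end logit, and the pipeline returns the span with the highest combined score. A minimal pure-Python sketch of that selection step (the logits below are dummy values for illustration, not real model outputs):

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Pick the (start, end) token pair with the highest combined logit,
    subject to start <= end and a maximum answer length."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_logit in enumerate(start_logits):
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best, best_score

# Dummy logits for a 6-token sequence: token 2 is the likely start, token 4 the end.
start_logits = [0.1, 0.2, 5.0, 0.3, 0.1, 0.0]
end_logits   = [0.0, 0.1, 0.2, 1.0, 4.5, 0.3]
span, score = best_span(start_logits, end_logits)
print(span)  # (2, 4)
```

The real pipeline additionally maps token indices back to character offsets in the context to produce the `start`/`end` fields shown above.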


=======================================================

Training Setup

  • Framework: Hugging Face Transformers
  • Hardware: NVIDIA RTX 4080, 64 GB RAM
  • Precision: fp16 mixed precision

Hyperparameters:

  • Learning rate: 3e-5
  • Batch size: 16
  • Epochs: 35 (with early stopping)
  • Weight decay: 0.01
  • Max sequence length: 384
  • Stride: 128
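With a max sequence length of 384 and a stride of 128, contexts longer than the model window are split into overlapping chunks so an answer span near a boundary is never lost. A simplified token-level sketch of that windowing (toy numbers; in practice the tokenizer does this for you via `return_overflowing_tokens`):

```python
def sliding_windows(tokens, max_len, stride):
    """Split a token list into overlapping chunks: each new chunk
    starts (max_len - stride) tokens after the previous one, so
    consecutive chunks share `stride` tokens of overlap."""
    chunks = []
    step = max_len - stride
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
    return chunks

tokens = list(range(1000))  # stand-in for a long tokenized context
chunks = sliding_windows(tokens, max_len=384, stride=128)
print(len(chunks))    # 4
print(chunks[1][0])   # 256 -- second chunk starts 384 - 128 tokens in
```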

=======================================================

Performance

Qualitative results (manual evaluation):

  • High-confidence answers on factual questions: historical and geographical questions yield correct short answers with confidence > 0.95.
  • Minor issues: the model occasionally adds small extra words (e.g., "Paris is" instead of "Paris"), a common failure mode of small distilled models.

Quantitative metrics (F1 / Exact Match) were not computed in the final run due to the evaluation setup, but manual testing shows strong performance for the training size.
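SQuAD-style Exact Match and token-level F1 can be computed offline. A minimal sketch of both metrics (simplified normalization; the official SQuAD v2 evaluation additionally strips articles and punctuation and handles unanswerable questions):

```python
from collections import Counter

def f1_score(prediction, truth):
    """Token-level F1 between a predicted and a gold answer string."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    common = Counter(pred_tokens) & Counter(true_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)

def exact_match(prediction, truth):
    """1.0 if the normalized strings match exactly, else 0.0."""
    return float(prediction.lower().strip() == truth.lower().strip())

print(f1_score("Saint Bernadette Soubirous", "Saint Bernadette"))  # 0.8
print(exact_match("Paris is", "Paris"))                            # 0.0
```

Note how the "extra word" failure mode described above still earns partial F1 credit but zero Exact Match.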
Intended Use & Limitations

  • Intended: extractive QA on English factual text, where the answer span appears inside the provided context.
  • Not intended for: generative answers, non-English text, very long documents, or open-domain QA.

Limitations:

  • Trained on a 20k-example subset only, so it may struggle with rare topics or very complex contexts.
  • Small distilled model, so occasional minor word additions or misses.
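Because SQuAD v2 includes unanswerable questions, models trained on it also score a "no answer" option (the [CLS] position); a span prediction is kept only if it beats that null score by a threshold. A pure-Python sketch of that decision (toy scores; with the `transformers` pipeline this corresponds to passing `handle_impossible_answer=True`):

```python
def decide_answer(best_span_score, null_score, threshold=0.0):
    """Return True if the predicted span should be kept,
    False if the question should be treated as unanswerable."""
    return best_span_score - null_score > threshold

print(decide_answer(best_span_score=9.5, null_score=2.0))  # True
print(decide_answer(best_span_score=1.0, null_score=3.5))  # False
```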

=======================================================

Citation

@misc{qusai00-minilm-squad2,
  author       = {Qusai},
  title        = {Fine-tuned MiniLM on SQuAD v2 for Question Answering},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/qusai00/minilm-squad2-finetuned-v1}}
}


Thank you for using the model!
Feedback, questions, or suggestions are welcome.