# my_awesome_qa_model
This model is a fine-tuned version of distilbert/distilbert-base-uncased on the SQuAD dataset. It achieves the following results on the evaluation set:
- Loss: 1.7431
## Model description
This is a DistilBERT model fine-tuned for extractive question answering on the SQuAD dataset. It extracts answers from a given context based on a question.
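As a quick illustration, the model can be used with the `question-answering` pipeline. A minimal sketch, assuming the checkpoint is published on the Hub under the repo id shown on this card; the context and question below are illustrative:

```python
from transformers import pipeline

# Minimal usage sketch; the repo id is taken from this card and assumes
# the checkpoint is available on the Hugging Face Hub.
question_answerer = pipeline("question-answering", model="Titembaye/my_awesome_qa_model")

context = (
    "The Amazon rainforest is a moist broadleaf forest that covers "
    "most of the Amazon basin of South America."
)
result = question_answerer(
    question="What kind of forest is the Amazon rainforest?",
    context=context,
)
# The pipeline returns a dict with the answer span and its confidence,
# e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'moist broadleaf forest'}
print(result)
```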
## Intended uses & limitations
- Intended uses: Suitable for research and applications requiring extractive QA, such as answering questions based on provided text (e.g., educational tools, chatbots).
- Limitations: The model may struggle with out-of-domain data or complex reasoning. It was trained on SQuAD, so performance may vary on other datasets. No F1 or Exact Match metrics are available yet (see training procedure); a sketch of how they could be computed follows this list.
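For reference, Exact Match and F1 on SQuAD can be computed with the `evaluate` library once predictions are collected. A minimal sketch; the example id and answer text are illustrative, not outputs of this model:

```python
import evaluate

# Hedged sketch: compute SQuAD Exact Match / F1 from collected predictions.
# The id and answer below are illustrative placeholders.
squad_metric = evaluate.load("squad")

predictions = [{"id": "example-0", "prediction_text": "moist broadleaf forest"}]
references = [{
    "id": "example-0",
    "answers": {"text": ["moist broadleaf forest"], "answer_start": [27]},
}]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```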
## Training and evaluation data
The model was trained and evaluated on the SQuAD dataset, which contains questions and answers extracted from Wikipedia articles.
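For orientation, the dataset can be inspected with the `datasets` library; a minimal sketch using the standard `squad` Hub id:

```python
from datasets import load_dataset

# Load SQuAD from the Hugging Face Hub; it ships 'train' and 'validation' splits.
squad = load_dataset("squad")
print(squad)  # DatasetDict with 'train' and 'validation'

example = squad["train"][0]
# Each example has 'id', 'title', 'context', 'question', and 'answers'
# (answer texts plus character start offsets into the context).
print(example["question"])
print(example["answers"])
```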
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hypothetical `TrainingArguments` reconstruction follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
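These settings roughly correspond to the following `TrainingArguments`. This is a hypothetical reconstruction, not the actual training script: `output_dir` and `eval_strategy` are assumptions, while the remaining values mirror the list above.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
# output_dir and eval_strategy are assumptions; the rest mirror the card.
training_args = TrainingArguments(
    output_dir="my_awesome_qa_model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    seed=42,
    optim="adamw_torch_fused",
    lr_scheduler_type="linear",
    eval_strategy="epoch",  # evaluate once per epoch, matching the results table
)
```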
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 1.0 | 250 | 2.6253 |
| 2.9024 | 2.0 | 500 | 1.8672 |
| 2.9024 | 3.0 | 750 | 1.7431 |
### Framework versions
- Transformers 4.56.2
- PyTorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1