my_awesome_qa_model

This model is a fine-tuned version of distilbert/distilbert-base-uncased on the SQuAD dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7431
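For Transformers question-answering heads, the reported loss is the mean of the cross-entropy over the predicted start and end token positions. A minimal sketch of that computation with toy tensors (all values hypothetical):

```python
import torch
import torch.nn.functional as F

# Toy logits for a batch of 2 contexts of length 8 (hypothetical values).
start_logits = torch.randn(2, 8)
end_logits = torch.randn(2, 8)
start_positions = torch.tensor([1, 3])  # gold answer start indices
end_positions = torch.tensor([2, 5])    # gold answer end indices

# Same formula used by DistilBertForQuestionAnswering when labels are given.
loss = (F.cross_entropy(start_logits, start_positions)
        + F.cross_entropy(end_logits, end_positions)) / 2
print(loss.item())
```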

Model description

This is a DistilBERT model fine-tuned for extractive question answering on the SQuAD dataset. Given a question and a context passage, it predicts the span of the context that answers the question.
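As a quick sanity check, the checkpoint can be queried with the Transformers pipeline API; a minimal sketch, assuming the model is published under Titembaye/my_awesome_qa_model (the repository name shown on this page):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint for extractive question answering.
qa = pipeline("question-answering", model="Titembaye/my_awesome_qa_model")

result = qa(
    question="What dataset was the model fine-tuned on?",
    context="This DistilBERT model was fine-tuned on the SQuAD dataset.",
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'SQuAD'}
```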

Intended uses & limitations

  • Intended uses: Suitable for research and applications requiring extractive QA, such as answering questions based on provided text (e.g., educational tools, chatbots).
  • Limitations: The model may struggle with out-of-domain data or complex reasoning. It was trained on SQuAD, so performance may vary on other datasets. No F1 or Exact Match metrics are available yet (see the training procedure below); a sketch for computing them follows this list.
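SQuAD-style Exact Match and F1 can be computed after the fact with the evaluate library; a minimal sketch using a single dummy prediction (the example id and texts are placeholders):

```python
import evaluate

squad_metric = evaluate.load("squad")

predictions = [{"id": "ex-1", "prediction_text": "Denver Broncos"}]
references = [{
    "id": "ex-1",
    "answers": {"text": ["Denver Broncos"], "answer_start": [177]},
}]

print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```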

Training and evaluation data

The model was trained and evaluated on the SQuAD dataset, a collection of crowd-sourced questions whose answers are spans of text drawn from Wikipedia articles.
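For reference, the dataset can be loaded directly with the datasets library (a sketch; the split sizes shown are those of SQuAD v1.1):

```python
from datasets import load_dataset

squad = load_dataset("squad")
print(squad)
# DatasetDict with train: 87,599 examples and validation: 10,570 examples
print(squad["train"][0]["question"])
print(squad["train"][0]["answers"])
```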

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reconstructing them follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
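
A minimal sketch of this configuration as Transformers TrainingArguments (output_dir is a placeholder; everything else mirrors the values above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="my_awesome_qa_model",  # placeholder output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
)
```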

Training results

Training Loss | Epoch | Step | Validation Loss
:-----------: | :---: | :--: | :-------------:
No log        | 1.0   | 250  | 2.6253
2.9024        | 2.0   | 500  | 1.8672
2.9024        | 3.0   | 750  | 1.7431

("No log" means no training loss had been recorded by that evaluation step. With a batch size of 16, 250 steps per epoch suggest roughly 4,000 training examples, assuming a single device and no gradient accumulation.)

Framework versions

  • Transformers 4.56.2
  • PyTorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1
Model size

  • 66.4M parameters (F32, Safetensors)