# my_awesome_qa_model
This model is a fine-tuned version of distilbert/distilbert-base-uncased on the SQuAD dataset. It achieves the following results on the evaluation set:
- Loss: 1.7431
## Model description
This is a DistilBERT model fine-tuned for extractive question answering on the SQuAD dataset. It extracts answers from a given context based on a question.
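As a quick illustration, the model can be used with the `question-answering` pipeline. A minimal sketch, assuming the checkpoint is published on the Hub under the repo id shown on this card; the context and question below are illustrative:

```python
from transformers import pipeline

# Minimal usage sketch; the repo id is taken from this card and assumes
# the checkpoint is available on the Hugging Face Hub.
question_answerer = pipeline("question-answering", model="Titembaye/my_awesome_qa_model")

context = (
    "The Amazon rainforest is a moist broadleaf forest that covers "
    "most of the Amazon basin of South America."
)
result = question_answerer(
    question="What kind of forest is the Amazon rainforest?",
    context=context,
)
# The pipeline returns a dict with the answer span and its confidence,
# e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'moist broadleaf forest'}
print(result)
```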
## Intended uses & limitations
- Intended uses: Suitable for research and applications requiring extractive QA, such as answering questions based on provided text (e.g., educational tools, chatbots).
- Limitations: The model may struggle with out-of-domain data or complex reasoning. It was trained on SQuAD, so performance may vary on other datasets. No F1 or Exact Match metrics are available yet (see training procedure); a sketch of how they could be computed follows this list.
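For reference, Exact Match and F1 on SQuAD can be computed with the `evaluate` library once predictions are collected. A minimal sketch; the example id and answer text are illustrative, not outputs of this model:

```python
import evaluate

# Hedged sketch: compute SQuAD Exact Match / F1 from collected predictions.
# The id and answer below are illustrative placeholders.
squad_metric = evaluate.load("squad")

predictions = [{"id": "example-0", "prediction_text": "moist broadleaf forest"}]
references = [{
    "id": "example-0",
    "answers": {"text": ["moist broadleaf forest"], "answer_start": [27]},
}]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```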
## Training and evaluation data
The model was trained and evaluated on the SQuAD dataset, which contains questions and answers extracted from Wikipedia articles.
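For orientation, the dataset can be inspected with the `datasets` library; a minimal sketch using the standard `squad` Hub id:

```python
from datasets import load_dataset

# Load SQuAD from the Hugging Face Hub; it ships 'train' and 'validation' splits.
squad = load_dataset("squad")
print(squad)  # DatasetDict with 'train' and 'validation'

example = squad["train"][0]
# Each example has 'id', 'title', 'context', 'question', and 'answers'
# (answer texts plus character start offsets into the context).
print(example["question"])
print(example["answers"])
```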
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hypothetical `TrainingArguments` reconstruction follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
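These settings roughly correspond to the following `TrainingArguments`. This is a hypothetical reconstruction, not the actual training script: `output_dir` and `eval_strategy` are assumptions, while the remaining values mirror the list above.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
# output_dir and eval_strategy are assumptions; the rest mirror the card.
training_args = TrainingArguments(
    output_dir="my_awesome_qa_model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    seed=42,
    optim="adamw_torch_fused",
    lr_scheduler_type="linear",
    eval_strategy="epoch",  # evaluate once per epoch, matching the results table
)
```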
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 1.0 | 250 | 2.6253 |
| 2.9024 | 2.0 | 500 | 1.8672 |
| 2.9024 | 3.0 | 750 | 1.7431 |
### Framework versions
- Transformers 4.56.2
- PyTorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1