Update README.md
---
library_name: transformers
license: apache-2.0
base_model: distilbert/distilbert-base-cased
tags:
- question-answering
- squadv2
- distilbert
- en
- transformer
- pytorch
model-index:
- name: a-question_answerer
  results: []
language:
- en
---

# a-question_answerer

This model is a fine-tuned version of [distilbert/distilbert-base-cased](https://huggingface.co/distilbert/distilbert-base-cased) on the SQuAD v2 dataset for extractive question answering.

It was trained as part of a Google Colab project aimed at adapting a pre-trained language model to answer questions based on a given text context.

It achieves the following results on the evaluation set:
- Loss: 1.3622*

## Model description

This model is intended for question answering tasks where the goal is to extract a concise answer span from a provided text context, given a natural language question. It can handle both answerable and unanswerable questions, following the SQuAD v2 dataset format.
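
The span-extraction step can be sketched as follows. This is an illustrative toy decoding routine, not the exact code used in the Colab project; integer logits are used so the example is deterministic (real logits are floats).

```python
# Toy sketch of extractive-QA decoding (not the exact code used here):
# given per-token start/end logits, pick the highest-scoring (start, end)
# span; the score at position 0 (the [CLS] token) acts as the "no answer"
# score, as in SQuAD v2-style models.
def best_span(start_logits, end_logits, max_answer_len=15):
    null_score = start_logits[0] + end_logits[0]
    best = (0, 0, null_score)  # (0, 0) means "no answer"
    for s in range(1, len(start_logits)):
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best[2]:
                best = (s, e, score)
    return best

print(best_span([1, 0, 4, 0], [1, 0, 0, 4]))  # (2, 3, 8) -> answer at tokens 2..3
print(best_span([9, 0, 1, 0], [9, 0, 0, 1]))  # (0, 0, 18) -> no answer
```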

## Intended uses & limitations

Potential use cases include:

- Building a simple document Q&A system.
- Enhancing search functionality to provide direct answers.

As with any model trained on a specific dataset, this model's performance is influenced by the characteristics and potential biases of the SQuAD v2 dataset. It may perform differently on text from domains significantly different from Wikipedia articles (the source of SQuAD), and it may inherit biases from the original DistilBERT Base Cased model. Performance also depends heavily on the quality and relevance of the provided context.

## Training and evaluation data

The model was fine-tuned on the SQuAD v2 dataset, which contains over 130,000 question-answer pairs derived from Wikipedia articles. The dataset includes unanswerable questions, requiring the model to determine when no answer exists within the provided text.

For the final reported results, the model was trained on the full SQuAD v2 training set.
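
As a concrete illustration of that format (the field values below are invented for the sketch), SQuAD v2 marks unanswerable questions with empty answer lists, which preprocessing must map to a null target span:

```python
# Illustrative SQuAD v2-style examples (values invented for this sketch):
# unanswerable questions carry empty "text"/"answer_start" lists.
answerable = {
    "question": "Where is the Eiffel Tower?",
    "answers": {"text": ["Paris"], "answer_start": [33]},
}
unanswerable = {
    "question": "Who built the first computer on Mars?",
    "answers": {"text": [], "answer_start": []},
}

def has_answer(example):
    # An example is answerable iff it has at least one gold answer span.
    return len(example["answers"]["text"]) > 0

print(has_answer(answerable), has_answer(unanswerable))  # True False
```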

## Training procedure

The model was fine-tuned using the Hugging Face `transformers` library and its `Trainer` API. The training process involved tokenizing the dataset, preparing input features with start and end positions for answers, and batching with `DataCollatorWithPadding`. Early stopping was used to load the model checkpoint with the lowest validation loss. The full set of training arguments is listed under "Training hyperparameters" below.

### Training hyperparameters

- Base model: distilbert/distilbert-base-cased
- Dataset: SQuAD v2

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- weight_decay: 0.1
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- eval_strategy: epoch
- save_strategy: epoch
- load_best_model_at_end: True
- metric_for_best_model: eval_loss
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.1921        | 1.0   | 32580 | 1.4150          |
| 0.9637        | 2.0   | 65160 | 1.3622*         |
| 0.6474        | 3.0   | 97740 | 1.8661          |

\*Best checkpoint: with early stopping enabled, this is the model loaded at the end of training.

## Evaluation Results

The model was evaluated on the SQuAD v2 validation set. The following metrics were obtained:

| Metric           | Overall | Answerable (HasAns) | Unanswerable (NoAns) |
|------------------|---------|---------------------|----------------------|
| Exact Match (EM) | 64.20   | 60.27               | 67.97                |
| F1 Score         | 66.57   | 65.10               | 67.97                |
| Total Examples   | 2000    | 979                 | 1021                 |

*Note: The metrics for answerable (HasAns) and unanswerable (NoAns) questions provide a more detailed view of the model's performance on each question type in SQuAD v2.*
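
The reported numbers follow the standard SQuAD scoring scheme. A simplified sketch of the two metrics is below; it uses plain whitespace tokenization, whereas the official SQuAD v2 script additionally lowercases and strips articles and punctuation before comparing.

```python
# Simplified SQuAD-style metrics (the official script also normalizes
# articles/punctuation). EM is a strict string match; F1 is token overlap.
from collections import Counter

def exact_match(pred: str, gold: str) -> int:
    return int(pred.strip().lower() == gold.strip().lower())

def f1_score(pred: str, gold: str) -> float:
    pred_toks = pred.lower().split()
    gold_toks = gold.lower().split()
    if not pred_toks or not gold_toks:
        # For unanswerable questions both sides are empty: that counts as 1.0.
        return float(pred_toks == gold_toks)
    common = Counter(pred_toks) & Counter(gold_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Denver Broncos", "denver broncos"))            # 1
print(round(f1_score("the Denver Broncos", "Denver Broncos"), 2)) # 0.8
```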

### Framework versions

- Transformers 4.55.0
- Pytorch 2.6.0+cu124
- Datasets 4.0.0
- Tokenizers 0.21.4