---
tags:
- generated_from_trainer
metrics:
- f1
model-index:
- name: DSPFirst-Finetuning-4
  results: []
---

# Important Note:
`load_best_model_at_end` is not working properly (I specified `metric_for_best_model` in another training run, but it still did not work); the training results nevertheless show a valid trend.

# DSPFirst-Finetuning-4

This model is a fine-tuned version of [ahotrod/electra_large_discriminator_squad2_512](https://huggingface.co/ahotrod/electra_large_discriminator_squad2_512) on a Questions and Answers dataset generated from the DSPFirst textbook, formatted like SQuAD 2.0.<br />
It achieves the following results on the evaluation set:
- Loss: 0.9028
- Exact: 66.9843
- F1: 74.2286

## Detailed metrics:

### Before fine-tuning:

```
"exact": 57.006726457399104,
"f1": 61.997705120754276
```

### After fine-tuning:

```
"exact": 66.98430493273543,
"f1": 74.2285867775556
```
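
For reference, numbers in this format can be computed with the SQuAD v2 metric from the `datasets` library (version 2.1.0 is listed under Framework versions below). The single prediction/reference pair here is a hypothetical placeholder, not an entry from the DSPFirst dataset:

```python
from datasets import load_metric

# Hypothetical single-example call; the real evaluation runs the model's
# predictions over the full DSPFirst test split, in SQuAD 2.0 format.
metric = load_metric("squad_v2")
predictions = [
    {"id": "1", "prediction_text": "a sinusoid", "no_answer_probability": 0.0}
]
references = [
    {"id": "1", "answers": {"text": ["a sinusoid"], "answer_start": [42]}}
]
# The returned dict includes "exact" and "f1", as reported above.
print(metric.compute(predictions=predictions, references=references))
```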

# Dataset
A visualization of the dataset can be found [here](https://github.gatech.edu/pages/VIP-ITS/textbook_SQuAD_explore/explore/textbookv1.0/textbook/).<br />
The split between train and test is 70% and 30%, respectively.
```
DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4160
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 1784
    })
})
```
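
A 70/30 split like this can be produced with the `datasets` library's `train_test_split`; the sketch below is an illustration only (the JSON file path and the seed are assumptions, not taken from the original notebook):

```python
from datasets import load_dataset

# Hypothetical local export of the generated Q&A pairs.
qa_dataset = load_dataset("json", data_files="dspfirst_qa.json", split="train")

# 70% train / 30% test, matching the DatasetDict shown above.
dataset = qa_dataset.train_test_split(test_size=0.3, seed=42)
print(dataset)
```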

## Intended uses & limitations

This model is fine-tuned to answer questions from the DSPFirst textbook. I am not fully confident in my fine-tuning process, so you should review the model before using it.<br />
Also, the dataset could be improved, either by using a **better question and answer generation model** (currently using https://github.com/patil-suraj/question_generation) or by performing **data augmentation** to increase the dataset size.
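
For reference, here is a minimal sketch of querying the model with the `transformers` question-answering pipeline; the hub id, question, and context below are placeholders, not values from this project:

```python
from transformers import pipeline

# "<user>/DSPFirst-Finetuning-4" is a placeholder hub id; point it at the
# actual repository (or a local checkpoint directory).
qa = pipeline("question-answering", model="<user>/DSPFirst-Finetuning-4")

answer = qa(
    question="What is aliasing?",
    context="Aliasing occurs when a signal is sampled below the Nyquist rate, "
            "causing different frequency components to become indistinguishable.",
)
print(answer["answer"], answer["score"])
```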

## Training and evaluation data

- `batch_size` of 6 results in 14.82 GB of VRAM usage
- Uses `gradient_accumulation_steps` to bring the total batch size to 516 (the total batch size should be at least 256)
- 4.52 GB of RAM
- 30% of the total questions are reserved for evaluation

## Training procedure
- The model was trained in [Google Colab](https://colab.research.google.com/drive/1dJXNstk2NSenwzdtl9xA8AqjP4LL-Ks_?usp=sharing)
- Trained on a Tesla P100 (16 GB); training took 6.3 hours
- `load_best_model_at_end` is enabled in `TrainingArguments`

### Training hyperparameters

The following hyperparameters were used during training (consolidated in the sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 86
- total_train_batch_size: 516
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
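
Taken together, these settings correspond roughly to the `TrainingArguments` sketched below; the `output_dir` and the step-based evaluation/save settings are assumptions inferred from the 20-step evaluation interval visible in the results table, not confirmed values:

```python
from transformers import TrainingArguments

# Sketch reconstructed from the hyperparameters above; output_dir and the
# steps-based strategies are assumptions, not taken from the original notebook.
args = TrainingArguments(
    output_dir="DSPFirst-Finetuning-4",  # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    gradient_accumulation_steps=86,      # 6 * 86 = 516 effective batch size
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="steps",
    eval_steps=20,                       # matches the results table below
    save_strategy="steps",
    save_steps=20,
    load_best_model_at_end=True,
    metric_for_best_model="f1",          # see the note at the top of this card
)
```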

### Model hyperparameters

- hidden_dropout_prob: 0.36
- attention_probs_dropout_prob: 0.36
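
One way to apply these overrides is to pass them when loading the base checkpoint; a minimal sketch with `AutoConfig` (this mirrors the values above, though it is not necessarily how the original notebook set them):

```python
from transformers import AutoConfig, AutoModelForQuestionAnswering

# Load the base checkpoint with the dropout values listed above.
config = AutoConfig.from_pretrained(
    "ahotrod/electra_large_discriminator_squad2_512",
    hidden_dropout_prob=0.36,
    attention_probs_dropout_prob=0.36,
)
model = AutoModelForQuestionAnswering.from_pretrained(
    "ahotrod/electra_large_discriminator_squad2_512", config=config
)
```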

### Training results

| Training Loss | Epoch | Step | Validation Loss | Exact   | F1      |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| 2.4411        | 0.81  | 20   | 1.4556          | 62.0516 | 71.1082 |
| 2.2027        | 1.64  | 40   | 1.1508          | 65.0224 | 73.8669 |
| 1.2827        | 2.48  | 60   | 1.0030          | 65.8632 | 74.3959 |
| 1.0925        | 3.32  | 80   | 1.0155          | 66.8722 | 75.2204 |
| 1.03          | 4.16  | 100  | 0.8863          | 66.1996 | 73.8166 |
| 0.9085        | 4.97  | 120  | 0.9675          | 67.9372 | 75.7764 |
| 0.8968        | 5.81  | 140  | 0.8635          | 67.2085 | 74.3725 |
| 0.8867        | 6.64  | 160  | 0.9035          | 65.9753 | 73.4569 |
| 0.8456        | 7.48  | 180  | 0.9098          | 67.2085 | 74.6798 |
| 0.8506        | 8.32  | 200  | 0.8807          | 66.6480 | 74.2903 |
| 0.7972        | 9.16  | 220  | 0.8711          | 66.6480 | 73.5801 |
| 0.7795        | 9.97  | 240  | 0.9028          | 66.9843 | 74.2286 |

### Framework versions

- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.1.0
- Tokenizers 0.12.1