Important Note:

load_best_model_at_end did not work properly in this run (I also specified metric_for_best_model on another training run, and it still did not work), but the training results still show a valid trend.
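
A likely cause is that load_best_model_at_end only restores the best checkpoint when the evaluation and save strategies (and their step intervals) match. A minimal sketch of the relevant TrainingArguments, with illustrative values rather than the exact configuration used here:

from transformers import TrainingArguments

# Illustrative values; the point is the last three arguments.
# load_best_model_at_end silently fails to restore the best checkpoint
# unless the evaluation and save strategies (and intervals) line up.
args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="steps",
    eval_steps=20,
    save_strategy="steps",
    save_steps=20,
    load_best_model_at_end=True,
    metric_for_best_model="f1",  # must name a metric returned by compute_metrics
    greater_is_better=True,
)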

DSPFirst-Finetuning-4

This model is a fine-tuned version of ahotrod/electra_large_discriminator_squad2_512 on a dataset of questions and answers generated from the DSPFirst textbook, following the SQuAD 2.0 format.
It achieves the following results on the evaluation set:

  • Loss: 0.9028
  • Exact: 66.9843
  • F1: 74.2286

Metrics at full precision:

Before fine-tuning:

  "exact": 57.006726457399104,
  "f1": 61.997705120754276

After fine-tuning:

  "exact": 66.98430493273543,
  "f1": 74.2285867775556

Dataset

A visualization of the dataset can be found here.
The split between train and test is 70% and 30% respectively.

DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4160
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 1784
    })
})
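
A minimal sketch of producing such a split with the datasets library, assuming the generated Q&A pairs sit in a local JSON file with one flat record per example (the file name is hypothetical):

from datasets import load_dataset

# Hypothetical file name; the generated Q&A data is not published with this card.
raw = load_dataset("json", data_files="dspfirst_qa.json")["train"]

# 70/30 train/test split, matching the DatasetDict shown above.
dataset = raw.train_test_split(test_size=0.3, seed=42)
print(dataset)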

Intended uses & limitations

This model is fine-tuned to answer questions from the DSPFirst textbook. I am still new to fine-tuning, so you should review the outputs before using the model.
You could also improve the dataset, either by using a better question-and-answer generation model (currently https://github.com/patil-suraj/question_generation) or by performing data augmentation to increase the dataset size.
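
For reference, a minimal usage sketch with the transformers question-answering pipeline; the model id is a placeholder for the actual Hub repo id, and the context is an invented DSP-style passage:

from transformers import pipeline

# Placeholder model id; substitute the actual Hub repo id.
qa = pipeline("question-answering", model="DSPFirst-Finetuning-4")

context = (
    "The sampling theorem states that a bandlimited signal can be "
    "reconstructed exactly from its samples when the sampling rate "
    "exceeds twice the highest frequency present in the signal."
)
print(qa(question="When can a bandlimited signal be reconstructed exactly?",
         context=context))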

Training and evaluation data

  • A batch_size of 6 uses 14.82 GB of VRAM
  • Uses gradient_accumulation_steps to bring the total batch size to 516 (the total batch size should be at least 256); see the sketch after this list
  • 4.52 GB of RAM
  • 30% of the questions are held out for evaluation
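
The total batch size follows directly from the per-device batch size and the accumulation steps:

# Effective (total) train batch size under gradient accumulation:
per_device_train_batch_size = 6
gradient_accumulation_steps = 86
total_train_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 516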

Training procedure

  • The model was trained on Google Colab
  • Trained on a Tesla P100 (16 GB); training took 6.3 hours
  • load_best_model_at_end is enabled in TrainingArguments

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 6
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 86
  • total_train_batch_size: 516
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
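
A sketch reconstructing these values as TrainingArguments; output_dir is an assumption, and the Adam betas/epsilon listed above are the library defaults, so they are not set explicitly:

from transformers import TrainingArguments

# output_dir is an assumption; the rest mirrors the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="DSPFirst-Finetuning-4",
    learning_rate=2e-5,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    seed=42,
    gradient_accumulation_steps=86,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    load_best_model_at_end=True,
)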

Model hyperparameters

  • hidden_dropout_prob: 0.36
  • attention_probs_dropout_prob: 0.36
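
These dropout values can be applied by overriding the base checkpoint's config; a minimal sketch using the standard transformers API:

from transformers import AutoConfig, AutoModelForQuestionAnswering

# Override the two dropout probabilities listed above on the base checkpoint.
config = AutoConfig.from_pretrained(
    "ahotrod/electra_large_discriminator_squad2_512",
    hidden_dropout_prob=0.36,
    attention_probs_dropout_prob=0.36,
)
model = AutoModelForQuestionAnswering.from_pretrained(
    "ahotrod/electra_large_discriminator_squad2_512", config=config
)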

Training results

Training Loss | Epoch | Step | Validation Loss | Exact   | F1
2.4411        | 0.81  | 20   | 1.4556          | 62.0516 | 71.1082
2.2027        | 1.64  | 40   | 1.1508          | 65.0224 | 73.8669
1.2827        | 2.48  | 60   | 1.0030          | 65.8632 | 74.3959
1.0925        | 3.32  | 80   | 1.0155          | 66.8722 | 75.2204
1.0300        | 4.16  | 100  | 0.8863          | 66.1996 | 73.8166
0.9085        | 4.97  | 120  | 0.9675          | 67.9372 | 75.7764
0.8968        | 5.81  | 140  | 0.8635          | 67.2085 | 74.3725
0.8867        | 6.64  | 160  | 0.9035          | 65.9753 | 73.4569
0.8456        | 7.48  | 180  | 0.9098          | 67.2085 | 74.6798
0.8506        | 8.32  | 200  | 0.8807          | 66.6480 | 74.2903
0.7972        | 9.16  | 220  | 0.8711          | 66.6480 | 73.5801
0.7795        | 9.97  | 240  | 0.9028          | 66.9843 | 74.2286

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.10.0+cu111
  • Datasets 2.1.0
  • Tokenizers 0.12.1