---
tags:
  - generated_from_trainer
metrics:
  - f1
model-index:
  - name: DSPFirst-Finetuning-4
    results: []
---

# DSPFirst-Finetuning-4

This model is a fine-tuned version of ahotrod/electra_large_discriminator_squad2_512 on a question-and-answer dataset generated from the DSPFirst textbook, formatted like SQuAD 2.0.
It achieves the following results on the evaluation set:

  • Loss: 1.1113
  • Exact: 63.9013
  • F1: 72.1497

Metrics at full precision:

Before fine-tuning:

  "exact": 56.2219730941704,
  "f1": 61.903777053610895

After fine-tuning:

  "exact": 64.01345291479821,
  "f1": 72.2551864039602
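To put the before/after numbers in perspective, the absolute gains from fine-tuning work out as follows (a quick check in Python; the metric values are copied directly from the card above):

```python
# Metric values copied from the before/after evaluations above.
before = {"exact": 56.2219730941704, "f1": 61.903777053610895}
after = {"exact": 64.01345291479821, "f1": 72.2551864039602}

# Absolute improvement from fine-tuning, in percentage points.
gains = {k: round(after[k] - before[k], 2) for k in before}
print(gains)  # {'exact': 7.79, 'f1': 10.35}
```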

## Dataset

A visualization of the dataset can be found here.
The split between train and test is 70% and 30% respectively.

```
DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4160
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 1784
    })
})
```
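The 70/30 cut above can be sketched as a simple shuffled split (pure Python for illustration; the actual split was likely done with `datasets`' built-in `train_test_split`, and the example indices here are a stand-in for the real QA pairs):

```python
import random

# Toy stand-in for the 5,944 generated QA pairs (4160 train + 1784 test).
examples = list(range(4160 + 1784))

rng = random.Random(42)  # fixed seed so the split is reproducible
rng.shuffle(examples)

# 70/30 train/test split, matching the DatasetDict row counts above.
n_train = int(len(examples) * 0.70)
train, test = examples[:n_train], examples[n_train:]
print(len(train), len(test))  # 4160 1784
```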

## Intended uses & limitations

This model is fine-tuned to answer questions from the DSPFirst textbook. It is an experimental fine-tune, so review its answers before relying on them.
The dataset could also be improved, either with a better question-and-answer generation model (currently using https://github.com/patil-suraj/question_generation) or by performing data augmentation to increase the dataset size.

## Training and evaluation data

  • A batch_size of 6 uses 14.82 GB of VRAM
  • gradient_accumulation_steps brings the effective batch size to 516 (it should be at least 256)
  • 4.52 GB of RAM
  • 30% of the total questions are reserved for evaluation
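Gradient accumulation trades optimizer steps for memory: gradients from several small micro-batches are summed before each weight update, so the update behaves like one large batch. A minimal framework-free sketch of the bookkeeping behind the numbers above:

```python
per_device_batch_size = 6
gradient_accumulation_steps = 86

# Effective batch size seen by each optimizer update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 516

# Each micro-batch loss is scaled down by the accumulation count so the
# summed gradient matches a single large-batch update.
micro_losses = [1.5] * gradient_accumulation_steps
accumulated = sum(loss / gradient_accumulation_steps for loss in micro_losses)
print(round(accumulated, 4))  # 1.5
```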

## Training procedure

  • The model was trained on Google Colab
  • Used a Tesla P100 (16 GB); training took 3.8 hours
  • load_best_model_at_end is enabled in TrainingArguments

### Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 6
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 86
  • total_train_batch_size: 516
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 6
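The linear schedule decays the learning rate from its initial value to zero over the course of training (with an optional warmup ramp at the start). A small sketch of that shape; the total step count here is illustrative, not taken from this run:

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Linear schedule as in lr_scheduler_type='linear': ramp up over
    warmup_steps, then decay linearly from base_lr to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

total = 144  # illustrative number of optimizer steps
print(linear_lr(0, total))      # 2e-05  (starts at the configured learning rate)
print(linear_lr(total, total))  # 0.0    (fully decayed at the end)
```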

### Model hyperparameters

  • hidden_dropout_prob: 0.37
  • attention_probs_dropout_prob: 0.37

### Training results

| Training Loss | Epoch | Step | Validation Loss | Exact   | F1      |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| 2.4306        | 0.81  | 20   | 1.1873          | 58.8004 | 67.0961 |
| 1.8178        | 1.64  | 40   | 1.1572          | 62.8924 | 71.2319 |
| 1.7696        | 2.48  | 60   | 1.0879          | 63.6771 | 71.6848 |
| 1.8313        | 3.32  | 80   | 1.1332          | 63.2848 | 71.8749 |
| 1.5811        | 4.16  | 100  | 1.0473          | 63.4529 | 71.6780 |
| 1.477         | 4.97  | 120  | 1.0720          | 64.1816 | 72.2297 |
| 1.5882        | 5.81  | 140  | 1.1113          | 63.9013 | 72.1497 |
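With load_best_model_at_end enabled, the Trainer restores the checkpoint that scored best on metric_for_best_model (validation loss by default, where lower is better). Applied to the history above (values copied from the table; whether this run overrode the default metric is not stated on this card):

```python
# (step, validation_loss) pairs copied from the training results table.
history = [(20, 1.1873), (40, 1.1572), (60, 1.0879), (80, 1.1332),
           (100, 1.0473), (120, 1.0720), (140, 1.1113)]

# Lowest validation loss wins under the default metric.
best_step, best_loss = min(history, key=lambda entry: entry[1])
print(best_step, best_loss)  # 100 1.0473
```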

### Framework versions

  • Transformers 4.18.0
  • Pytorch 1.10.0+cu111
  • Datasets 2.1.0
  • Tokenizers 0.12.1