---
tags:
- generated_from_trainer
metrics:
- f1
model-index:
- name: DSPFirst-Finetuning-4
  results: []
---
# DSPFirst-Finetuning-4
This model is a fine-tuned version of ahotrod/electra_large_discriminator_squad2_512 on a dataset of questions and answers generated from the DSPFirst textbook, following the SQuAD 2.0 format.
It achieves the following results on the evaluation set:
- Loss: 1.1113
- Exact: 63.9013
- F1: 72.1497
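For a quick qualitative check, the model can be loaded with a question-answering pipeline. A minimal sketch, assuming the model has been published to the Hugging Face Hub; the repo id below is a placeholder (substitute the actual path), and the context passage is illustrative:

```python
from transformers import pipeline

# Placeholder repo id; replace with the actual Hub path of this model.
qa = pipeline(
    "question-answering",
    model="DSPFirst-Finetuning-4",
    tokenizer="DSPFirst-Finetuning-4",
)

# Illustrative DSP-flavored context, not taken from the actual dataset.
result = qa(
    question="What does the sampling theorem state?",
    context=(
        "The sampling theorem states that a bandlimited signal can be "
        "reconstructed exactly from its samples if the sampling rate is "
        "greater than twice the highest frequency in the signal."
    ),
)
print(result["answer"], result["score"])
```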
More precise metrics, before and after fine-tuning:

Before fine-tuning:

```
"exact": 56.2219730941704,
"f1": 61.903777053610895
```

After fine-tuning:

```
"exact": 64.01345291479821,
"f1": 72.2551864039602
```
## Dataset

A visualization of the dataset can be found here.
The split between train and test is 70% and 30%, respectively.
```
DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4160
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 1784
    })
})
```
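A split like this can be produced with the `datasets` library's `train_test_split`. A minimal sketch, assuming the generated Q&A data is loaded from a local SQuAD-format JSON file (the file name is a placeholder):

```python
from datasets import load_dataset

# Placeholder file name; load the generated SQuAD-format Q&A data.
dataset = load_dataset("json", data_files="dspfirst_qa.json", field="data")["train"]

# 70%/30% train/test split; a fixed seed keeps the split reproducible.
splits = dataset.train_test_split(test_size=0.3, seed=42)
print(splits)  # DatasetDict with 'train' and 'test' Datasets
```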
Intended uses & limitations
This model is fine-tuned to answer questions from the DSPFirst textbook. I'm not really sure what I am doing so you should review before using it.
Also, you should improve the Dataset either by using a better generated questions and answers model (currently using https://github.com/patil-suraj/question_generation) or perform data augmentation to increase dataset size.
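For reference, a minimal sketch of generating Q&A pairs with the patil-suraj/question_generation repository, following that repo's README. It must be run from inside a clone of the repo (since `pipelines` is a module there), and the passage below is a placeholder:

```python
# Run from inside a clone of https://github.com/patil-suraj/question_generation
from pipelines import pipeline

# "question-generation" is a custom task defined by that repo, not by transformers.
nlp = pipeline("question-generation")

# Placeholder passage; in practice, feed paragraphs from the DSPFirst textbook.
qa_pairs = nlp(
    "The discrete Fourier transform converts a finite sequence of samples "
    "into a same-length sequence of complex frequency-domain coefficients."
)
print(qa_pairs)  # [{'answer': ..., 'question': ...}, ...]
```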
## Training and evaluation data

- A `batch_size` of 6 results in 14.82 GB of VRAM usage.
- Utilizes `gradient_accumulation_steps` to bring the total batch size to 516 (the total batch size should be at least 256); this uses 4.52 GB of RAM. The arithmetic is sketched after this list.
- 30% of the total questions are set aside for evaluation.
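A worked check of the effective batch size, using the values listed under "Training hyperparameters":

```python
per_device_train_batch_size = 6
gradient_accumulation_steps = 86

# The optimizer only steps after 86 forward/backward passes, so gradients are
# effectively accumulated over 6 * 86 = 516 examples per update.
total_train_batch_size = per_device_train_batch_size * gradient_accumulation_steps
assert total_train_batch_size == 516
```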
## Training procedure

- The model was trained on Google Colab.
- Training used a Tesla P100 (16 GB) GPU and took 3.8 hours.
- `load_best_model_at_end` is enabled in `TrainingArguments`.
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 86
- total_train_batch_size: 516
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
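These settings map directly onto `TrainingArguments`. A minimal sketch, assuming the standard `Trainer` workflow; `output_dir` and the steps-based evaluation/save cadence (every 20 steps, matching the results table below) are assumptions, not confirmed settings:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dspfirst-finetuning",   # placeholder output directory
    learning_rate=2e-5,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    seed=42,
    gradient_accumulation_steps=86,     # 6 * 86 = 516 effective batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=6,
    evaluation_strategy="steps",        # assumption: the results table evaluates every 20 steps
    eval_steps=20,
    save_strategy="steps",
    save_steps=20,
    load_best_model_at_end=True,        # noted under "Training procedure"
)
```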
### Model hyperparameters

- hidden_dropout_prob: 0.37
- attention_probs_dropout_prob: 0.37
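Both dropout values are ELECTRA config fields and can be set when loading the base model. A minimal sketch using `AutoConfig`:

```python
from transformers import AutoConfig, AutoModelForQuestionAnswering

# Raise both dropout probabilities to 0.37 before fine-tuning.
config = AutoConfig.from_pretrained(
    "ahotrod/electra_large_discriminator_squad2_512",
    hidden_dropout_prob=0.37,
    attention_probs_dropout_prob=0.37,
)
model = AutoModelForQuestionAnswering.from_pretrained(
    "ahotrod/electra_large_discriminator_squad2_512",
    config=config,
)
```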
### Training results
| Training Loss | Epoch | Step | Validation Loss | Exact | F1 |
|---|---|---|---|---|---|
| 2.4306 | 0.81 | 20 | 1.1873 | 58.8004 | 67.0961 |
| 1.8178 | 1.64 | 40 | 1.1572 | 62.8924 | 71.2319 |
| 1.7696 | 2.48 | 60 | 1.0879 | 63.6771 | 71.6848 |
| 1.8313 | 3.32 | 80 | 1.1332 | 63.2848 | 71.8749 |
| 1.5811 | 4.16 | 100 | 1.0473 | 63.4529 | 71.6780 |
| 1.477 | 4.97 | 120 | 1.0720 | 64.1816 | 72.2297 |
| 1.5882 | 5.81 | 140 | 1.1113 | 63.9013 | 72.1497 |
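The Exact and F1 columns are the SQuAD 2.0 metrics. A minimal sketch of computing them with the `squad_v2` metric via the era-appropriate `datasets.load_metric` API; the prediction and reference entries below are illustrative placeholders:

```python
from datasets import load_metric

squad_v2_metric = load_metric("squad_v2")

# Placeholder examples in the squad_v2 metric's expected format.
predictions = [
    {"id": "q1", "prediction_text": "the Nyquist rate", "no_answer_probability": 0.0}
]
references = [
    {"id": "q1", "answers": {"text": ["the Nyquist rate"], "answer_start": [42]}}
]

results = squad_v2_metric.compute(predictions=predictions, references=references)
print(results["exact"], results["f1"])  # 100.0 100.0 for this matching pair
```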
### Framework versions
- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.1.0
- Tokenizers 0.12.1