license: mit
base_model: gpt2
tags:
  - generated_from_trainer
model-index:
  - name: mitre-gpt2-base
    results: []

mitre-gpt2-base

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.3404 (final evaluation, step 36000)
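
To make the loss figure easier to interpret, the sketch below converts it to perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in this card. The function name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Assumption: the evaluation loss above is a mean per-token cross-entropy in nats.
final_eval_loss = 3.3404
print(f"perplexity at final evaluation: {perplexity(final_eval_loss):.2f}")
```

Under that assumption, a loss of 3.3404 corresponds to a perplexity of roughly 28.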

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
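
As a rough illustration of what `learning_rate: 5e-05` with `lr_scheduler_type: linear` implies, the sketch below computes the learning rate at a given step under a linear schedule that decays to zero. Two assumptions are not taken from this card: there is no warmup (none is listed), and training runs for about 36,700 steps in total (estimated from step 36000 falling at epoch 98.09 of 100). The helper `linear_lr` is hypothetical and is not the trainer's own code.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05,
              warmup_steps: int = 0) -> float:
    """Linear schedule: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: about 367 steps per epoch, so about 36,700 steps over 100 epochs.
TOTAL_STEPS = 36_700

for step in (0, 18_350, 36_000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 5e-05, is halved at the midpoint, and is close to zero by the last logged step.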

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|---------------|-------|-------|-----------------|
| 3.2933        | 2.72  | 1000  | 2.6028          |
| 2.2619        | 5.45  | 2000  | 2.4654          |
| 1.9152        | 8.17  | 3000  | 2.3952          |
| 1.6434        | 10.9  | 4000  | 2.3729          |
| 1.4289        | 13.62 | 5000  | 2.4208          |
| 1.2627        | 16.35 | 6000  | 2.4845          |
| 1.1301        | 19.07 | 7000  | 2.5619          |
| 1.0169        | 21.8  | 8000  | 2.6058          |
| 0.93          | 24.52 | 9000  | 2.6773          |
| 0.8587        | 27.25 | 10000 | 2.7389          |
| 0.8032        | 29.97 | 11000 | 2.7639          |
| 0.7506        | 32.7  | 12000 | 2.8329          |
| 0.7079        | 35.42 | 13000 | 2.8934          |
| 0.6781        | 38.15 | 14000 | 2.9175          |
| 0.6461        | 40.87 | 15000 | 2.9532          |
| 0.6205        | 43.6  | 16000 | 3.0008          |
| 0.5987        | 46.32 | 17000 | 3.0539          |
| 0.5811        | 49.05 | 18000 | 3.0738          |
| 0.564         | 51.77 | 19000 | 3.0972          |
| 0.5491        | 54.5  | 20000 | 3.1341          |
| 0.5377        | 57.22 | 21000 | 3.1558          |
| 0.5255        | 59.95 | 22000 | 3.1723          |
| 0.516         | 62.67 | 23000 | 3.1984          |
| 0.5077        | 65.4  | 24000 | 3.2163          |
| 0.5021        | 68.12 | 25000 | 3.2396          |
| 0.4946        | 70.84 | 26000 | 3.2413          |
| 0.4871        | 73.57 | 27000 | 3.2708          |
| 0.4845        | 76.29 | 28000 | 3.2833          |
| 0.4791        | 79.02 | 29000 | 3.2847          |
| 0.4739        | 81.74 | 30000 | 3.2950          |
| 0.4704        | 84.47 | 31000 | 3.3124          |
| 0.4678        | 87.19 | 32000 | 3.3122          |
| 0.4642        | 89.92 | 33000 | 3.3260          |
| 0.4617        | 92.64 | 34000 | 3.3326          |
| 0.4605        | 95.37 | 35000 | 3.3325          |
| 0.4576        | 98.09 | 36000 | 3.3404          |
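
The table above shows validation loss reaching its minimum early and then rising while training loss keeps falling. As a small sketch of how to locate that point, the snippet below picks the row with the lowest validation loss from a subset of the logged rows (the first six evaluations and the last one, copied from the table). The variable names are illustrative, and the card does not say which checkpoint the published weights correspond to.

```python
# (step, epoch, validation_loss) rows copied from the training results table.
# Only a subset is listed here: the first six evaluations and the final one.
rows = [
    (1000, 2.72, 2.6028),
    (2000, 5.45, 2.4654),
    (3000, 8.17, 2.3952),
    (4000, 10.9, 2.3729),
    (5000, 13.62, 2.4208),
    (6000, 16.35, 2.4845),
    (36000, 98.09, 3.3404),
]

# Row with the lowest validation loss.
best_step, best_epoch, best_loss = min(rows, key=lambda row: row[2])
final_loss = rows[-1][2]

print(f"lowest validation loss {best_loss} at step {best_step} (epoch {best_epoch})")
print(f"increase by the final evaluation: {final_loss - best_loss:.4f}")
```

In the logged results the minimum is 2.3729 at step 4000 (epoch 10.9), and validation loss is 0.9675 higher by step 36000.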

Framework versions

  • Transformers 4.38.2
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2