lewtun/distilgpt2-finetuned-shakespeare

This model is a fine-tuned version of distilgpt2 on an unspecified dataset (presumably a Shakespeare corpus, given the model name). It achieves the following results at the final training epoch:

  • Train Loss: 2.9411
  • Validation Loss: 3.5767
  • Epoch: 29

Model description

More information needed

Intended uses & limitations

More information needed
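
The card does not document usage, but as a minimal sketch the checkpoint should load like any distilgpt2 model through the standard transformers text-generation API. The prompt and sampling settings below are illustrative assumptions, not from the original card:

```python
from transformers import AutoTokenizer, TFAutoModelForCausalLM

# Load the fine-tuned checkpoint (TensorFlow weights, per the
# framework versions listed below).
model_id = "lewtun/distilgpt2-finetuned-shakespeare"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt and sampling parameters (assumptions).
inputs = tokenizer("Shall I compare thee", return_tensors="tf")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```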

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
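
For reference, the optimizer configuration above corresponds to transformers' TensorFlow AdamWeightDecay implementation. A sketch of how it could be reconstructed (settings copied from the card; the construction itself is an assumption about how the run was set up):

```python
from transformers import AdamWeightDecay

# Optimizer with the hyperparameters listed above. amsgrad=False is
# the default and is therefore omitted.
optimizer = AdamWeightDecay(
    learning_rate=2e-05,
    weight_decay_rate=0.01,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
)
```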

Training results

| Train Loss | Validation Loss | Epoch |
|-----------:|----------------:|------:|
| 4.2112 | 3.8253 | 0 |
| 3.8997 | 3.6898 | 1 |
| 3.7783 | 3.6304 | 2 |
| 3.7046 | 3.5846 | 3 |
| 3.6477 | 3.5667 | 4 |
| 3.6001 | 3.5445 | 5 |
| 3.5563 | 3.5333 | 6 |
| 3.5198 | 3.5240 | 7 |
| 3.4842 | 3.5146 | 8 |
| 3.4505 | 3.5126 | 9 |
| 3.4184 | 3.5022 | 10 |
| 3.3912 | 3.5027 | 11 |
| 3.3613 | 3.5003 | 12 |
| 3.3337 | 3.4985 | 13 |
| 3.3045 | 3.5004 | 14 |
| 3.2772 | 3.5014 | 15 |
| 3.2527 | 3.5018 | 16 |
| 3.2274 | 3.5053 | 17 |
| 3.2011 | 3.5106 | 18 |
| 3.1754 | 3.5143 | 19 |
| 3.1512 | 3.5181 | 20 |
| 3.1259 | 3.5274 | 21 |
| 3.1003 | 3.5215 | 22 |
| 3.0809 | 3.5354 | 23 |
| 3.0568 | 3.5335 | 24 |
| 3.0306 | 3.5502 | 25 |
| 3.0080 | 3.5574 | 26 |
| 2.9857 | 3.5587 | 27 |
| 2.9654 | 3.5760 | 28 |
| 2.9411 | 3.5767 | 29 |
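
Note that validation loss bottoms out at 3.4985 around epoch 13 and drifts upward afterward while train loss keeps falling, which suggests the run overfits past that point. If retraining, an early-stopping callback would capture the best checkpoint; a minimal Keras sketch (the patience value is an assumption, not from the original run):

```python
import tensorflow as tf

# Stop once validation loss stops improving and restore the weights
# from the best epoch. patience=3 is an illustrative assumption.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
    restore_best_weights=True,
)
# model.fit(train_set, validation_data=val_set, epochs=30,
#           callbacks=[early_stopping])
```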

Framework versions

  • Transformers 4.22.2
  • TensorFlow 2.10.0
  • Datasets 2.5.2
  • Tokenizers 0.11.6