metadata
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: pradeep4321/model2
    results: []

pradeep4321/model2

This model is a fine-tuned version of gpt2 on an unknown dataset. At the last recorded epoch (epoch 21, counting from 0), it reached the following losses on the training and evaluation sets:

  • Train Loss: 5.9225
  • Validation Loss: 7.0365
  • Epoch: 21

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: AdamWeightDecay
      • learning_rate: WarmUp schedule
          • initial_learning_rate: 2e-05
          • warmup_steps: 200
          • power: 1.0
          • name: None
          • decay_schedule_fn: PolynomialDecay (passive_serialization: True)
              • initial_learning_rate: 2e-05
              • decay_steps: 800
              • end_learning_rate: 0.0
              • power: 1.0
              • cycle: False
              • name: None
      • decay: 0.0
      • beta_1: 0.9
      • beta_2: 0.999
      • epsilon: 1e-08
      • amsgrad: False
      • weight_decay_rate: 0.01
  • training_precision: mixed_float16
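
The learning-rate schedule in the optimizer config can be hard to picture from the serialized form, so here is a minimal pure-Python sketch of it. It assumes the behaviour of the WarmUp wrapper in Transformers' TensorFlow optimizer code: the rate ramps up as initial_learning_rate * (step / warmup_steps) ** power during warmup, after which the decay schedule is called with the step count measured from the end of warmup. The function names below are illustrative and are not part of any library.

```python
# Values copied from the optimizer config above.
INITIAL_LR = 2e-05
WARMUP_STEPS = 200
DECAY_STEPS = 800
END_LR = 0.0
POWER = 1.0


def polynomial_decay(step: int) -> float:
    """Decay from INITIAL_LR to END_LR over DECAY_STEPS (cycle=False)."""
    # Clamp the step so the rate stays at END_LR once the decay has finished.
    step = min(max(step, 0), DECAY_STEPS)
    remaining = 1.0 - step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


def learning_rate(step: int) -> float:
    """Warm up for WARMUP_STEPS, then hand over to the decay schedule."""
    if step < WARMUP_STEPS:
        return INITIAL_LR * (step / WARMUP_STEPS) ** POWER
    # Assumption: the decay schedule sees steps counted from the end of warmup.
    return polynomial_decay(step - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 100, 200, 600, 1000, 1200):
        print(f"step {step:>4}: lr = {learning_rate(step):.2e}")
```

With power 1.0 both phases are linear: the rate peaks at 2e-05 at step 200 and reaches 0.0 at step 1000 (200 warmup steps plus 800 decay steps). If training ran past that point, later steps would use a learning rate of zero under this sketch.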

Training results

Train Loss    Validation Loss    Epoch
7.2465        7.5143             0
7.1890        7.4800             1
7.1104        7.4496             2
7.0491        7.4218             3
7.0064        7.3964             4
6.9428        7.3732             5
6.9061        7.3505             6
6.8538        7.3301             7
6.7857        7.3108             8
6.7253        7.2893             9
6.6743        7.2693             10
6.5944        7.2491             11
6.5499        7.2288             12
6.4767        7.2084             13
6.4145        7.1887             14
6.3713        7.1664             15
6.2863        7.1450             16
6.2017        7.1229             17
6.1524        7.1017             18
6.0841        7.0788             19
5.9643        7.0540             20
5.9225        7.0365             21
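
To help read the loss table, the following sketch computes how far each loss fell over the run, the final gap between validation and training loss, and the corresponding perplexity. The perplexity step assumes the reported losses are mean per-token cross-entropy in nats, which this card does not state. The helper names are illustrative only.

```python
import math

# (train_loss, validation_loss) for epochs 0..21, copied from the table above.
HISTORY = [
    (7.2465, 7.5143), (7.1890, 7.4800), (7.1104, 7.4496), (7.0491, 7.4218),
    (7.0064, 7.3964), (6.9428, 7.3732), (6.9061, 7.3505), (6.8538, 7.3301),
    (6.7857, 7.3108), (6.7253, 7.2893), (6.6743, 7.2693), (6.5944, 7.2491),
    (6.5499, 7.2288), (6.4767, 7.2084), (6.4145, 7.1887), (6.3713, 7.1664),
    (6.2863, 7.1450), (6.2017, 7.1229), (6.1524, 7.1017), (6.0841, 7.0788),
    (5.9643, 7.0540), (5.9225, 7.0365),
]


def perplexity(loss: float) -> float:
    """Perplexity, assuming `loss` is mean per-token cross-entropy in nats."""
    return math.exp(loss)


def summarize(history: list[tuple[float, float]]) -> dict[str, float]:
    """Compare the first and last epochs of a (train, validation) loss history."""
    first_train, first_val = history[0]
    last_train, last_val = history[-1]
    return {
        "train_drop": first_train - last_train,
        "val_drop": first_val - last_val,
        "final_gap": last_val - last_train,
        "final_val_perplexity": perplexity(last_val),
    }


if __name__ == "__main__":
    for name, value in summarize(HISTORY).items():
        print(f"{name}: {value:.4f}")
```

Over the 22 recorded epochs the training loss falls by about 1.32 while the validation loss falls by about 0.48, so the gap between the two widens as training proceeds. Both losses are still decreasing at the last epoch.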

Framework versions

  • Transformers 4.29.0
  • TensorFlow 2.12.0
  • Datasets 2.12.0
  • Tokenizers 0.13.3