gpt2-13K_NC_V4

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9979

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
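
As a hedged sketch, the hyperparameters above map onto a Hugging Face TrainingArguments object roughly as follows. The output_dir is a placeholder assumption; the remaining values mirror the list, and betas/epsilon are the adamw_torch defaults:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
# output_dir is an assumption, not taken from the card.
training_args = TrainingArguments(
    output_dir="gpt2-13K_NC_V4",       # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",               # AdamW with betas=(0.9, 0.999), epsilon=1e-08 (defaults)
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```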

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 41   | 3.3578          |
| No log        | 2.0   | 82   | 3.0393          |
| 0.8629        | 3.0   | 123  | 2.7994          |
| 0.8629        | 4.0   | 164  | 2.6013          |
| 0.6962        | 5.0   | 205  | 2.4413          |
| 0.6962        | 6.0   | 246  | 2.3348          |
| 0.6962        | 7.0   | 287  | 2.2546          |
| 0.6088        | 8.0   | 328  | 2.2033          |
| 0.6088        | 9.0   | 369  | 2.1726          |
| 0.5725        | 10.0  | 410  | 2.1390          |
| 0.5725        | 11.0  | 451  | 2.1300          |
| 0.5725        | 12.0  | 492  | 2.1048          |
| 0.5556        | 13.0  | 533  | 2.0996          |
| 0.5556        | 14.0  | 574  | 2.0854          |
| 0.5428        | 15.0  | 615  | 2.0780          |
| 0.5428        | 16.0  | 656  | 2.0707          |
| 0.5428        | 17.0  | 697  | 2.0680          |
| 0.534         | 18.0  | 738  | 2.0577          |
| 0.534         | 19.0  | 779  | 2.0511          |
| 0.5294        | 20.0  | 820  | 2.0464          |
| 0.5294        | 21.0  | 861  | 2.0412          |
| 0.5239        | 22.0  | 902  | 2.0414          |
| 0.5239        | 23.0  | 943  | 2.0338          |
| 0.5239        | 24.0  | 984  | 2.0300          |
| 0.5221        | 25.0  | 1025 | 2.0279          |
| 0.5221        | 26.0  | 1066 | 2.0214          |
| 0.5164        | 27.0  | 1107 | 2.0210          |
| 0.5164        | 28.0  | 1148 | 2.0211          |
| 0.5164        | 29.0  | 1189 | 2.0170          |
| 0.5155        | 30.0  | 1230 | 2.0181          |
| 0.5155        | 31.0  | 1271 | 2.0138          |
| 0.5129        | 32.0  | 1312 | 2.0139          |
| 0.5129        | 33.0  | 1353 | 2.0070          |
| 0.5129        | 34.0  | 1394 | 2.0102          |
| 0.5116        | 35.0  | 1435 | 2.0063          |
| 0.5116        | 36.0  | 1476 | 2.0022          |
| 0.5115        | 37.0  | 1517 | 2.0008          |
| 0.5115        | 38.0  | 1558 | 2.0020          |
| 0.5115        | 39.0  | 1599 | 1.9979          |
| 0.5105        | 40.0  | 1640 | 1.9982          |
| 0.5105        | 41.0  | 1681 | 1.9981          |

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1
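
Since the framework versions list PEFT, this model is a PEFT adapter on top of the gpt2 base model. A minimal loading sketch follows, assuming the adapter is hosted at augustocsc/gpt2-13K_NC_V4 (the repo id from the model page); the prompt and generation settings are illustrative only:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the gpt2 base model, then attach the fine-tuned adapter.
base = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base, "augustocsc/gpt2-13K_NC_V4")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Illustrative generation call; prompt and max_new_tokens are placeholders.
inputs = tokenizer("Example prompt", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```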