Se124M10KInfPrompt

This model is a PEFT adapter fine-tuned from gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7128
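
Since this repository contains a PEFT adapter on top of the gpt2 base model (see the framework versions below), a minimal inference sketch could look like the following. The repository id comes from this page; the prompt and generation settings are illustrative assumptions.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the adapter; peft resolves and attaches the gpt2 base model automatically.
model = AutoPeftModelForCausalLM.from_pretrained("augustocsc/Se124M10KInfPrompt")
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # base-model tokenizer

inputs = tokenizer("Example prompt", return_tensors="pt")  # placeholder prompt
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```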

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
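
For reference, these settings map onto transformers' TrainingArguments roughly as shown below; output_dir is a placeholder, fp16=True stands in for "Native AMP", and the per-device batch-size interpretation is an assumption. The PEFT/LoRA configuration itself is not documented on this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Se124M10KInfPrompt",      # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",                  # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,                            # Native AMP mixed precision
)
```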

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.4014        | 1.0   | 267   | 1.0141          |
| 0.2422        | 2.0   | 534   | 0.8523          |
| 0.2202        | 3.0   | 801   | 0.8168          |
| 0.2129        | 4.0   | 1068  | 0.7993          |
| 0.2059        | 5.0   | 1335  | 0.7837          |
| 0.2041        | 6.0   | 1602  | 0.7695          |
| 0.2031        | 7.0   | 1869  | 0.7635          |
| 0.1982        | 8.0   | 2136  | 0.7586          |
| 0.1975        | 9.0   | 2403  | 0.7532          |
| 0.1974        | 10.0  | 2670  | 0.7483          |
| 0.1978        | 11.0  | 2937  | 0.7467          |
| 0.1939        | 12.0  | 3204  | 0.7445          |
| 0.1953        | 13.0  | 3471  | 0.7439          |
| 0.1929        | 14.0  | 3738  | 0.7362          |
| 0.1937        | 15.0  | 4005  | 0.7328          |
| 0.1934        | 16.0  | 4272  | 0.7329          |
| 0.1927        | 17.0  | 4539  | 0.7323          |
| 0.1927        | 18.0  | 4806  | 0.7257          |
| 0.1909        | 19.0  | 5073  | 0.7276          |
| 0.1919        | 20.0  | 5340  | 0.7251          |
| 0.1919        | 21.0  | 5607  | 0.7239          |
| 0.1912        | 22.0  | 5874  | 0.7260          |
| 0.1897        | 23.0  | 6141  | 0.7241          |
| 0.1916        | 24.0  | 6408  | 0.7235          |
| 0.1905        | 25.0  | 6675  | 0.7225          |
| 0.1919        | 26.0  | 6942  | 0.7188          |
| 0.1883        | 27.0  | 7209  | 0.7207          |
| 0.1898        | 28.0  | 7476  | 0.7198          |
| 0.1874        | 29.0  | 7743  | 0.7195          |
| 0.188         | 30.0  | 8010  | 0.7194          |
| 0.1873        | 31.0  | 8277  | 0.7182          |
| 0.1878        | 32.0  | 8544  | 0.7212          |
| 0.1866        | 33.0  | 8811  | 0.7171          |
| 0.1883        | 34.0  | 9078  | 0.7151          |
| 0.1881        | 35.0  | 9345  | 0.7176          |
| 0.1868        | 36.0  | 9612  | 0.7149          |
| 0.1871        | 37.0  | 9879  | 0.7157          |
| 0.1876        | 38.0  | 10146 | 0.7162          |
| 0.188         | 39.0  | 10413 | 0.7142          |
| 0.1861        | 40.0  | 10680 | 0.7149          |
| 0.1862        | 41.0  | 10947 | 0.7144          |
| 0.1862        | 42.0  | 11214 | 0.7128          |
| 0.186         | 43.0  | 11481 | 0.7136          |
| 0.1868        | 44.0  | 11748 | 0.7137          |
| 0.1837        | 45.0  | 12015 | 0.7138          |
| 0.1868        | 46.0  | 12282 | 0.7141          |
| 0.187         | 47.0  | 12549 | 0.7133          |

The best validation loss, 0.7128, was reached at epoch 42 (step 11214) and matches the evaluation loss reported above.

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1