Se124M100KInfSimple

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4582
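
The loss can be read as a perplexity. This is a minimal sketch, assuming the reported value is the mean per-token cross-entropy in nats (the usual convention for causal language model fine-tuning with the Hugging Face Trainer); the card itself does not state the unit.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 0.4582

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss:.4f} -> perplexity {perplexity:.3f}")
```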

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
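
To make the linear scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists no warmup setting) and takes the total step count, 110250, from the last row of the training results. The function name `linear_lr` is hypothetical and only used for illustration.

```python
BASE_LR = 5e-05        # learning_rate from the list above
TOTAL_STEPS = 110250   # final step in the training results table
WARMUP_STEPS = 0       # assumption: the card does not list any warmup


def linear_lr(step: int) -> float:
    """Learning rate under a linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Learning rate at the start, the midpoint (end of epoch 25), and the end.
for step in (0, 55125, 110250):
    print(step, linear_lr(step))
```

Under these assumptions the rate halves by the end of epoch 25 and reaches zero at the final step, which is consistent with the validation loss flattening over the last epochs.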

Training results

Training Loss   Epoch   Step     Validation Loss
0.1445          1.0     2205     0.5442
0.1358          2.0     4410     0.5179
0.1317          3.0     6615     0.5090
0.1297          4.0     8820     0.5015
0.1299          5.0     11025    0.4954
0.1301          6.0     13230    0.4917
0.1258          7.0     15435    0.4875
0.1254          8.0     17640    0.4834
0.1231          9.0     19845    0.4816
0.1254          10.0    22050    0.4798
0.125           11.0    24255    0.4778
0.1225          12.0    26460    0.4775
0.1233          13.0    28665    0.4753
0.1213          14.0    30870    0.4737
0.1231          15.0    33075    0.4719
0.1233          16.0    35280    0.4716
0.1225          17.0    37485    0.4702
0.1218          18.0    39690    0.4696
0.1213          19.0    41895    0.4678
0.1213          20.0    44100    0.4673
0.121           21.0    46305    0.4675
0.122           22.0    48510    0.4663
0.1195          23.0    50715    0.4657
0.1221          24.0    52920    0.4647
0.1212          25.0    55125    0.4647
0.121           26.0    57330    0.4640
0.1213          27.0    59535    0.4637
0.1184          28.0    61740    0.4629
0.12            29.0    63945    0.4627
0.1191          30.0    66150    0.4622
0.1195          31.0    68355    0.4624
0.1188          32.0    70560    0.4619
0.1202          33.0    72765    0.4620
0.119           34.0    74970    0.4605
0.1206          35.0    77175    0.4608
0.1197          36.0    79380    0.4601
0.1199          37.0    81585    0.4597
0.1204          38.0    83790    0.4601
0.1185          39.0    85995    0.4596
0.1184          40.0    88200    0.4591
0.119           41.0    90405    0.4594
0.1181          42.0    92610    0.4591
0.1178          43.0    94815    0.4588
0.1188          44.0    97020    0.4586
0.1189          45.0    99225    0.4584
0.1183          46.0    101430   0.4583
0.1184          47.0    103635   0.4582
0.1185          48.0    105840   0.4581
0.1198          49.0    108045   0.4582
0.1207          50.0    110250   0.4582
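
A few quantities can be derived from the table. The sketch below finds the epoch with the lowest validation loss, the relative improvement from the first to the last epoch, and a rough bound on the training set size. The size estimate assumes a single device, no gradient accumulation, and that a final partial batch is kept; none of these is stated in the card.

```python
# Validation loss per epoch, copied from the table above (epochs 1 to 50).
val_loss = [
    0.5442, 0.5179, 0.5090, 0.5015, 0.4954, 0.4917, 0.4875, 0.4834, 0.4816, 0.4798,
    0.4778, 0.4775, 0.4753, 0.4737, 0.4719, 0.4716, 0.4702, 0.4696, 0.4678, 0.4673,
    0.4675, 0.4663, 0.4657, 0.4647, 0.4647, 0.4640, 0.4637, 0.4629, 0.4627, 0.4622,
    0.4624, 0.4619, 0.4620, 0.4605, 0.4608, 0.4601, 0.4597, 0.4601, 0.4596, 0.4591,
    0.4594, 0.4591, 0.4588, 0.4586, 0.4584, 0.4583, 0.4582, 0.4581, 0.4582, 0.4582,
]

STEPS_PER_EPOCH = 2205   # step count after epoch 1
TRAIN_BATCH_SIZE = 32    # train_batch_size from the hyperparameters

# Epoch (1-indexed) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_loss = val_loss[best_epoch - 1]

# Relative reduction in validation loss between epoch 1 and epoch 50.
rel_improvement = (val_loss[0] - val_loss[-1]) / val_loss[0]

# Assumption: one device, no gradient accumulation, last partial batch kept.
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1

print(f"best epoch: {best_epoch} (validation loss {best_loss:.4f})")
print(f"relative improvement, epoch 1 to 50: {rel_improvement:.1%}")
print(f"training examples per epoch: {min_examples} to {max_examples}")
```

The lowest validation loss in the table (0.4581, epoch 48) is marginally below the 0.4582 reported for the evaluation set, which matches the final epoch.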

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1
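
Since PEFT is among the framework versions, the weights are presumably published as an adapter on top of gpt2 rather than as a full model. The following is a hedged sketch of loading it, not a documented recipe: the repository id `augustocsc/Se124M100KInfSimple` is taken from the model page, the use of the stock gpt2 tokenizer is an assumption, and the helper name `load_adapter` is hypothetical. The expected prompt format is not documented, so no generation example is given.

```python
BASE_ID = "gpt2"                                 # base model named in this card
ADAPTER_ID = "augustocsc/Se124M100KInfSimple"    # assumption: repo id from the model page


def load_adapter(adapter_id: str = ADAPTER_ID, base_id: str = BASE_ID):
    """Load the base model and attach the PEFT adapter (downloads from the Hub)."""
    # Imported lazily so the module can be inspected without these packages.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the adapter was trained with the unmodified gpt2 tokenizer.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_adapter()
    print(type(model).__name__)
```

If the adapter repository ships its own tokenizer files, loading the tokenizer from `ADAPTER_ID` instead of `BASE_ID` would be the safer choice.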