Se124M100KInfMinimalist

This model is a PEFT adapter fine-tuned from gpt2 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5392
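Assuming this is the standard per-token cross-entropy loss reported by the Trainer for causal language modeling, it corresponds to an evaluation perplexity of roughly exp(0.5392) ≈ 1.71.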

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
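
The hyperparameters above map directly onto transformers' TrainingArguments. The snippet below is a minimal reproduction sketch, not the published training script: the output directory, dataset, and per-epoch evaluation setting are assumptions, and any PEFT/LoRA configuration is omitted.

```python
# Minimal sketch of the reported configuration using transformers' Trainer API.
# Dataset and output paths are placeholders; the original script is not published.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

training_args = TrainingArguments(
    output_dir="Se124M100KInfMinimalist",  # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",        # AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,                  # native AMP mixed-precision training
    eval_strategy="epoch",      # assumption: one evaluation per epoch, matching the table below
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=..., eval_dataset=...)
# trainer.train()
```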

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.1691        | 1.0   | 1860  | 0.6314          |
| 0.1598        | 2.0   | 3720  | 0.6036          |
| 0.1539        | 3.0   | 5580  | 0.5906          |
| 0.153         | 4.0   | 7440  | 0.5836          |
| 0.1507        | 5.0   | 9300  | 0.5790          |
| 0.1483        | 6.0   | 11160 | 0.5746          |
| 0.149         | 7.0   | 13020 | 0.5703          |
| 0.1485        | 8.0   | 14880 | 0.5684          |
| 0.1462        | 9.0   | 16740 | 0.5656          |
| 0.1469        | 10.0  | 18600 | 0.5630          |
| 0.1449        | 11.0  | 20460 | 0.5617          |
| 0.1469        | 12.0  | 22320 | 0.5581          |
| 0.1456        | 13.0  | 24180 | 0.5575          |
| 0.1459        | 14.0  | 26040 | 0.5547          |
| 0.1432        | 15.0  | 27900 | 0.5544          |
| 0.1429        | 16.0  | 29760 | 0.5540          |
| 0.1431        | 17.0  | 31620 | 0.5523          |
| 0.1432        | 18.0  | 33480 | 0.5512          |
| 0.1423        | 19.0  | 35340 | 0.5519          |
| 0.1429        | 20.0  | 37200 | 0.5506          |
| 0.1429        | 21.0  | 39060 | 0.5490          |
| 0.1441        | 22.0  | 40920 | 0.5477          |
| 0.1426        | 23.0  | 42780 | 0.5476          |
| 0.1436        | 24.0  | 44640 | 0.5463          |
| 0.1419        | 25.0  | 46500 | 0.5462          |
| 0.1399        | 26.0  | 48360 | 0.5449          |
| 0.1412        | 27.0  | 50220 | 0.5452          |
| 0.14          | 28.0  | 52080 | 0.5440          |
| 0.1396        | 29.0  | 53940 | 0.5440          |
| 0.1402        | 30.0  | 55800 | 0.5440          |
| 0.1404        | 31.0  | 57660 | 0.5437          |
| 0.1415        | 32.0  | 59520 | 0.5427          |
| 0.1406        | 33.0  | 61380 | 0.5420          |
| 0.1387        | 34.0  | 63240 | 0.5422          |
| 0.1392        | 35.0  | 65100 | 0.5420          |
| 0.1404        | 36.0  | 66960 | 0.5420          |
| 0.1436        | 37.0  | 68820 | 0.5411          |
| 0.1424        | 38.0  | 70680 | 0.5415          |
| 0.141         | 39.0  | 72540 | 0.5407          |
| 0.1402        | 40.0  | 74400 | 0.5403          |
| 0.1412        | 41.0  | 76260 | 0.5407          |
| 0.139         | 42.0  | 78120 | 0.5403          |
| 0.1357        | 43.0  | 79980 | 0.5401          |
| 0.1396        | 44.0  | 81840 | 0.5397          |
| 0.1398        | 45.0  | 83700 | 0.5394          |
| 0.1385        | 46.0  | 85560 | 0.5395          |
| 0.1408        | 47.0  | 87420 | 0.5396          |
| 0.1371        | 48.0  | 89280 | 0.5392          |
| 0.1418        | 49.0  | 91140 | 0.5393          |
| 0.1382        | 50.0  | 93000 | 0.5392          |
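
For scale: with train_batch_size 32 and 1,860 optimizer steps per epoch, the training split works out to roughly 32 × 1,860 ≈ 59,520 examples, assuming one optimizer step per batch and no gradient accumulation (none is listed above).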

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1
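
Since the framework versions include PEFT and the repository is published as an adapter, the weights are meant to be loaded on top of the gpt2 base model. A minimal, unverified inference sketch (the prompt and generation settings are placeholders):

```python
# Sketch: load the adapter on top of the gpt2 base model and generate text.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base, "augustocsc/Se124M100KInfMinimalist")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

inputs = tokenizer("Example prompt", return_tensors="pt")  # placeholder prompt
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```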