d431bd02a55bbd85866f1939f367ddca

This model is a fine-tuned version of google-t5/t5-small on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3415
  • Data Size: 1.0
  • Epoch Runtime: 6.6882
  • Bleu: 3.0177
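The BLEU column is small because scores here are reported on a 0–100-style scale and the model is only lightly trained. As a quick illustration of what BLEU measures, here is a minimal sentence-level sketch (clipped n-gram precision plus a brevity penalty); it is not the exact scorer used to produce the number above, and the function names are illustrative:

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU sketch: clipped n-gram precisions (orders 1..max_n),
    geometric mean, and a brevity penalty for short candidates.
    Returns 0.0 if any order has no overlapping n-grams."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        if overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 1.0 on this 0–1 scale; real scorers such as sacrebleu add tokenization, multi-reference support, and smoothing.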

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
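The total batch sizes listed above follow directly from the per-device settings under multi-GPU data parallelism, and a constant scheduler keeps the learning rate fixed for every step. A minimal plain-Python sketch of how these values combine (names here are illustrative, not Trainer API):

```python
hparams = {
    "learning_rate": 5e-5,
    "train_batch_size": 8,   # per device
    "eval_batch_size": 8,    # per device
    "num_devices": 4,
    "lr_scheduler_type": "constant",
    "num_epochs": 50,
}

# Effective (total) batch sizes with 4 GPUs running data parallelism:
total_train_batch_size = hparams["train_batch_size"] * hparams["num_devices"]  # 32
total_eval_batch_size = hparams["eval_batch_size"] * hparams["num_devices"]    # 32

def lr_at_step(step, base_lr=hparams["learning_rate"]):
    """A 'constant' scheduler: the same learning rate at every training step."""
    return base_lr
```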

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 4.6221          | 0         | 1.1473        | 0.1100 |
| No log        | 1     | 31   | 4.6064          | 0.0078    | 1.4740        | 0.1104 |
| No log        | 2     | 62   | 4.5703          | 0.0156    | 1.4328        | 0.1134 |
| No log        | 3     | 93   | 4.5005          | 0.0312    | 1.4964        | 0.1159 |
| No log        | 4     | 124  | 4.4422          | 0.0625    | 1.5981        | 0.0962 |
| No log        | 5     | 155  | 4.2707          | 0.125     | 2.1272        | 0.1219 |
| No log        | 6     | 186  | 4.0733          | 0.25      | 2.6881        | 0.1684 |
| 0.7013        | 7     | 217  | 3.8242          | 0.5       | 3.7759        | 0.2088 |
| 0.7013        | 8     | 248  | 3.5560          | 1.0       | 6.4531        | 0.1758 |
| 2.8141        | 9     | 279  | 3.3816          | 1.0       | 6.2527        | 0.2138 |
| 3.74          | 10    | 310  | 3.2521          | 1.0       | 6.9367        | 0.2372 |
| 3.74          | 11    | 341  | 3.1519          | 1.0       | 6.7399        | 0.9490 |
| 3.5558        | 12    | 372  | 3.0793          | 1.0       | 6.1199        | 1.0051 |
| 3.4043        | 13    | 403  | 3.0140          | 1.0       | 6.2391        | 1.2950 |
| 3.4043        | 14    | 434  | 2.9645          | 1.0       | 7.1239        | 1.2945 |
| 3.3021        | 15    | 465  | 2.9170          | 1.0       | 6.6573        | 1.4390 |
| 3.3021        | 16    | 496  | 2.8817          | 1.0       | 6.3254        | 1.5021 |
| 3.2109        | 17    | 527  | 2.8456          | 1.0       | 6.2254        | 1.5389 |
| 3.1266        | 18    | 558  | 2.8140          | 1.0       | 7.1893        | 1.6168 |
| 3.1266        | 19    | 589  | 2.7836          | 1.0       | 6.9004        | 1.7308 |
| 3.0775        | 20    | 620  | 2.7528          | 1.0       | 6.6448        | 1.7620 |
| 3.0026        | 21    | 651  | 2.7238          | 1.0       | 6.4260        | 1.8896 |
| 3.0026        | 22    | 682  | 2.7012          | 1.0       | 5.1268        | 2.0290 |
| 2.9563        | 23    | 713  | 2.6767          | 1.0       | 5.4232        | 2.1725 |
| 2.9563        | 24    | 744  | 2.6560          | 1.0       | 5.7317        | 2.2300 |
| 2.891         | 25    | 775  | 2.6368          | 1.0       | 5.3370        | 2.2402 |
| 2.8587        | 26    | 806  | 2.6135          | 1.0       | 5.3657        | 2.2324 |
| 2.8587        | 27    | 837  | 2.5934          | 1.0       | 5.2370        | 2.3208 |
| 2.8015        | 28    | 868  | 2.5734          | 1.0       | 5.3508        | 2.3066 |
| 2.8015        | 29    | 899  | 2.5603          | 1.0       | 6.2737        | 2.4315 |
| 2.7561        | 30    | 930  | 2.5431          | 1.0       | 6.4207        | 2.4338 |
| 2.7188        | 31    | 961  | 2.5317          | 1.0       | 6.1300        | 2.4899 |
| 2.7188        | 32    | 992  | 2.5151          | 1.0       | 5.5575        | 2.5777 |
| 2.6827        | 33    | 1023 | 2.5060          | 1.0       | 5.7487        | 2.5721 |
| 2.6416        | 34    | 1054 | 2.4929          | 1.0       | 5.5721        | 2.6315 |
| 2.6416        | 35    | 1085 | 2.4814          | 1.0       | 5.7915        | 2.6630 |
| 2.6138        | 36    | 1116 | 2.4674          | 1.0       | 6.1883        | 2.7386 |
| 2.6138        | 37    | 1147 | 2.4581          | 1.0       | 5.3096        | 2.6823 |
| 2.5766        | 38    | 1178 | 2.4477          | 1.0       | 5.3723        | 2.7470 |
| 2.5413        | 39    | 1209 | 2.4335          | 1.0       | 5.3362        | 2.7325 |
| 2.5413        | 40    | 1240 | 2.4254          | 1.0       | 6.3199        | 2.8268 |
| 2.5103        | 41    | 1271 | 2.4123          | 1.0       | 6.5643        | 2.8315 |
| 2.4832        | 42    | 1302 | 2.4081          | 1.0       | 6.0125        | 2.9006 |
| 2.4832        | 43    | 1333 | 2.3951          | 1.0       | 6.0619        | 2.9568 |
| 2.4585        | 44    | 1364 | 2.3889          | 1.0       | 5.9724        | 3.0480 |
| 2.4585        | 45    | 1395 | 2.3860          | 1.0       | 6.7165        | 3.0197 |
| 2.4241        | 46    | 1426 | 2.3754          | 1.0       | 6.4182        | 3.0169 |
| 2.3965        | 47    | 1457 | 2.3647          | 1.0       | 6.5957        | 3.0111 |
| 2.3965        | 48    | 1488 | 2.3574          | 1.0       | 5.8117        | 2.9843 |
| 2.3654        | 49    | 1519 | 2.3501          | 1.0       | 5.4900        | 3.0144 |
| 2.3593        | 50    | 1550 | 2.3415          | 1.0       | 6.6882        | 3.0177 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Safetensors

  • Model size: 0.1B params
  • Tensor type: F32

Model tree for contemmcm/d431bd02a55bbd85866f1939f367ddca

  • Base model: google-t5/t5-small