f5d3ae17b1930dbf87f1b7283f9ed30c

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2291
  • Data Size: 1.0
  • Epoch Runtime: 15.0556
  • Bleu: 14.7465

Model description

More information needed

Intended uses & limitations

More information needed
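
Although the card does not document intended uses, the base model and dataset suggest book-domain translation. The sketch below shows one plausible way to run inference with the Transformers `pipeline` API; the language direction in the prompt is an assumption, since opus_books contains many language pairs and the card does not state which one was used for fine-tuning.

```python
# Illustrative only: the "translate English to French" prefix is an
# assumption -- the card does not say which opus_books language pair
# this checkpoint was trained on. Downloads the ~0.3B-parameter model.
from transformers import pipeline

translator = pipeline(
    "translation",
    model="contemmcm/f5d3ae17b1930dbf87f1b7283f9ed30c",
)

result = translator("translate English to French: The book is on the table.")
print(result[0]["translation_text"])
```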

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
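
The hyperparameters above can be expressed as a `Seq2SeqTrainingArguments` configuration. This is a sketch, not the author's actual training script; the `output_dir` is a placeholder, and the multi-GPU total batch size of 32 comes from running the per-device size of 8 on 4 GPUs (e.g. via `torchrun --nproc_per_node=4`, an assumed launch setup).

```python
# Config fragment mirroring the reported hyperparameters (sketch).
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books",   # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,     # x4 GPUs = total 32
    per_device_eval_batch_size=8,      # x4 GPUs = total 32
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```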

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:----:|
| No log | 0 | 0 | 2.9686 | 0 | 1.7897 | 1.7611 |
| No log | 1 | 35 | 2.9335 | 0.0078 | 2.3796 | 1.8142 |
| No log | 2 | 70 | 2.8721 | 0.0156 | 2.0602 | 1.9906 |
| No log | 3 | 105 | 2.7471 | 0.0312 | 2.8569 | 2.2344 |
| No log | 4 | 140 | 2.6188 | 0.0625 | 3.2800 | 2.4158 |
| No log | 5 | 175 | 2.4610 | 0.125 | 4.1849 | 2.7923 |
| No log | 6 | 210 | 2.2644 | 0.25 | 5.1989 | 3.8902 |
| No log | 7 | 245 | 2.0473 | 0.5 | 8.8507 | 4.9557 |
| 0.4917 | 8.0 | 280 | 1.8653 | 1.0 | 14.1202 | 6.9420 |
| 2.2497 | 9.0 | 315 | 1.7499 | 1.0 | 13.9608 | 7.6192 |
| 2.0342 | 10.0 | 350 | 1.6686 | 1.0 | 12.5722 | 8.9947 |
| 2.0342 | 11.0 | 385 | 1.5980 | 1.0 | 13.6303 | 9.2455 |
| 1.8555 | 12.0 | 420 | 1.5371 | 1.0 | 13.5858 | 10.1462 |
| 1.7107 | 13.0 | 455 | 1.4986 | 1.0 | 10.9217 | 10.5372 |
| 1.7107 | 14.0 | 490 | 1.4559 | 1.0 | 10.9373 | 9.8687 |
| 1.5995 | 15.0 | 525 | 1.4161 | 1.0 | 10.3431 | 10.7604 |
| 1.53 | 16.0 | 560 | 1.3893 | 1.0 | 10.7269 | 11.3155 |
| 1.53 | 17.0 | 595 | 1.3668 | 1.0 | 12.1763 | 11.5311 |
| 1.4222 | 18.0 | 630 | 1.3408 | 1.0 | 11.1285 | 11.4685 |
| 1.3365 | 19.0 | 665 | 1.3257 | 1.0 | 13.1371 | 12.1346 |
| 1.283 | 20.0 | 700 | 1.3018 | 1.0 | 13.3854 | 12.1569 |
| 1.283 | 21.0 | 735 | 1.2920 | 1.0 | 14.0139 | 12.5417 |
| 1.2155 | 22.0 | 770 | 1.2809 | 1.0 | 13.5094 | 12.6887 |
| 1.1635 | 23.0 | 805 | 1.2769 | 1.0 | 11.9000 | 13.0757 |
| 1.1635 | 24.0 | 840 | 1.2598 | 1.0 | 14.3637 | 13.3410 |
| 1.0967 | 25.0 | 875 | 1.2532 | 1.0 | 14.3873 | 13.5607 |
| 1.0632 | 26.0 | 910 | 1.2479 | 1.0 | 14.0800 | 13.8649 |
| 1.0632 | 27.0 | 945 | 1.2404 | 1.0 | 14.7500 | 13.7716 |
| 1.0084 | 28.0 | 980 | 1.2475 | 1.0 | 13.1077 | 13.9105 |
| 0.9661 | 29.0 | 1015 | 1.2323 | 1.0 | 15.0026 | 13.9146 |
| 0.9381 | 30.0 | 1050 | 1.2244 | 1.0 | 14.5107 | 14.0155 |
| 0.9381 | 31.0 | 1085 | 1.2323 | 1.0 | 14.6993 | 14.1804 |
| 0.8814 | 32.0 | 1120 | 1.2260 | 1.0 | 13.6416 | 14.1333 |
| 0.8562 | 33.0 | 1155 | 1.2231 | 1.0 | 13.8906 | 14.8269 |
| 0.8562 | 34.0 | 1190 | 1.2253 | 1.0 | 15.0576 | 14.2483 |
| 0.8088 | 35.0 | 1225 | 1.2243 | 1.0 | 13.6468 | 14.5773 |
| 0.7953 | 36.0 | 1260 | 1.2268 | 1.0 | 13.1516 | 14.5130 |
| 0.7953 | 37.0 | 1295 | 1.2291 | 1.0 | 15.0556 | 14.7465 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size: 0.3B parameters (Safetensors, F32 tensors)

Model tree for contemmcm/f5d3ae17b1930dbf87f1b7283f9ed30c: finetuned from google-t5/t5-base.