a1a6aff4ce2d90f17a750fdcaf097651

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [de-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9747
  • Data Size: 1.0
  • Epoch Runtime: 10.4799
  • Bleu: 5.6590
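The Bleu figure above is presumably a sacrebleu-style score on a 0–100 scale, as emitted by the training framework. For readers unfamiliar with the metric, a minimal unsmoothed corpus BLEU (on a 0–1 scale) can be sketched as follows — this is an illustration of the metric, not the implementation used to produce the number above:

```python
import math
from collections import Counter

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with uniform n-gram weights and no smoothing.

    Illustrative only: real evaluations typically use sacrebleu, which adds
    standardized tokenization and reports scores scaled to 0-100.
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # candidate n-gram counts, per order
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens = hyp.split()
        ref_tokens = ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_ngrams = Counter(tuple(hyp_tokens[i:i + n])
                                 for i in range(len(hyp_tokens) - n + 1))
            ref_ngrams = Counter(tuple(ref_tokens[i:i + n])
                                 for i in range(len(ref_tokens) - n + 1))
            # Counter intersection gives clipped counts (precision numerator).
            matches[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            totals[n - 1] += max(len(hyp_tokens) - n + 1, 0)
    if min(matches) == 0:
        return 0.0  # any zero precision makes the geometric mean zero
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_prec)

# A perfect match scores 1.0; a fully disjoint pair scores 0.0.
print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on the mat"]))  # 1.0
```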

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
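The total batch sizes above are derived values rather than independent settings: the per-device batch size multiplied by the number of GPUs. A quick sanity check:

```python
# Effective batch sizes implied by the hyperparameters above
# (per-device batch size x number of devices under multi-GPU training).
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```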

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:----:|
| No log | 0 | 0 | 3.5795 | 0 | 1.5110 | 0.8069 |
| No log | 1 | 27 | 3.5288 | 0.0078 | 2.0713 | 0.8193 |
| No log | 2 | 54 | 3.4226 | 0.0156 | 1.8171 | 0.8765 |
| No log | 3 | 81 | 3.3137 | 0.0312 | 2.1961 | 0.8726 |
| No log | 4 | 108 | 3.2241 | 0.0625 | 2.6187 | 0.9015 |
| No log | 5 | 135 | 3.0794 | 0.125 | 3.5629 | 0.9034 |
| No log | 6 | 162 | 2.8764 | 0.25 | 4.5319 | 1.6239 |
| No log | 7 | 189 | 2.6789 | 0.5 | 7.1745 | 1.8214 |
| 0.6673 | 8.0 | 216 | 2.5209 | 1.0 | 11.8647 | 2.1250 |
| 0.6673 | 9.0 | 243 | 2.4311 | 1.0 | 12.2110 | 2.9570 |
| 2.7346 | 10.0 | 270 | 2.3574 | 1.0 | 11.9172 | 3.0805 |
| 2.7346 | 11.0 | 297 | 2.3070 | 1.0 | 13.7312 | 3.3673 |
| 2.5374 | 12.0 | 324 | 2.2623 | 1.0 | 12.1588 | 3.6163 |
| 2.3816 | 13.0 | 351 | 2.2264 | 1.0 | 12.0713 | 3.4099 |
| 2.3816 | 14.0 | 378 | 2.1950 | 1.0 | 10.7154 | 3.7960 |
| 2.2706 | 15.0 | 405 | 2.1645 | 1.0 | 9.9418 | 4.1388 |
| 2.2706 | 16.0 | 432 | 2.1396 | 1.0 | 11.5808 | 4.2158 |
| 2.1735 | 17.0 | 459 | 2.1212 | 1.0 | 12.5215 | 4.3524 |
| 2.1735 | 18.0 | 486 | 2.1062 | 1.0 | 12.6577 | 4.3225 |
| 2.0687 | 19.0 | 513 | 2.0859 | 1.0 | 13.0955 | 4.5394 |
| 2.0687 | 20.0 | 540 | 2.0751 | 1.0 | 14.9231 | 4.4761 |
| 1.9922 | 21.0 | 567 | 2.0550 | 1.0 | 11.4518 | 4.8162 |
| 1.9922 | 22.0 | 594 | 2.0431 | 1.0 | 13.6035 | 4.8632 |
| 1.9275 | 23.0 | 621 | 2.0416 | 1.0 | 11.6460 | 4.6598 |
| 1.9275 | 24.0 | 648 | 2.0323 | 1.0 | 12.5673 | 4.7531 |
| 1.8617 | 25.0 | 675 | 2.0227 | 1.0 | 9.9415 | 4.9116 |
| 1.8078 | 26.0 | 702 | 2.0161 | 1.0 | 9.5311 | 4.7579 |
| 1.8078 | 27.0 | 729 | 2.0159 | 1.0 | 11.9414 | 4.7400 |
| 1.7452 | 28.0 | 756 | 1.9923 | 1.0 | 11.1788 | 5.1620 |
| 1.7452 | 29.0 | 783 | 1.9901 | 1.0 | 11.7848 | 5.1908 |
| 1.6985 | 30.0 | 810 | 1.9933 | 1.0 | 11.6818 | 5.2161 |
| 1.6985 | 31.0 | 837 | 1.9735 | 1.0 | 11.0877 | 5.2446 |
| 1.6343 | 32.0 | 864 | 1.9759 | 1.0 | 12.3549 | 5.2742 |
| 1.6343 | 33.0 | 891 | 1.9756 | 1.0 | 13.1103 | 5.0968 |
| 1.5958 | 34.0 | 918 | 1.9697 | 1.0 | 11.9184 | 5.3222 |
| 1.5958 | 35.0 | 945 | 1.9672 | 1.0 | 12.3234 | 5.3265 |
| 1.5497 | 36.0 | 972 | 1.9727 | 1.0 | 12.7994 | 5.2308 |
| 1.5497 | 37.0 | 999 | 1.9713 | 1.0 | 10.8372 | 5.2037 |
| 1.5086 | 38.0 | 1026 | 1.9620 | 1.0 | 11.8187 | 5.2829 |
| 1.4591 | 39.0 | 1053 | 1.9762 | 1.0 | 12.3488 | 5.2213 |
| 1.4591 | 40.0 | 1080 | 1.9759 | 1.0 | 12.8681 | 5.4821 |
| 1.4181 | 41.0 | 1107 | 1.9682 | 1.0 | 11.7268 | 5.5189 |
| 1.4181 | 42.0 | 1134 | 1.9747 | 1.0 | 10.4799 | 5.6590 |
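The Data Size column in the table above suggests a data ramp-up: the fraction of the training set doubles each epoch (1/128, 1/64, …) until the full dataset is reached at epoch 8. Assuming that pattern (inferred from the table, not documented in the card), the schedule can be sketched as:

```python
def data_fraction(epoch):
    """Fraction of the training set used at a given epoch (epoch >= 1).

    Inferred from the Data Size column: the fraction doubles each epoch,
    starting at 2**-7 (about 0.0078) and capped at 1.0 from epoch 8 on.
    """
    return min(1.0, 2.0 ** (epoch - 8))

# Rounded to 4 places this reproduces the table's Data Size column:
# 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0, 1.0, ...
print([round(data_fraction(e), 4) for e in range(1, 10)])
```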

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size

  • 0.3B parameters
  • Tensor type: F32
  • Format: Safetensors