eb96ab762c7ec109e9a637d60f7f8908

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [it-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5571
  • Data Size: 1.0
  • Epoch Runtime: 12.1514 s
  • BLEU: 7.4817
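
The card includes no usage example, so the following is a minimal inference sketch, not an official snippet. It assumes the checkpoint is published under the repo id contemmcm/eb96ab762c7ec109e9a637d60f7f8908 (from the model tree) and that fine-tuning fed plain Italian source sentences to the encoder with no task prefix; if the training script used a source prefix, prepend it to the input.

```python
# Minimal inference sketch (assumptions: repo id and no source prefix).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/eb96ab762c7ec109e9a637d60f7f8908"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Italian -> Portuguese, matching the opus_books [it-pt] fine-tuning data.
inputs = tokenizer("Il gatto dorme sul divano.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```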

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
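
For reference, the list above maps roughly onto transformers' Seq2SeqTrainingArguments as sketched below. The output_dir is a placeholder and predict_with_generate is an assumption (it is typically required to compute BLEU during evaluation); the totals of 32 follow from the per-device batch size of 8 across 4 GPUs.

```python
# Hypothetical reconstruction of the hyperparameters listed above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-it-pt",  # placeholder, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,   # x 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",            # AdamW with betas=(0.9, 0.999), epsilon=1e-08
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,     # assumption: needed for BLEU during eval
)
```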

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:------:|
| No log        | 0     | 0    | 11.6380         | 0         | 1.3179            | 0.6688 |
| No log        | 1     | 29   | 11.6224         | 0.0078    | 1.6882            | 0.7427 |
| No log        | 2     | 58   | 11.5177         | 0.0156    | 2.2360            | 0.6516 |
| No log        | 3     | 87   | 11.7234         | 0.0312    | 2.8411            | 0.7009 |
| No log        | 4     | 116  | 11.3968         | 0.0625    | 3.4966            | 0.7370 |
| No log        | 5     | 145  | 11.3497         | 0.125     | 3.9657            | 0.4774 |
| 1.7156        | 6     | 174  | 10.9542         | 0.25      | 5.5267            | 0.6996 |
| 1.7156        | 7     | 203  | 11.0380         | 0.5       | 7.5773            | 0.6731 |
| 1.7156        | 8     | 232  | 9.1387          | 1.0       | 11.6909           | 0.9107 |
| 9.7807        | 9     | 261  | 8.0497          | 1.0       | 11.8910           | 0.8087 |
| 9.7807        | 10    | 290  | 7.0794          | 1.0       | 13.0532           | 0.4845 |
| 10.6168       | 11    | 319  | 6.2391          | 1.0       | 10.1422           | 0.4629 |
| 10.6168       | 12    | 348  | 4.6299          | 1.0       | 10.3170           | 1.6175 |
| 7.3694        | 13    | 377  | 3.9073          | 1.0       | 10.8678           | 3.5145 |
| 5.0901        | 14    | 406  | 3.5140          | 1.0       | 12.0403           | 6.2229 |
| 5.0901        | 15    | 435  | 3.2423          | 1.0       | 12.1298           | 6.7949 |
| 4.3639        | 16    | 464  | 3.0647          | 1.0       | 12.8197           | 4.6863 |
| 4.3639        | 17    | 493  | 2.9499          | 1.0       | 13.1804           | 5.1550 |
| 3.9547        | 18    | 522  | 2.8692          | 1.0       | 9.7197            | 5.5632 |
| 3.6322        | 19    | 551  | 2.8151          | 1.0       | 9.4106            | 5.8590 |
| 3.6322        | 20    | 580  | 2.7593          | 1.0       | 10.5028           | 6.1529 |
| 3.3638        | 21    | 609  | 2.7053          | 1.0       | 10.6758           | 6.3736 |
| 3.3638        | 22    | 638  | 2.6837          | 1.0       | 11.2459           | 6.4486 |
| 3.1785        | 23    | 667  | 2.6556          | 1.0       | 11.2236           | 6.4821 |
| 3.1785        | 24    | 696  | 2.6270          | 1.0       | 11.6886           | 6.6401 |
| 3.0074        | 25    | 725  | 2.6063          | 1.0       | 11.7180           | 6.7628 |
| 2.8686        | 26    | 754  | 2.5905          | 1.0       | 9.8356            | 6.8415 |
| 2.8686        | 27    | 783  | 2.5792          | 1.0       | 10.1183           | 6.9304 |
| 2.7418        | 28    | 812  | 2.5736          | 1.0       | 10.2673           | 7.0236 |
| 2.7418        | 29    | 841  | 2.5536          | 1.0       | 10.7104           | 7.0489 |
| 2.6142        | 30    | 870  | 2.5487          | 1.0       | 11.2792           | 7.1639 |
| 2.6142        | 31    | 899  | 2.5396          | 1.0       | 11.2350           | 7.1134 |
| 2.5155        | 32    | 928  | 2.5422          | 1.0       | 11.5853           | 7.2230 |
| 2.4294        | 33    | 957  | 2.5336          | 1.0       | 12.6890           | 7.2554 |
| 2.4294        | 34    | 986  | 2.5357          | 1.0       | 9.4938            | 7.3031 |
| 2.332         | 35    | 1015 | 2.5287          | 1.0       | 9.4530            | 7.3458 |
| 2.332         | 36    | 1044 | 2.5396          | 1.0       | 9.9192            | 7.4489 |
| 2.2852        | 37    | 1073 | 2.5445          | 1.0       | 10.7865           | 7.6072 |
| 2.1983        | 38    | 1102 | 2.5372          | 1.0       | 11.6134           | 7.4945 |
| 2.1983        | 39    | 1131 | 2.5571          | 1.0       | 12.1514           | 7.4817 |
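
The BLEU column is presumably corpus-level BLEU on sacrebleu's 0-100 scale, given the magnitudes above. Below is a minimal sketch of how such scores are typically computed with the evaluate library; the sentences are illustrative only.

```python
# Sketch: corpus BLEU via sacrebleu, as commonly used in translation fine-tuning scripts.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["O gato dorme no sofá."]           # decoded model outputs (illustrative)
references = [["O gato está dormindo no sofá."]]  # one list of references per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # reported on a 0-100 scale
```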

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
