0ad8c2ecf6815815030b4e207bb2af1b

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fr-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3568
  • Data Size: 1.0
  • Epoch Runtime: 12.8602
  • Bleu: 7.8708
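
A minimal inference sketch (assumptions: the checkpoint is published under the repo id shown in this card, and fine-tuning used plain French source sentences without a task prefix; if a prefix such as "translate French to Portuguese: " was used in preprocessing, prepend it to the input):

```python
# Minimal sketch; the repo id and the absence of a task prefix are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/0ad8c2ecf6815815030b4e207bb2af1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# French -> Portuguese translation
inputs = tokenizer("Le chat dort sur le canapé.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```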

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
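
The card leaves this section blank, but the header names the corpus. A minimal loading sketch (the "fr-pt" config name is taken from the header; opus_books ships only a train split, so the train/validation division used for this card is unknown):

```python
# Sketch for loading the named dataset; the evaluation split used for this
# card is not documented, so any train/validation split would be an assumption.
from datasets import load_dataset

ds = load_dataset("Helsinki-NLP/opus_books", "fr-pt")
print(ds["train"][0]["translation"])  # e.g. {'fr': '...', 'pt': '...'}
```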

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
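
For reference, a sketch of how these hyperparameters map onto transformers Seq2SeqTrainingArguments; the output directory and the eval/generation settings are illustrative assumptions, everything else mirrors the list above:

```python
# Sketch mapping the listed hyperparameters onto Seq2SeqTrainingArguments.
# output_dir and the eval settings are illustrative assumptions; the device
# count (4 GPUs) comes from the launcher (e.g. torchrun/accelerate), not from
# these arguments, and yields the total batch size of 8 x 4 = 32.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fr-pt",  # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="epoch",           # assumption
    predict_with_generate=True,      # assumption (needed for BLEU during eval)
)
```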

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 11.7090         | 0         | 1.3744        | 0.1954 |
| No log        | 1     | 31   | 11.8378         | 0.0078    | 1.5109        | 0.1584 |
| No log        | 2     | 62   | 11.5957         | 0.0156    | 2.1794        | 0.2111 |
| No log        | 3     | 93   | 11.5673         | 0.0312    | 2.8158        | 0.1973 |
| No log        | 4     | 124  | 11.5255         | 0.0625    | 3.8381        | 0.1563 |
| No log        | 5     | 155  | 11.1080         | 0.125     | 4.1333        | 0.1654 |
| No log        | 6     | 186  | 10.3154         | 0.25      | 5.2091        | 0.2149 |
| 2.5995        | 7     | 217  | 9.4188          | 0.5       | 7.4415        | 0.1732 |
| 2.5995        | 8.0   | 248  | 8.2699          | 1.0       | 11.7606       | 0.2447 |
| 9.2674        | 9.0   | 279  | 7.8717          | 1.0       | 11.9712       | 0.3183 |
| 10.3468       | 10.0  | 310  | 6.3612          | 1.0       | 12.8705       | 0.4939 |
| 10.3468       | 11.0  | 341  | 5.1569          | 1.0       | 9.7600        | 1.3232 |
| 7.6294        | 12.0  | 372  | 4.0364          | 1.0       | 10.9309       | 3.9073 |
| 5.7404        | 13.0  | 403  | 3.5976          | 1.0       | 10.8226       | 6.0704 |
| 5.7404        | 14.0  | 434  | 3.2598          | 1.0       | 11.1467       | 7.2694 |
| 4.7752        | 15.0  | 465  | 2.9971          | 1.0       | 12.2766       | 4.2953 |
| 4.7752        | 16.0  | 496  | 2.8488          | 1.0       | 12.6613       | 4.7700 |
| 4.213         | 17.0  | 527  | 2.7522          | 1.0       | 13.0987       | 5.1268 |
| 3.7812        | 18.0  | 558  | 2.6686          | 1.0       | 10.4024       | 5.3641 |
| 3.7812        | 19.0  | 589  | 2.6141          | 1.0       | 10.4839       | 5.6942 |
| 3.4767        | 20.0  | 620  | 2.5515          | 1.0       | 10.5228       | 6.3556 |
| 3.2547        | 21.0  | 651  | 2.5141          | 1.0       | 11.6514       | 6.3945 |
| 3.2547        | 22.0  | 682  | 2.4885          | 1.0       | 12.0017       | 6.6382 |
| 3.0838        | 23.0  | 713  | 2.4586          | 1.0       | 11.9956       | 6.8956 |
| 3.0838        | 24.0  | 744  | 2.4497          | 1.0       | 12.6230       | 6.8351 |
| 2.8882        | 25.0  | 775  | 2.4209          | 1.0       | 10.2937       | 6.8204 |
| 2.7929        | 26.0  | 806  | 2.4110          | 1.0       | 10.1908       | 6.9615 |
| 2.7929        | 27.0  | 837  | 2.4048          | 1.0       | 10.9968       | 7.0479 |
| 2.6492        | 28.0  | 868  | 2.3790          | 1.0       | 11.5217       | 7.0460 |
| 2.6492        | 29.0  | 899  | 2.3759          | 1.0       | 11.4738       | 7.1835 |
| 2.5644        | 30.0  | 930  | 2.3645          | 1.0       | 12.3540       | 7.2969 |
| 2.481         | 31.0  | 961  | 2.3603          | 1.0       | 12.4761       | 7.3014 |
| 2.481         | 32.0  | 992  | 2.3567          | 1.0       | 9.9603        | 7.4073 |
| 2.3755        | 33.0  | 1023 | 2.3446          | 1.0       | 10.7345       | 7.4959 |
| 2.2968        | 34.0  | 1054 | 2.3546          | 1.0       | 11.8768       | 7.4700 |
| 2.2968        | 35.0  | 1085 | 2.3649          | 1.0       | 12.0682       | 7.5651 |
| 2.2337        | 36.0  | 1116 | 2.3586          | 1.0       | 12.8577       | 7.7334 |
| 2.2337        | 37.0  | 1147 | 2.3568          | 1.0       | 12.8602       | 7.8708 |
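
The Bleu column is presumably corpus-level BLEU on the evaluation set; a sketch of how such scores can be computed with the evaluate library (that the numbers above came from exactly this metric configuration is an assumption):

```python
# Sketch of BLEU scoring with `evaluate`/sacreBLEU; the card's exact metric
# configuration is undocumented, so this is an assumption.
import evaluate

bleu = evaluate.load("sacrebleu")
preds = ["O gato dorme no sofá."]           # model outputs (illustrative)
refs = [["O gato está dormindo no sofá."]]  # references (illustrative)
print(round(bleu.compute(predictions=preds, references=refs)["score"], 4))
```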

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1