# eb96ab762c7ec109e9a637d60f7f8908
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [it-pt] dataset. It achieves the following results on the evaluation set:
- Loss: 2.5571
- Data Size: 1.0
- Epoch Runtime: 12.1514
- Bleu: 7.4817
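The card does not include a usage example. Below is a minimal inference sketch, assuming the checkpoint is available under the repo id shown on the model page (`contemmcm/eb96ab762c7ec109e9a637d60f7f8908`). The card does not say whether a T5-style task prefix (e.g. "translate Italian to Portuguese: ") was used during fine-tuning, so the prefix is left configurable and empty by default.

```python
# Hypothetical inference sketch for this Italian-to-Portuguese checkpoint.
# MODEL_ID is taken from the model page; the task prefix used during
# training (if any) is not documented in this card.

MODEL_ID = "contemmcm/eb96ab762c7ec109e9a637d60f7f8908"  # assumed repo id


def build_input(text: str, prefix: str = "") -> str:
    """Optionally prepend a task prefix (empty by default: undocumented)."""
    return f"{prefix}{text}"


def translate(text: str, model_id: str = MODEL_ID) -> str:
    # Imported lazily so build_input() stays usable without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_input(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(translate("Il gatto dorme sul divano."))
```

Given the final BLEU of ~7.5, outputs should be treated as rough drafts rather than production-quality translations.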
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
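The listed totals follow directly from the per-device settings: 8 examples per device across 4 GPUs gives an effective batch of 32 (assuming no gradient accumulation, which the card does not list). A quick sanity check:

```python
# Effective batch sizes implied by the hyperparameters above.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4
grad_accum_steps = 1   # not listed in the card; assumed to be 1

total_train_batch_size = train_batch_size * num_devices * grad_accum_steps
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # matches the card: 32 32
```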
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 11.6380 | 0 | 1.3179 | 0.6688 |
| No log | 1.0 | 29 | 11.6224 | 0.0078 | 1.6882 | 0.7427 |
| No log | 2.0 | 58 | 11.5177 | 0.0156 | 2.2360 | 0.6516 |
| No log | 3.0 | 87 | 11.7234 | 0.0312 | 2.8411 | 0.7009 |
| No log | 4.0 | 116 | 11.3968 | 0.0625 | 3.4966 | 0.7370 |
| No log | 5.0 | 145 | 11.3497 | 0.125 | 3.9657 | 0.4774 |
| 1.7156 | 6.0 | 174 | 10.9542 | 0.25 | 5.5267 | 0.6996 |
| 1.7156 | 7.0 | 203 | 11.0380 | 0.5 | 7.5773 | 0.6731 |
| 1.7156 | 8.0 | 232 | 9.1387 | 1.0 | 11.6909 | 0.9107 |
| 9.7807 | 9.0 | 261 | 8.0497 | 1.0 | 11.8910 | 0.8087 |
| 9.7807 | 10.0 | 290 | 7.0794 | 1.0 | 13.0532 | 0.4845 |
| 10.6168 | 11.0 | 319 | 6.2391 | 1.0 | 10.1422 | 0.4629 |
| 10.6168 | 12.0 | 348 | 4.6299 | 1.0 | 10.3170 | 1.6175 |
| 7.3694 | 13.0 | 377 | 3.9073 | 1.0 | 10.8678 | 3.5145 |
| 5.0901 | 14.0 | 406 | 3.5140 | 1.0 | 12.0403 | 6.2229 |
| 5.0901 | 15.0 | 435 | 3.2423 | 1.0 | 12.1298 | 6.7949 |
| 4.3639 | 16.0 | 464 | 3.0647 | 1.0 | 12.8197 | 4.6863 |
| 4.3639 | 17.0 | 493 | 2.9499 | 1.0 | 13.1804 | 5.1550 |
| 3.9547 | 18.0 | 522 | 2.8692 | 1.0 | 9.7197 | 5.5632 |
| 3.6322 | 19.0 | 551 | 2.8151 | 1.0 | 9.4106 | 5.8590 |
| 3.6322 | 20.0 | 580 | 2.7593 | 1.0 | 10.5028 | 6.1529 |
| 3.3638 | 21.0 | 609 | 2.7053 | 1.0 | 10.6758 | 6.3736 |
| 3.3638 | 22.0 | 638 | 2.6837 | 1.0 | 11.2459 | 6.4486 |
| 3.1785 | 23.0 | 667 | 2.6556 | 1.0 | 11.2236 | 6.4821 |
| 3.1785 | 24.0 | 696 | 2.6270 | 1.0 | 11.6886 | 6.6401 |
| 3.0074 | 25.0 | 725 | 2.6063 | 1.0 | 11.7180 | 6.7628 |
| 2.8686 | 26.0 | 754 | 2.5905 | 1.0 | 9.8356 | 6.8415 |
| 2.8686 | 27.0 | 783 | 2.5792 | 1.0 | 10.1183 | 6.9304 |
| 2.7418 | 28.0 | 812 | 2.5736 | 1.0 | 10.2673 | 7.0236 |
| 2.7418 | 29.0 | 841 | 2.5536 | 1.0 | 10.7104 | 7.0489 |
| 2.6142 | 30.0 | 870 | 2.5487 | 1.0 | 11.2792 | 7.1639 |
| 2.6142 | 31.0 | 899 | 2.5396 | 1.0 | 11.2350 | 7.1134 |
| 2.5155 | 32.0 | 928 | 2.5422 | 1.0 | 11.5853 | 7.2230 |
| 2.4294 | 33.0 | 957 | 2.5336 | 1.0 | 12.6890 | 7.2554 |
| 2.4294 | 34.0 | 986 | 2.5357 | 1.0 | 9.4938 | 7.3031 |
| 2.332 | 35.0 | 1015 | 2.5287 | 1.0 | 9.4530 | 7.3458 |
| 2.332 | 36.0 | 1044 | 2.5396 | 1.0 | 9.9192 | 7.4489 |
| 2.2852 | 37.0 | 1073 | 2.5445 | 1.0 | 10.7865 | 7.6072 |
| 2.1983 | 38.0 | 1102 | 2.5372 | 1.0 | 11.6134 | 7.4945 |
| 2.1983 | 39.0 | 1131 | 2.5571 | 1.0 | 12.1514 | 7.4817 |
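The Data Size column suggests a doubling curriculum: training starts on roughly 1/128 of the data at epoch 1 and the fraction doubles each epoch until the full set is in use from epoch 8 onward. The sketch below reproduces that schedule; it is an inference from the table, not a recipe documented in this card.

```python
def data_fraction(epoch: int) -> float:
    """Fraction of the training set used at a given epoch, inferred from
    the Data Size column: 1/128 at epoch 1, doubling until 1.0 at epoch 8."""
    if epoch <= 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))


# Reproduce the schedule shown in the table for the first ten epochs.
for epoch in range(10):
    print(epoch, round(data_fraction(epoch), 4))
```

Under this reading, the loss plateau over epochs 0 to 7 reflects training on tiny subsets; validation loss only drops sharply once the full dataset is reached at epoch 8.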
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1