29004e70c0b068445756e946fd41c0e8
This model is a fine-tuned version of google/mt5-small on the fr-pt configuration of the Helsinki-NLP/opus_books dataset. After the final training epoch (epoch 50), it achieves the following results on the evaluation set:
- Loss: 2.3858
- Data Size: 1.0
- Epoch Runtime: 6.0970
- Bleu: 4.0242
Model description
More information needed
Intended uses & limitations
More information needed
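
Until this section is filled in, the following is a minimal sketch of how the checkpoint could be loaded for inference with the standard Transformers seq2seq classes. It assumes the repository ID shown on this page, a French source and Portuguese target (the card only names the `fr-pt` pair, not the direction), and that no task prefix was used during fine-tuning; none of these are confirmed by the card. The helper name `translate` is hypothetical.

```python
MODEL_ID = "contemmcm/29004e70c0b068445756e946fd41c0e8"


def translate(texts, max_new_tokens=128):
    """Translate a list of source sentences with the fine-tuned checkpoint."""
    # Imported lazily so the heavy dependency is only needed when translating.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: the source text is passed as-is, without a task prefix.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

A call such as `translate(["Il était une fois un petit village."])` returns a list with one decoded string per input. Given the reported BLEU of 4.0242, outputs should be checked before being relied on.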
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
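
As a rough illustration, the hyperparameters listed above could map onto `Seq2SeqTrainingArguments` as sketched below. This is not the original training script. The output directory name, `eval_strategy="epoch"`, and `predict_with_generate=True` are assumptions (the last one is typically needed to compute BLEU during evaluation). The sketch also shows how the total batch size follows from the per-device size and the number of devices.

```python
NUM_DEVICES = 4  # from the card: distributed_type multi-GPU, num_devices 4

training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
    # Assumptions, not stated in the card:
    "eval_strategy": "epoch",
    "predict_with_generate": True,
}

# Total batch size = per-device batch size x number of devices (8 x 4 = 32).
total_train_batch_size = training_kwargs["per_device_train_batch_size"] * NUM_DEVICES
total_eval_batch_size = training_kwargs["per_device_eval_batch_size"] * NUM_DEVICES


def build_training_args(output_dir="mt5-small-opus-books-fr-pt"):
    """Build the arguments object; output_dir is a hypothetical name."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **training_kwargs)
```

The multi-GPU setup itself is configured by the launcher (for example `torchrun` or `accelerate launch`), not by these arguments.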
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 27.5438 | 0 | 1.1526 | 0.0058 |
| No log | 1 | 31 | 28.4233 | 0.0078 | 1.2516 | 0.0063 |
| No log | 2 | 62 | 28.0428 | 0.0156 | 1.7271 | 0.0045 |
| No log | 3 | 93 | 26.5597 | 0.0312 | 2.1117 | 0.0059 |
| No log | 4 | 124 | 24.4093 | 0.0625 | 2.3954 | 0.0041 |
| No log | 5 | 155 | 22.2079 | 0.125 | 2.7253 | 0.0082 |
| No log | 6 | 186 | 17.6759 | 0.25 | 3.1398 | 0.0094 |
| 3.5054 | 7 | 217 | 13.5341 | 0.5 | 4.4290 | 0.0117 |
| 3.5054 | 8.0 | 248 | 9.8188 | 1.0 | 6.9570 | 0.0154 |
| 10.928 | 9.0 | 279 | 7.8998 | 1.0 | 6.9664 | 0.0167 |
| 10.4524 | 10.0 | 310 | 6.6437 | 1.0 | 6.9542 | 0.0186 |
| 10.4524 | 11.0 | 341 | 5.4714 | 1.0 | 7.4791 | 0.0186 |
| 7.6418 | 12.0 | 372 | 4.6749 | 1.0 | 7.6182 | 0.0317 |
| 6.0055 | 13.0 | 403 | 3.7281 | 1.0 | 7.4343 | 0.0887 |
| 6.0055 | 14.0 | 434 | 3.3707 | 1.0 | 8.3537 | 0.3530 |
| 4.9541 | 15.0 | 465 | 3.1714 | 1.0 | 5.9927 | 1.1423 |
| 4.9541 | 16.0 | 496 | 3.0677 | 1.0 | 5.9569 | 1.2514 |
| 4.4308 | 17.0 | 527 | 2.9753 | 1.0 | 5.8289 | 1.5034 |
| 4.1102 | 18.0 | 558 | 2.9014 | 1.0 | 6.2157 | 1.7000 |
| 4.1102 | 19.0 | 589 | 2.8467 | 1.0 | 6.3563 | 1.8398 |
| 3.9069 | 20.0 | 620 | 2.7960 | 1.0 | 6.8002 | 2.2300 |
| 3.7412 | 21.0 | 651 | 2.7565 | 1.0 | 7.2603 | 2.3792 |
| 3.7412 | 22.0 | 682 | 2.7301 | 1.0 | 7.5230 | 2.4908 |
| 3.6364 | 23.0 | 713 | 2.6889 | 1.0 | 7.9714 | 2.7077 |
| 3.6364 | 24.0 | 744 | 2.6600 | 1.0 | 9.1440 | 2.8903 |
| 3.498 | 25.0 | 775 | 2.6445 | 1.0 | 8.9545 | 2.9441 |
| 3.4366 | 26.0 | 806 | 2.6197 | 1.0 | 8.8483 | 2.9626 |
| 3.4366 | 27.0 | 837 | 2.5967 | 1.0 | 8.5830 | 3.0582 |
| 3.3506 | 28.0 | 868 | 2.5836 | 1.0 | 6.0944 | 3.1140 |
| 3.3506 | 29.0 | 899 | 2.5682 | 1.0 | 6.0129 | 3.1509 |
| 3.2866 | 30.0 | 930 | 2.5502 | 1.0 | 5.9571 | 3.1536 |
| 3.2124 | 31.0 | 961 | 2.5412 | 1.0 | 6.0268 | 3.2192 |
| 3.2124 | 32.0 | 992 | 2.5242 | 1.0 | 6.0170 | 3.2674 |
| 3.1606 | 33.0 | 1023 | 2.5124 | 1.0 | 6.0481 | 3.2754 |
| 3.1155 | 34.0 | 1054 | 2.4978 | 1.0 | 5.8964 | 3.2844 |
| 3.1155 | 35.0 | 1085 | 2.4921 | 1.0 | 5.9531 | 3.4115 |
| 3.0601 | 36.0 | 1116 | 2.4796 | 1.0 | 6.3192 | 3.5079 |
| 3.0601 | 37.0 | 1147 | 2.4673 | 1.0 | 6.8573 | 3.5407 |
| 2.998 | 38.0 | 1178 | 2.4623 | 1.0 | 7.3572 | 3.6119 |
| 2.9555 | 39.0 | 1209 | 2.4581 | 1.0 | 8.2948 | 3.6769 |
| 2.9555 | 40.0 | 1240 | 2.4465 | 1.0 | 8.2879 | 3.6685 |
| 2.9102 | 41.0 | 1271 | 2.4395 | 1.0 | 8.4370 | 3.7489 |
| 2.8699 | 42.0 | 1302 | 2.4366 | 1.0 | 8.7327 | 3.7991 |
| 2.8699 | 43.0 | 1333 | 2.4264 | 1.0 | 5.8849 | 3.7835 |
| 2.8366 | 44.0 | 1364 | 2.4188 | 1.0 | 5.8377 | 3.8390 |
| 2.8366 | 45.0 | 1395 | 2.4157 | 1.0 | 5.9253 | 3.9188 |
| 2.7911 | 46.0 | 1426 | 2.4078 | 1.0 | 6.1507 | 3.9007 |
| 2.764 | 47.0 | 1457 | 2.4002 | 1.0 | 5.9784 | 3.9577 |
| 2.764 | 48.0 | 1488 | 2.3913 | 1.0 | 5.9478 | 3.9526 |
| 2.7201 | 49.0 | 1519 | 2.3885 | 1.0 | 5.9027 | 3.9700 |
| 2.6912 | 50.0 | 1550 | 2.3858 | 1.0 | 6.0970 | 4.0242 |
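
To make the validation loss easier to interpret, the sketch below converts a few rows of the table into perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for Transformers seq2seq models but is not stated in the card. The rows are copied from the table, not recomputed.

```python
import math

# (epoch, validation_loss, bleu) copied from selected rows of the table.
rows = [
    (15, 3.1714, 1.1423),
    (25, 2.6445, 2.9441),
    (40, 2.4465, 3.6685),
    (50, 2.3858, 4.0242),
]


def perplexity(loss):
    """Perplexity under the assumption that loss is mean cross-entropy in nats."""
    return math.exp(loss)


for epoch, loss, bleu in rows:
    print(f"epoch {epoch:>2}: val_loss={loss:.4f}  ppl={perplexity(loss):.2f}  bleu={bleu:.4f}")

final_perplexity = perplexity(rows[-1][1])
```

Under that assumption, the final validation loss of 2.3858 corresponds to a perplexity of roughly 10.9.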
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/29004e70c0b068445756e946fd41c0e8
Base model
google/mt5-small