ea498f2f8800ae82b2b80e3f3e8bf7e9
This model is a fine-tuned version of google/mt5-small on the Helsinki-NLP/opus_books [fr-no] dataset. It achieves the following results on the evaluation set:
- Loss: 2.8706
- Data Size: 1.0
- Epoch Runtime: 13.7158
- Bleu: 2.8681
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
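The total batch sizes follow from the per-device batch size and the device count; a minimal sketch of that arithmetic, using the values from the list above:

```python
# Per-device batch size and GPU count from the hyperparameters above.
train_batch_size = 8  # examples per device per step
num_devices = 4       # multi-GPU data parallelism

# Each step processes one batch per device, so the effective batch size
# is the product of the two.
total_train_batch_size = train_batch_size * num_devices
print(total_train_batch_size)  # 32, matching total_train_batch_size above
```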
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 24.6813 | 0 | 1.7986 | 0.0009 |
| No log | 1 | 86 | 24.7378 | 0.0078 | 2.8486 | 0.0008 |
| No log | 2 | 172 | 24.3812 | 0.0156 | 2.3142 | 0.0008 |
| No log | 3 | 258 | 23.0589 | 0.0312 | 2.7868 | 0.0014 |
| No log | 4 | 344 | 20.8463 | 0.0625 | 3.5438 | 0.0015 |
| 1.3681 | 5 | 430 | 18.1595 | 0.125 | 4.3812 | 0.0027 |
| 5.2509 | 6 | 516 | 15.6510 | 0.25 | 6.1068 | 0.0020 |
| 5.5813 | 7 | 602 | 10.3997 | 0.5 | 8.8942 | 0.0033 |
| 6.4297 | 8.0 | 688 | 5.4761 | 1.0 | 15.3246 | 0.0122 |
| 6.5951 | 9.0 | 774 | 4.0052 | 1.0 | 13.4675 | 0.2429 |
| 5.0986 | 10.0 | 860 | 3.6742 | 1.0 | 14.4898 | 0.4658 |
| 4.841 | 11.0 | 946 | 3.5062 | 1.0 | 14.6042 | 0.6311 |
| 4.4852 | 12.0 | 1032 | 3.4093 | 1.0 | 14.1982 | 0.7627 |
| 4.2868 | 13.0 | 1118 | 3.3364 | 1.0 | 14.5429 | 1.0343 |
| 4.138 | 14.0 | 1204 | 3.2795 | 1.0 | 14.1576 | 1.2543 |
| 4.0571 | 15.0 | 1290 | 3.2347 | 1.0 | 14.0612 | 1.3775 |
| 3.9666 | 16.0 | 1376 | 3.2016 | 1.0 | 13.3305 | 1.4498 |
| 3.8844 | 17.0 | 1462 | 3.1669 | 1.0 | 13.7788 | 1.5649 |
| 3.8281 | 18.0 | 1548 | 3.1439 | 1.0 | 14.5092 | 1.6304 |
| 3.7532 | 19.0 | 1634 | 3.1191 | 1.0 | 14.4961 | 1.6749 |
| 3.7084 | 20.0 | 1720 | 3.0959 | 1.0 | 14.8120 | 1.8000 |
| 3.6705 | 21.0 | 1806 | 3.0784 | 1.0 | 14.5898 | 1.8761 |
| 3.6522 | 22.0 | 1892 | 3.0642 | 1.0 | 14.6089 | 1.8872 |
| 3.5636 | 23.0 | 1978 | 3.0461 | 1.0 | 13.4842 | 1.9836 |
| 3.551 | 24.0 | 2064 | 3.0356 | 1.0 | 13.7628 | 2.0694 |
| 3.4916 | 25.0 | 2150 | 3.0229 | 1.0 | 14.1647 | 2.0876 |
| 3.4554 | 26.0 | 2236 | 3.0042 | 1.0 | 14.4404 | 2.1515 |
| 3.4224 | 27.0 | 2322 | 2.9986 | 1.0 | 15.4571 | 2.1357 |
| 3.3942 | 28.0 | 2408 | 2.9844 | 1.0 | 15.4711 | 2.1600 |
| 3.3785 | 29.0 | 2494 | 2.9769 | 1.0 | 15.4662 | 2.2534 |
| 3.3361 | 30.0 | 2580 | 2.9689 | 1.0 | 15.7904 | 2.2643 |
| 3.3061 | 31.0 | 2666 | 2.9625 | 1.0 | 14.0855 | 2.2744 |
| 3.2623 | 32.0 | 2752 | 2.9541 | 1.0 | 13.9531 | 2.3525 |
| 3.2314 | 33.0 | 2838 | 2.9447 | 1.0 | 13.9750 | 2.3598 |
| 3.2346 | 34.0 | 2924 | 2.9440 | 1.0 | 14.0985 | 2.4240 |
| 3.2265 | 35.0 | 3010 | 2.9302 | 1.0 | 14.2094 | 2.4658 |
| 3.1866 | 36.0 | 3096 | 2.9258 | 1.0 | 13.8554 | 2.4391 |
| 3.1598 | 37.0 | 3182 | 2.9187 | 1.0 | 14.1753 | 2.4766 |
| 3.1336 | 38.0 | 3268 | 2.9156 | 1.0 | 14.1022 | 2.4855 |
| 3.0847 | 39.0 | 3354 | 2.9069 | 1.0 | 13.5672 | 2.5021 |
| 3.0866 | 40.0 | 3440 | 2.9001 | 1.0 | 13.4121 | 2.5222 |
| 3.0738 | 41.0 | 3526 | 2.8992 | 1.0 | 13.9182 | 2.5807 |
| 3.0578 | 42.0 | 3612 | 2.8924 | 1.0 | 14.3426 | 2.6003 |
| 3.0272 | 43.0 | 3698 | 2.8916 | 1.0 | 15.0226 | 2.6529 |
| 3.0087 | 44.0 | 3784 | 2.8856 | 1.0 | 15.5647 | 2.7099 |
| 2.9882 | 45.0 | 3870 | 2.8847 | 1.0 | 15.5668 | 2.7470 |
| 2.9678 | 46.0 | 3956 | 2.8827 | 1.0 | 15.8990 | 2.7260 |
| 2.9595 | 47.0 | 4042 | 2.8777 | 1.0 | 13.5686 | 2.7856 |
| 2.9079 | 48.0 | 4128 | 2.8710 | 1.0 | 13.6765 | 2.8488 |
| 2.9165 | 49.0 | 4214 | 2.8716 | 1.0 | 13.7148 | 2.8691 |
| 2.8845 | 50.0 | 4300 | 2.8706 | 1.0 | 13.7158 | 2.8681 |
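The Data Size column suggests a warm-up schedule in which the training subset doubles each epoch until the full dataset is used from epoch 8 onward. A hypothetical reconstruction of that schedule, inferred from the table rather than from the actual training code:

```python
def data_fraction(epoch: int) -> float:
    """Fraction of the training set used at a given epoch, as inferred
    from the Data Size column: 1/128 at epoch 1, doubling each epoch,
    capped at 1.0 from epoch 8 on. Epoch 0 is the pre-training eval."""
    if epoch < 1:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))

# Reproduces the column values 0.0078, 0.0156, ..., 0.5, 1.0
print([round(data_fraction(e), 4) for e in range(1, 10)])
```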
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
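For reproducibility, the versions above can be pinned in a requirements file. A sketch (the CUDA-specific PyTorch build listed above depends on your platform, so the plain `torch` wheel is shown here):

```
transformers==4.57.0
torch==2.8.0
datasets==4.2.0
tokenizers==0.22.1
```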