# eed3b2fc273ffb52c6b1fa7ecf793821
This model is a fine-tuned version of google/mt5-base on the Helsinki-NLP/opus_books [fr-pt] dataset. It achieves the following results on the evaluation set:
- Loss: 1.9388
- Data Size: 1.0
- Epoch Runtime: 11.6263
- Bleu: 7.2219
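A minimal inference sketch for this checkpoint, assuming the repo id matches this card's title (the `contemmcm/` namespace and the example sentence are assumptions, not part of the card); given the modest BLEU score, outputs should be treated as rough draft translations:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repo id, derived from this card's title.
model_id = "contemmcm/eed3b2fc273ffb52c6b1fa7ecf793821"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# French input; the model was fine-tuned on fr-pt book translations.
inputs = tokenizer("Bonjour, comment allez-vous ?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
```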
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 16.1264 | 0 | 1.5804 | 0.0231 |
| No log | 1 | 31 | 16.4643 | 0.0078 | 2.3928 | 0.0284 |
| No log | 2 | 62 | 15.9303 | 0.0156 | 2.0920 | 0.0267 |
| No log | 3 | 93 | 15.5384 | 0.0312 | 3.3902 | 0.0274 |
| No log | 4 | 124 | 15.0819 | 0.0625 | 4.8687 | 0.0206 |
| No log | 5 | 155 | 13.6719 | 0.125 | 5.6603 | 0.0197 |
| No log | 6 | 186 | 11.9069 | 0.25 | 7.7089 | 0.0322 |
| 2.3003 | 7 | 217 | 9.3145 | 0.5 | 10.0812 | 0.0321 |
| 2.3003 | 8.0 | 248 | 6.4098 | 1.0 | 14.7705 | 0.0199 |
| 6.9365 | 9.0 | 279 | 4.7352 | 1.0 | 14.1538 | 0.0096 |
| 6.3421 | 10.0 | 310 | 2.9529 | 1.0 | 9.7062 | 1.0909 |
| 6.3421 | 11.0 | 341 | 2.4681 | 1.0 | 10.1137 | 3.6803 |
| 3.9354 | 12.0 | 372 | 2.2692 | 1.0 | 10.2667 | 3.9580 |
| 3.216 | 13.0 | 403 | 2.1776 | 1.0 | 10.9836 | 4.2497 |
| 3.216 | 14.0 | 434 | 2.1226 | 1.0 | 12.7806 | 4.6474 |
| 2.9141 | 15.0 | 465 | 2.0874 | 1.0 | 11.9026 | 5.1961 |
| 2.9141 | 16.0 | 496 | 2.0459 | 1.0 | 12.2544 | 5.6613 |
| 2.691 | 17.0 | 527 | 2.0090 | 1.0 | 13.2220 | 5.9954 |
| 2.529 | 18.0 | 558 | 2.0037 | 1.0 | 9.8746 | 6.2197 |
| 2.529 | 19.0 | 589 | 1.9852 | 1.0 | 10.3038 | 6.4326 |
| 2.3882 | 20.0 | 620 | 1.9751 | 1.0 | 10.4815 | 6.4168 |
| 2.2867 | 21.0 | 651 | 1.9600 | 1.0 | 10.6459 | 6.4688 |
| 2.2867 | 22.0 | 682 | 1.9624 | 1.0 | 11.3954 | 6.8694 |
| 2.2047 | 23.0 | 713 | 1.9534 | 1.0 | 12.0309 | 7.1474 |
| 2.2047 | 24.0 | 744 | 1.9442 | 1.0 | 13.0364 | 7.1646 |
| 2.0857 | 25.0 | 775 | 1.9308 | 1.0 | 13.3050 | 6.8281 |
| 2.0494 | 26.0 | 806 | 1.9343 | 1.0 | 9.8075 | 6.8110 |
| 2.0494 | 27.0 | 837 | 1.9369 | 1.0 | 10.5133 | 6.8163 |
| 1.9657 | 28.0 | 868 | 1.9248 | 1.0 | 11.3402 | 7.0828 |
| 1.9657 | 29.0 | 899 | 1.9337 | 1.0 | 11.1664 | 7.0742 |
| 1.9108 | 30.0 | 930 | 1.9344 | 1.0 | 11.0983 | 7.0631 |
| 1.8423 | 31.0 | 961 | 1.9270 | 1.0 | 11.6784 | 7.1148 |
| 1.8423 | 32.0 | 992 | 1.9388 | 1.0 | 11.6263 | 7.2219 |
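The Data Size column suggests a data-scaling warmup: the training-data fraction starts at 1/128 after epoch 1 and doubles each epoch until the full dataset is reached at epoch 8. A minimal pure-Python sketch of that apparent schedule (the formula is inferred from the table, not taken from the training code):

```python
def data_fraction(epoch: int) -> float:
    """Apparent schedule: 1/128 of the training set after epoch 1,
    doubling each epoch and capping at the full dataset (epoch >= 8)."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))

# Reproduces the Data Size column (the table rounds to 4 decimals):
fractions = [data_fraction(e) for e in range(9)]
```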
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1