792304578eeaaf421d91b32ca9e3f8d5
This model is a fine-tuned version of google/mt5-xl on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set (these are the values from the last logged epoch, epoch 14, in the training results table):
- Loss: 1.1469
- Data Size: 1.0
- Epoch Runtime: 641.1222
- Bleu: 14.9571
Model description
More information needed
Intended uses & limitations
More information needed
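Until this section is filled in, the following is a minimal sketch of how a checkpoint like this one is typically loaded for translation with the Transformers seq2seq classes. Several things here are assumptions rather than facts from this card: the Spanish-to-French direction (inferred only from the `[es-fr]` dataset configuration), the absence of a task prefix on the input, and the generation settings (`num_beams`, `max_new_tokens`). The helper name `translate` is hypothetical.

```python
# Repository id as shown on the model page.
MODEL_ID = "contemmcm/792304578eeaaf421d91b32ca9e3f8d5"


def translate(texts, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of sentences and return a list of decoded strings."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: the model expects raw source text with no task prefix.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

    # Assumption: beam search with 4 beams; the card does not state
    # which decoding settings were used to compute BLEU.
    output_ids = model.generate(**batch, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `translate(["El gato duerme en la silla."])` downloads the checkpoint on first use. The base model is the XL variant of mT5, so loading it needs a correspondingly large amount of memory.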
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
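The total batch sizes in the list follow from the per-device size and the device count. The sketch below restates the hyperparameters as keyword arguments and checks that arithmetic. The mapping onto `Seq2SeqTrainingArguments` keyword names, the gradient accumulation factor of 1 (the list does not mention one), and the `predict_with_generate=True` flag are assumptions, not settings confirmed by this card. `build_training_args` is a hypothetical helper.

```python
# Hyperparameters from the list above, keyed by the Seq2SeqTrainingArguments
# keyword names they most plausibly correspond to (assumed mapping).
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

NUM_DEVICES = 4  # num_devices from the list above


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Examples consumed per step across all devices."""
    return per_device * num_devices * grad_accum_steps


total_train = effective_batch_size(
    training_kwargs["per_device_train_batch_size"], NUM_DEVICES
)
total_eval = effective_batch_size(
    training_kwargs["per_device_eval_batch_size"], NUM_DEVICES
)
print(total_train, total_eval)


def build_training_args(output_dir):
    """Build training arguments; needs transformers (imported lazily)."""
    from transformers import Seq2SeqTrainingArguments

    # Assumption: generation-based evaluation, since BLEU is reported.
    return Seq2SeqTrainingArguments(
        output_dir=output_dir, predict_with_generate=True, **training_kwargs
    )
```

Both totals come out to 32, matching total_train_batch_size and total_eval_batch_size. The multi-GPU launch itself (for example via `torchrun` or `accelerate`) is outside the scope of this sketch.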
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 7.5319 | 0 | 42.1921 | 0.0227 |
| No log | 1 | 1407 | 2.4063 | 0.0078 | 47.0469 | 1.2099 |
| No log | 2 | 2814 | 1.7253 | 0.0156 | 59.5257 | 1.5937 |
| 0.0687 | 3 | 4221 | 1.4626 | 0.0312 | 73.6290 | 9.3270 |
| 1.7852 | 4 | 5628 | 1.3730 | 0.0625 | 94.4569 | 10.1598 |
| 1.6029 | 5 | 7035 | 1.2781 | 0.125 | 126.9140 | 11.3849 |
| 1.4152 | 6 | 8442 | 1.1953 | 0.25 | 199.0452 | 13.0616 |
| 1.3277 | 7 | 9849 | 1.1298 | 0.5 | 347.5435 | 13.6878 |
| 1.1487 | 8 | 11256 | 1.0761 | 1.0 | 644.9083 | 14.2348 |
| 1.0115 | 9 | 12663 | 1.0489 | 1.0 | 642.3715 | 14.7043 |
| 0.9556 | 10 | 14070 | 1.0474 | 1.0 | 639.1853 | 14.9049 |
| 0.8314 | 11 | 15477 | 1.0551 | 1.0 | 639.4692 | 14.8704 |
| 0.7299 | 12 | 16884 | 1.0761 | 1.0 | 641.4331 | 15.0293 |
| 0.6821 | 13 | 18291 | 1.1074 | 1.0 | 641.8712 | 14.9957 |
| 0.6183 | 14 | 19698 | 1.1469 | 1.0 | 641.1222 | 14.9571 |
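The table can be read programmatically to compare the final epoch with the best ones. The sketch below copies the epoch, validation loss, data size, and BLEU columns from the table and locates the lowest validation loss and the highest BLEU. It also checks an observation, not a documented schedule: the Data Size column looks like it doubles each epoch until it reaches 1.0 at epoch 8. The function `expected_fraction` is a hypothetical name for that pattern.

```python
# (epoch, validation_loss, data_size, bleu) copied from the table above.
ROWS = [
    (0, 7.5319, 0.0, 0.0227),
    (1, 2.4063, 0.0078, 1.2099),
    (2, 1.7253, 0.0156, 1.5937),
    (3, 1.4626, 0.0312, 9.3270),
    (4, 1.3730, 0.0625, 10.1598),
    (5, 1.2781, 0.125, 11.3849),
    (6, 1.1953, 0.25, 13.0616),
    (7, 1.1298, 0.5, 13.6878),
    (8, 1.0761, 1.0, 14.2348),
    (9, 1.0489, 1.0, 14.7043),
    (10, 1.0474, 1.0, 14.9049),
    (11, 1.0551, 1.0, 14.8704),
    (12, 1.0761, 1.0, 15.0293),
    (13, 1.1074, 1.0, 14.9957),
    (14, 1.1469, 1.0, 14.9571),
]

best_loss = min(ROWS, key=lambda r: r[1])  # row with lowest validation loss
best_bleu = max(ROWS, key=lambda r: r[3])  # row with highest BLEU
final = ROWS[-1]                           # last logged epoch


def expected_fraction(epoch):
    """Assumed pattern: 0 at epoch 0, then doubling up to 1.0 at epoch 8."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))


# The table rounds to four decimals, so compare with a small tolerance.
schedule_matches = all(
    abs(expected_fraction(epoch) - size) < 1e-4 for epoch, _, size, _ in ROWS
)

print("lowest validation loss at epoch", best_loss[0])
print("highest BLEU at epoch", best_bleu[0])
print("final epoch", final[0], "doubling pattern holds:", schedule_matches)
```

On these numbers the validation loss is lowest at epoch 10 and BLEU peaks at epoch 12, while the headline results correspond to epoch 14, where validation loss has risen again as training loss keeps falling.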
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/792304578eeaaf421d91b32ca9e3f8d5
- Base model: google/mt5-xl