# 2cbae2386b74a080f53eb5109ca5971c
This model is a fine-tuned version of google/mt5-base on the Helsinki-NLP/opus_books [es-nl] dataset. It achieves the following results on the evaluation set:
- Loss: 1.8544
- Data Size: 1.0
- Epoch Runtime: 169.6302
- Bleu: 7.9441
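A minimal inference sketch is shown below. It assumes the checkpoint is published under the Hub repo id `contemmcm/2cbae2386b74a080f53eb5109ca5971c` and that no task prefix was used during fine-tuning; both are assumptions, not details confirmed by this card.

```python
# Hedged example: translate Spanish to Dutch with the fine-tuned mT5 checkpoint.
# The repo id and the absence of a task prefix are assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/2cbae2386b74a080f53eb5109ca5971c"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Era el mejor de los tiempos, era el peor de los tiempos."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```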
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
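The card does not document the data preparation. The sketch below shows one plausible way to load the es-nl pair of Helsinki-NLP/opus_books with 🤗 Datasets; the 90/10 split and seed are illustrative assumptions, since opus_books ships only a train split.

```python
# Hedged sketch: load the es-nl portion of opus_books and carve out an eval split.
# The split ratio and seed are illustrative assumptions.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "es-nl", split="train")
splits = raw.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]

# Each example has a "translation" field keyed by language code.
print(train_ds[0]["translation"]["es"])
print(train_ds[0]["translation"]["nl"])
```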
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
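A rough `Seq2SeqTrainingArguments` equivalent of the list above is sketched below. It is a reconstruction, not the original training script; the multi-GPU setup (4 devices) comes from the launcher (e.g. `torchrun` or `accelerate launch`) rather than from these arguments, and the output directory and generation flag are assumptions.

```python
# Hedged reconstruction of the reported hyperparameters; output_dir and
# predict_with_generate are illustrative assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-opus-books-es-nl",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-08 are the defaults
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumed; needed to report BLEU during evaluation
)
```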
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 17.1342 | 0 | 14.2164 | 0.0123 |
| No log | 1 | 806 | 14.4190 | 0.0078 | 15.4260 | 0.0141 |
| No log | 2 | 1612 | 12.9927 | 0.0156 | 17.8722 | 0.0145 |
| No log | 3 | 2418 | 12.3336 | 0.0312 | 21.2802 | 0.0119 |
| 0.4603 | 4 | 3224 | 8.7273 | 0.0625 | 25.7241 | 0.0168 |
| 7.8757 | 5 | 4030 | 5.7522 | 0.125 | 36.1101 | 0.0238 |
| 3.8606 | 6 | 4836 | 2.8423 | 0.25 | 56.7123 | 2.2133 |
| 3.2748 | 7 | 5642 | 2.5475 | 0.5 | 95.4039 | 3.4680 |
| 2.9493 | 8.0 | 6448 | 2.3580 | 1.0 | 170.7454 | 4.3037 |
| 2.7582 | 9.0 | 7254 | 2.2663 | 1.0 | 169.8955 | 4.7458 |
| 2.6109 | 10.0 | 8060 | 2.1829 | 1.0 | 169.9373 | 5.2368 |
| 2.5575 | 11.0 | 8866 | 2.1357 | 1.0 | 169.9412 | 5.5100 |
| 2.445 | 12.0 | 9672 | 2.0929 | 1.0 | 169.8436 | 5.6948 |
| 2.3351 | 13.0 | 10478 | 2.0607 | 1.0 | 170.8743 | 6.0219 |
| 2.3008 | 14.0 | 11284 | 2.0226 | 1.0 | 170.3089 | 6.2488 |
| 2.2024 | 15.0 | 12090 | 1.9973 | 1.0 | 170.4874 | 6.4205 |
| 2.1786 | 16.0 | 12896 | 1.9712 | 1.0 | 169.2548 | 6.6095 |
| 2.1158 | 17.0 | 13702 | 1.9576 | 1.0 | 169.8372 | 6.7398 |
| 2.0339 | 18.0 | 14508 | 1.9481 | 1.0 | 170.9875 | 6.8626 |
| 2.0602 | 19.0 | 15314 | 1.9307 | 1.0 | 169.0190 | 7.0636 |
| 1.9852 | 20.0 | 16120 | 1.9111 | 1.0 | 168.9778 | 7.1152 |
| 1.9701 | 21.0 | 16926 | 1.9002 | 1.0 | 170.2868 | 7.1676 |
| 1.9284 | 22.0 | 17732 | 1.8983 | 1.0 | 170.0649 | 7.2960 |
| 1.8985 | 23.0 | 18538 | 1.8850 | 1.0 | 169.0035 | 7.2971 |
| 1.8702 | 24.0 | 19344 | 1.8737 | 1.0 | 169.1320 | 7.4994 |
| 1.8144 | 25.0 | 20150 | 1.8745 | 1.0 | 170.3661 | 7.5100 |
| 1.788 | 26.0 | 20956 | 1.8647 | 1.0 | 169.8877 | 7.5469 |
| 1.7573 | 27.0 | 21762 | 1.8660 | 1.0 | 178.3417 | 7.5739 |
| 1.7072 | 28.0 | 22568 | 1.8523 | 1.0 | 170.5982 | 7.6679 |
| 1.6876 | 29.0 | 23374 | 1.8576 | 1.0 | 170.2814 | 7.7006 |
| 1.6681 | 30.0 | 24180 | 1.8508 | 1.0 | 168.5706 | 7.7472 |
| 1.6361 | 31.0 | 24986 | 1.8524 | 1.0 | 169.2578 | 7.7534 |
| 1.6185 | 32.0 | 25792 | 1.8519 | 1.0 | 168.9454 | 7.7990 |
| 1.6048 | 33.0 | 26598 | 1.8484 | 1.0 | 171.0696 | 7.9047 |
| 1.5681 | 34.0 | 27404 | 1.8498 | 1.0 | 169.9789 | 7.9203 |
| 1.5462 | 35.0 | 28210 | 1.8516 | 1.0 | 168.9406 | 7.9122 |
| 1.5238 | 36.0 | 29016 | 1.8516 | 1.0 | 169.5784 | 7.9168 |
| 1.4893 | 37.0 | 29822 | 1.8544 | 1.0 | 169.6302 | 7.9441 |
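The card does not say how the BLEU scores above were computed. A common choice, sketched below, is `sacrebleu` via the 🤗 Evaluate library inside a `compute_metrics` callback; this is an assumption about the evaluation setup, and `tokenizer` is assumed to be the tokenizer loaded for this model.

```python
# Hedged sketch of a BLEU compute_metrics function for a Seq2SeqTrainer.
import evaluate
import numpy as np

bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace label padding (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    return {"bleu": result["score"]}
```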
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1