# f91a23cecfbd289cf1348474d80a5c33
This model is a fine-tuned version of google/mt5-large on the Helsinki-NLP/opus_books [es-no] dataset. It achieves the following results on the evaluation set:
- Loss: 2.0487
- Data Size: 1.0
- Epoch Runtime: 42.6999
- Bleu: 5.9151
## Model description
More information needed
## Intended uses & limitations
More information needed
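No usage details are provided by the card. As a hedged sketch, a checkpoint like this one can typically be loaded for Spanish-to-Norwegian generation with the `transformers` library (the repo id below is the model id from this card; the example sentence and generation settings are illustrative assumptions, not part of the card):

```python
# Hypothetical usage sketch -- assumes `transformers` and a PyTorch backend
# are installed, and that the checkpoint follows the standard seq2seq layout.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/f91a23cecfbd289cf1348474d80a5c33"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "La vida es bella."  # illustrative Spanish input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Whether the checkpoint expects a task prefix (as some mT5 fine-tunes do) is not stated on the card, so output quality with raw input text is untested.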
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
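As a quick sanity check, the total batch sizes listed above follow from the per-device batch sizes and the number of GPUs under data-parallel training; a minimal sketch in plain Python, with values copied from the list:

```python
# Under multi-GPU data parallelism, each of the 4 devices processes
# its own per-device batch every optimizer step.
train_batch_size = 8   # per-device train batch size
eval_batch_size = 8    # per-device eval batch size
num_devices = 4

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices
print(total_train_batch_size, total_eval_batch_size)  # 32 32
```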
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 23.0375 | 0 | 3.6663 | 0.0086 |
| No log | 1 | 89 | 22.5653 | 0.0078 | 4.0129 | 0.0082 |
| No log | 2 | 178 | 18.5656 | 0.0156 | 7.1982 | 0.0092 |
| No log | 3 | 267 | 19.4656 | 0.0312 | 9.5094 | 0.0111 |
| No log | 4 | 356 | 14.8450 | 0.0625 | 11.7275 | 0.0136 |
| No log | 5 | 445 | 5.3013 | 0.125 | 14.7470 | 0.0381 |
| 0.9998 | 6 | 534 | 3.6382 | 0.25 | 17.6255 | 0.2923 |
| 1.8066 | 7 | 623 | 2.8759 | 0.5 | 27.2032 | 0.5935 |
| 3.1879 | 8 | 712 | 2.2714 | 1.0 | 44.2342 | 4.3190 |
| 2.6524 | 9 | 801 | 2.1324 | 1.0 | 44.1891 | 4.7504 |
| 2.512 | 10 | 890 | 2.0754 | 1.0 | 41.0263 | 5.1063 |
| 2.3432 | 11 | 979 | 2.0498 | 1.0 | 42.0423 | 5.2642 |
| 2.1826 | 12 | 1068 | 2.0215 | 1.0 | 41.9066 | 5.5056 |
| 2.0948 | 13 | 1157 | 2.0094 | 1.0 | 42.1578 | 5.5709 |
| 2.0081 | 14 | 1246 | 2.0083 | 1.0 | 43.5657 | 5.6278 |
| 1.929 | 15 | 1335 | 2.0098 | 1.0 | 42.0550 | 5.6109 |
| 1.8026 | 16 | 1424 | 2.0081 | 1.0 | 43.1516 | 5.8887 |
| 1.7762 | 17 | 1513 | 2.0119 | 1.0 | 41.7893 | 5.8135 |
| 1.6862 | 18 | 1602 | 2.0207 | 1.0 | 43.1757 | 5.8991 |
| 1.6062 | 19 | 1691 | 2.0286 | 1.0 | 42.1801 | 5.9735 |
| 1.546 | 20 | 1780 | 2.0487 | 1.0 | 42.6999 | 5.9151 |
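Note that the reported final checkpoint (epoch 20) is not the strongest row in the table: validation loss bottoms out at epoch 16 (2.0081) and BLEU peaks at epoch 19 (5.9735). A minimal sketch selecting the best epoch by each criterion, with the full-data rows (Data Size = 1.0) hand-copied from the table above:

```python
# (epoch, validation_loss, bleu) for the full-data epochs,
# copied from the training-results table above.
results = [
    (8, 2.2714, 4.3190), (9, 2.1324, 4.7504), (10, 2.0754, 5.1063),
    (11, 2.0498, 5.2642), (12, 2.0215, 5.5056), (13, 2.0094, 5.5709),
    (14, 2.0083, 5.6278), (15, 2.0098, 5.6109), (16, 2.0081, 5.8887),
    (17, 2.0119, 5.8135), (18, 2.0207, 5.8991), (19, 2.0286, 5.9735),
    (20, 2.0487, 5.9151),
]

best_by_loss = min(results, key=lambda r: r[1])  # lowest validation loss
best_by_bleu = max(results, key=lambda r: r[2])  # highest BLEU
print(best_by_loss[0], best_by_bleu[0])  # 16 19
```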
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1