# 7132eac5f8a3d007b76b88db764976c7
This model is a fine-tuned version of google/mt5-base on the Helsinki-NLP/opus_books [de-nl] dataset. It achieves the following results on the evaluation set:
- Loss: 2.0901
- Data Size: 1.0
- Epoch Runtime: 81.3317
- Bleu: 7.8459
## Model description
More information needed
## Intended uses & limitations
More information needed
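Since the card does not yet document usage, here is a minimal inference sketch. It assumes the checkpoint loads like any mT5 seq2seq model via `transformers`; the checkpoint ID is taken from this card. Whether a task prefix (e.g. `"translate German to Dutch: "`) is required depends on how the training examples were formatted, which is not documented here.

```python
# Minimal translation sketch (assumes network access to download the
# checkpoint; not verified against this exact model).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/7132eac5f8a3d007b76b88db764976c7"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# opus_books [de-nl]: German source, Dutch target
inputs = tokenizer("Das ist ein Buch.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```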
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
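The hyperparameters above can be expressed as a `Seq2SeqTrainingArguments` sketch. This is an illustrative reconstruction, not the actual training script: argument names follow current `transformers`, the output directory is hypothetical, and per-device batch size 8 on 4 GPUs yields the reported total batch size of 32.

```python
# Config sketch reconstructed from the hyperparameter list above.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-opus-books-de-nl",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x4 GPUs = 32 total
    per_device_eval_batch_size=8,    # x4 GPUs = 32 total
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,  # needed to compute BLEU at evaluation
)
```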
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 15.2536 | 0 | 7.2985 | 0.0211 |
| No log | 1.0 | 390 | 15.6662 | 0.0078 | 8.9328 | 0.0211 |
| No log | 2.0 | 780 | 13.8073 | 0.0156 | 8.9606 | 0.0190 |
| No log | 3.0 | 1170 | 13.4585 | 0.0312 | 11.2179 | 0.0221 |
| No log | 4.0 | 1560 | 11.7099 | 0.0625 | 14.1549 | 0.0137 |
| 0.7843 | 5.0 | 1950 | 9.2046 | 0.125 | 20.1830 | 0.0089 |
| 1.4287 | 6.0 | 2340 | 4.9232 | 0.25 | 29.3691 | 0.0543 |
| 4.054 | 7.0 | 2730 | 2.9184 | 0.5 | 46.5163 | 2.9863 |
| 3.1519 | 8.0 | 3120 | 2.5208 | 1.0 | 85.5254 | 4.8923 |
| 2.9129 | 9.0 | 3510 | 2.4149 | 1.0 | 86.1225 | 5.4005 |
| 2.7849 | 10.0 | 3900 | 2.3566 | 1.0 | 83.9788 | 5.6694 |
| 2.6958 | 11.0 | 4290 | 2.3041 | 1.0 | 82.6600 | 5.9535 |
| 2.5942 | 12.0 | 4680 | 2.2601 | 1.0 | 83.0294 | 6.2272 |
| 2.5384 | 13.0 | 5070 | 2.2363 | 1.0 | 82.6532 | 6.3636 |
| 2.4667 | 14.0 | 5460 | 2.2089 | 1.0 | 82.2912 | 6.5239 |
| 2.3524 | 15.0 | 5850 | 2.1965 | 1.0 | 82.5349 | 6.6398 |
| 2.3166 | 16.0 | 6240 | 2.1625 | 1.0 | 81.8239 | 6.7495 |
| 2.2953 | 17.0 | 6630 | 2.1555 | 1.0 | 81.9901 | 6.8783 |
| 2.2332 | 18.0 | 7020 | 2.1440 | 1.0 | 83.2302 | 7.0288 |
| 2.1893 | 19.0 | 7410 | 2.1239 | 1.0 | 83.5916 | 7.1203 |
| 2.1559 | 20.0 | 7800 | 2.1173 | 1.0 | 81.5685 | 7.1722 |
| 2.1161 | 21.0 | 8190 | 2.1141 | 1.0 | 81.6963 | 7.3281 |
| 2.043 | 22.0 | 8580 | 2.1068 | 1.0 | 82.8431 | 7.4121 |
| 2.0294 | 23.0 | 8970 | 2.1009 | 1.0 | 81.9595 | 7.4750 |
| 1.982 | 24.0 | 9360 | 2.0951 | 1.0 | 80.4355 | 7.5181 |
| 1.9439 | 25.0 | 9750 | 2.0982 | 1.0 | 81.8319 | 7.5364 |
| 1.9049 | 26.0 | 10140 | 2.1002 | 1.0 | 82.3029 | 7.6096 |
| 1.8619 | 27.0 | 10530 | 2.0860 | 1.0 | 81.8496 | 7.6852 |
| 1.8604 | 28.0 | 10920 | 2.0850 | 1.0 | 81.8561 | 7.7166 |
| 1.8383 | 29.0 | 11310 | 2.0821 | 1.0 | 81.0283 | 7.7468 |
| 1.796 | 30.0 | 11700 | 2.0823 | 1.0 | 82.0984 | 7.7893 |
| 1.7357 | 31.0 | 12090 | 2.0848 | 1.0 | 81.7833 | 7.8345 |
| 1.707 | 32.0 | 12480 | 2.0905 | 1.0 | 82.0953 | 7.8734 |
| 1.7016 | 33.0 | 12870 | 2.0901 | 1.0 | 81.3317 | 7.8459 |
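The Data Size column suggests a doubling curriculum: training starts on roughly 1/128 of the data at epoch 1 and the fraction doubles each epoch until the full set is reached at epoch 8. A sketch of that schedule, inferred from the table rather than from the actual training script:

```python
def data_fraction(epoch: int) -> float:
    """Fraction of the training set used at a given epoch, inferred
    from the Data Size column of the results table above (assumption:
    the fraction doubles per epoch, capped at the full dataset)."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))

# Matches the table: 0.0078, 0.0156, 0.0312, ..., 0.5, then 1.0 from epoch 8 on.
for epoch in range(9):
    print(epoch, data_fraction(epoch))
```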
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1