# 53206cd9f8a5c89a5c063d2df26e6587
This model is a fine-tuned version of google/mt5-small on the Helsinki-NLP/opus_books [de-nl] dataset. It achieves the following results on the evaluation set:
- Loss: 2.4075
- Data Size: 1.0 (fraction of the training set used)
- Epoch Runtime: 56.8618 seconds
- Bleu: 6.1217
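The BLEU values in this card are on the 0–100 scale used by common MT evaluation tooling (e.g. sacreBLEU), so the final score of 6.12 indicates weak but non-trivial translation quality. As a reference point, the core BLEU computation can be sketched as follows (a minimal sentence-level version of the standard algorithm, not the exact scorer used to produce these numbers):

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Multiset of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Sentence-level BLEU on the 0-1 scale (x100 for the scale used above)."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c, r = ngram_counts(cand, n), ngram_counts(ref, n)
        clipped = sum((c & r).values())  # n-gram matches, clipped by reference counts
        total = max(sum(c.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0  # any n-gram order with zero matches zeroes the geometric mean
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean
```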
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
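The total batch sizes follow directly from the per-device values and the 4-GPU data-parallel setup. A quick sanity check (gradient-accumulation steps are assumed to be 1, since the card does not list them):

```python
per_device_batch = 8   # train_batch_size / eval_batch_size from the card
num_devices = 4        # multi-GPU data parallelism
grad_accum_steps = 1   # assumed: not listed among the hyperparameters

# Effective batch size per optimizer step across all devices.
total_batch = per_device_batch * num_devices * grad_accum_steps
print(total_batch)  # 32, matching total_train_batch_size and total_eval_batch_size
```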
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 26.6082 | 0 | 5.3460 | 0.0037 |
| No log | 1.0 | 390 | 24.8507 | 0.0078 | 6.9534 | 0.0039 |
| No log | 2.0 | 780 | 21.8076 | 0.0156 | 6.6067 | 0.0060 |
| No log | 3.0 | 1170 | 18.9334 | 0.0312 | 8.1142 | 0.0033 |
| No log | 4.0 | 1560 | 14.8386 | 0.0625 | 9.2352 | 0.0070 |
| 1.0192 | 5.0 | 1950 | 9.1086 | 0.125 | 12.6142 | 0.0117 |
| 1.537 | 6.0 | 2340 | 5.9527 | 0.25 | 19.1072 | 0.0181 |
| 5.3284 | 7.0 | 2730 | 3.7119 | 0.5 | 31.2303 | 1.1817 |
| 4.326 | 8.0 | 3120 | 3.3122 | 1.0 | 56.5618 | 2.1861 |
| 3.9947 | 9.0 | 3510 | 3.1591 | 1.0 | 56.2062 | 2.6803 |
| 3.8043 | 10.0 | 3900 | 3.0513 | 1.0 | 56.8248 | 3.0562 |
| 3.6641 | 11.0 | 4290 | 2.9802 | 1.0 | 57.5752 | 3.4920 |
| 3.5628 | 12.0 | 4680 | 2.9200 | 1.0 | 57.0053 | 3.7333 |
| 3.5145 | 13.0 | 5070 | 2.8800 | 1.0 | 56.2891 | 3.9421 |
| 3.4306 | 14.0 | 5460 | 2.8305 | 1.0 | 56.1577 | 4.1149 |
| 3.3156 | 15.0 | 5850 | 2.7955 | 1.0 | 56.5411 | 4.2152 |
| 3.2754 | 16.0 | 6240 | 2.7686 | 1.0 | 54.6214 | 4.4542 |
| 3.253 | 17.0 | 6630 | 2.7444 | 1.0 | 55.0890 | 4.4858 |
| 3.1731 | 18.0 | 7020 | 2.7225 | 1.0 | 56.6047 | 4.5586 |
| 3.148 | 19.0 | 7410 | 2.6913 | 1.0 | 55.6237 | 4.6820 |
| 3.1067 | 20.0 | 7800 | 2.6790 | 1.0 | 55.5254 | 4.7541 |
| 3.0622 | 21.0 | 8190 | 2.6594 | 1.0 | 56.0049 | 4.8190 |
| 3.0174 | 22.0 | 8580 | 2.6329 | 1.0 | 56.9300 | 4.8789 |
| 3.0019 | 23.0 | 8970 | 2.6250 | 1.0 | 55.3320 | 4.9717 |
| 2.9484 | 24.0 | 9360 | 2.6063 | 1.0 | 55.1256 | 5.0273 |
| 2.9187 | 25.0 | 9750 | 2.5972 | 1.0 | 56.3781 | 5.0626 |
| 2.9095 | 26.0 | 10140 | 2.5853 | 1.0 | 56.1055 | 5.1179 |
| 2.8524 | 27.0 | 10530 | 2.5713 | 1.0 | 56.4429 | 5.2100 |
| 2.8516 | 28.0 | 10920 | 2.5601 | 1.0 | 55.4443 | 5.2673 |
| 2.8325 | 29.0 | 11310 | 2.5430 | 1.0 | 56.2263 | 5.3145 |
| 2.8002 | 30.0 | 11700 | 2.5319 | 1.0 | 56.3833 | 5.3782 |
| 2.735 | 31.0 | 12090 | 2.5281 | 1.0 | 56.4733 | 5.4505 |
| 2.7312 | 32.0 | 12480 | 2.5184 | 1.0 | 56.4822 | 5.4686 |
| 2.7051 | 33.0 | 12870 | 2.5092 | 1.0 | 56.9100 | 5.5074 |
| 2.6724 | 34.0 | 13260 | 2.5067 | 1.0 | 55.9261 | 5.5596 |
| 2.674 | 35.0 | 13650 | 2.4875 | 1.0 | 55.5195 | 5.6071 |
| 2.6742 | 36.0 | 14040 | 2.4842 | 1.0 | 56.7434 | 5.6734 |
| 2.6162 | 37.0 | 14430 | 2.4774 | 1.0 | 56.5541 | 5.7158 |
| 2.581 | 38.0 | 14820 | 2.4693 | 1.0 | 55.7454 | 5.7728 |
| 2.5671 | 39.0 | 15210 | 2.4665 | 1.0 | 56.4713 | 5.7896 |
| 2.5664 | 40.0 | 15600 | 2.4549 | 1.0 | 55.3665 | 5.8258 |
| 2.5588 | 41.0 | 15990 | 2.4533 | 1.0 | 56.0915 | 5.8644 |
| 2.545 | 42.0 | 16380 | 2.4522 | 1.0 | 56.1284 | 5.8635 |
| 2.5216 | 43.0 | 16770 | 2.4378 | 1.0 | 55.3636 | 5.9064 |
| 2.5006 | 44.0 | 17160 | 2.4335 | 1.0 | 58.8562 | 5.9244 |
| 2.4887 | 45.0 | 17550 | 2.4291 | 1.0 | 58.5788 | 5.9757 |
| 2.4745 | 46.0 | 17940 | 2.4264 | 1.0 | 58.2341 | 6.0034 |
| 2.4107 | 47.0 | 18330 | 2.4197 | 1.0 | 59.1548 | 6.0582 |
| 2.4954 | 48.0 | 18720 | 2.4161 | 1.0 | 58.2660 | 6.0710 |
| 2.4066 | 49.0 | 19110 | 2.4099 | 1.0 | 58.7295 | 6.0844 |
| 2.4026 | 50.0 | 19500 | 2.4075 | 1.0 | 56.8618 | 6.1217 |
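The Data Size column shows that training followed a data-scaling schedule: the fraction of the training set doubles each epoch, starting from 1/128, until the full dataset is reached at epoch 8. A hypothetical reconstruction of that schedule (the exact implementation is not published with this card):

```python
def data_fraction(epoch: int, full_at: int = 8) -> float:
    """Fraction of the training set used at a given epoch.

    Reconstructed from the Data Size column: the fraction doubles each
    epoch (1/128 at epoch 1) until it saturates at 1.0 at `full_at`.
    """
    if epoch <= 0:
        return 0.0  # the epoch-0 row is an evaluation of the untrained model
    return min(1.0, 2.0 ** (epoch - full_at))

# Matches the table: 0.0078, 0.0156, ..., 0.5, then 1.0 from epoch 8 onward.
schedule = [round(data_fraction(e), 4) for e in range(1, 9)]
```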
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
## Model tree for contemmcm/53206cd9f8a5c89a5c063d2df26e6587

Base model: google/mt5-small