53a0590d75db293170dd4d8b60a5d42f
This model is a fine-tuned version of google/mt5-small on the es-ru (Spanish-Russian) subset of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 2.2613
- Data Size: 1.0
- Epoch Runtime: 66.5602
- Bleu: 4.2831
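The evaluation loss can also be read as a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for this kind of training run but is not stated in this card.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 2.2613

# Assumption: the loss is the mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```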
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
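The derived values in this list follow from simple arithmetic, sketched below. It assumes no gradient accumulation (none is listed) and takes the 419 steps per epoch from the step column of the results table; the variable names are illustrative only.

```python
# Values copied from the hyperparameter list and the results table.
train_batch_size = 8    # per-device batch size
num_devices = 4
num_epochs = 50
steps_per_epoch = 419   # step count after epoch 1 in the results table

# Assumption: no gradient accumulation, so the effective batch size is
# the per-device batch size times the number of devices.
total_train_batch_size = train_batch_size * num_devices
total_steps = steps_per_epoch * num_epochs

print(total_train_batch_size)  # 32, matches total_train_batch_size
print(total_steps)             # 20950, matches the final step in the table
```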
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 26.0882 | 0 | 5.7279 | 0.0038 |
| No log | 1 | 419 | 23.4322 | 0.0078 | 6.3451 | 0.0044 |
| No log | 2 | 838 | 19.9518 | 0.0156 | 6.9912 | 0.0045 |
| 0.6263 | 3 | 1257 | 14.4388 | 0.0312 | 8.2809 | 0.0056 |
| 0.6263 | 4 | 1676 | 10.3959 | 0.0625 | 10.2136 | 0.0076 |
| 0.9689 | 5 | 2095 | 6.8759 | 0.125 | 13.8411 | 0.0135 |
| 0.9182 | 6 | 2514 | 4.4206 | 0.25 | 21.4609 | 0.0131 |
| 4.5654 | 7 | 2933 | 3.2744 | 0.5 | 34.5283 | 0.8224 |
| 3.8563 | 8.0 | 3352 | 2.9757 | 1.0 | 62.8554 | 1.2310 |
| 3.6633 | 9.0 | 3771 | 2.8524 | 1.0 | 61.6147 | 1.4882 |
| 3.5277 | 10.0 | 4190 | 2.7693 | 1.0 | 64.6321 | 1.7471 |
| 3.3524 | 11.0 | 4609 | 2.7129 | 1.0 | 62.0201 | 1.9156 |
| 3.312 | 12.0 | 5028 | 2.6630 | 1.0 | 62.0960 | 2.0774 |
| 3.2036 | 13.0 | 5447 | 2.6264 | 1.0 | 63.2139 | 2.2092 |
| 3.195 | 14.0 | 5866 | 2.5942 | 1.0 | 63.6137 | 2.3323 |
| 3.0695 | 15.0 | 6285 | 2.5639 | 1.0 | 66.2644 | 2.4058 |
| 3.0676 | 16.0 | 6704 | 2.5396 | 1.0 | 63.6661 | 2.4857 |
| 3.0128 | 17.0 | 7123 | 2.5184 | 1.0 | 63.6598 | 2.5677 |
| 2.9509 | 18.0 | 7542 | 2.4970 | 1.0 | 63.0249 | 2.6038 |
| 2.919 | 19.0 | 7961 | 2.4749 | 1.0 | 63.8457 | 2.7655 |
| 2.8584 | 20.0 | 8380 | 2.4589 | 1.0 | 62.7789 | 2.8061 |
| 2.8583 | 21.0 | 8799 | 2.4449 | 1.0 | 63.3639 | 2.8779 |
| 2.8146 | 22.0 | 9218 | 2.4299 | 1.0 | 62.3622 | 2.9702 |
| 2.8195 | 23.0 | 9637 | 2.4227 | 1.0 | 63.2174 | 2.9814 |
| 2.7433 | 24.0 | 10056 | 2.4108 | 1.0 | 63.2408 | 3.0786 |
| 2.7329 | 25.0 | 10475 | 2.4014 | 1.0 | 63.0987 | 3.1725 |
| 2.685 | 26.0 | 10894 | 2.3860 | 1.0 | 61.7609 | 3.2540 |
| 2.683 | 27.0 | 11313 | 2.3750 | 1.0 | 63.5046 | 3.3158 |
| 2.6763 | 28.0 | 11732 | 2.3670 | 1.0 | 63.4249 | 3.4072 |
| 2.6408 | 29.0 | 12151 | 2.3599 | 1.0 | 62.8431 | 3.4535 |
| 2.6533 | 30.0 | 12570 | 2.3509 | 1.0 | 66.6050 | 3.4628 |
| 2.5966 | 31.0 | 12989 | 2.3453 | 1.0 | 66.0938 | 3.5460 |
| 2.5748 | 32.0 | 13408 | 2.3354 | 1.0 | 66.2647 | 3.5959 |
| 2.5456 | 33.0 | 13827 | 2.3327 | 1.0 | 64.5781 | 3.6653 |
| 2.5099 | 34.0 | 14246 | 2.3268 | 1.0 | 64.9895 | 3.6784 |
| 2.5427 | 35.0 | 14665 | 2.3130 | 1.0 | 64.4841 | 3.7323 |
| 2.4921 | 36.0 | 15084 | 2.3115 | 1.0 | 64.6163 | 3.8141 |
| 2.4378 | 37.0 | 15503 | 2.3093 | 1.0 | 65.5945 | 3.8321 |
| 2.4358 | 38.0 | 15922 | 2.3043 | 1.0 | 65.0992 | 3.9432 |
| 2.435 | 39.0 | 16341 | 2.2956 | 1.0 | 65.5985 | 3.9072 |
| 2.4036 | 40.0 | 16760 | 2.2981 | 1.0 | 65.2673 | 3.9699 |
| 2.4083 | 41.0 | 17179 | 2.2889 | 1.0 | 66.2053 | 3.9823 |
| 2.3789 | 42.0 | 17598 | 2.2859 | 1.0 | 65.5683 | 4.0351 |
| 2.391 | 43.0 | 18017 | 2.2806 | 1.0 | 66.7639 | 4.2193 |
| 2.3584 | 44.0 | 18436 | 2.2779 | 1.0 | 65.9479 | 4.0464 |
| 2.3229 | 45.0 | 18855 | 2.2772 | 1.0 | 65.8073 | 4.1574 |
| 2.3252 | 46.0 | 19274 | 2.2743 | 1.0 | 66.2474 | 4.1240 |
| 2.3353 | 47.0 | 19693 | 2.2706 | 1.0 | 65.2274 | 4.2012 |
| 2.2846 | 48.0 | 20112 | 2.2657 | 1.0 | 66.5786 | 4.1928 |
| 2.309 | 49.0 | 20531 | 2.2618 | 1.0 | 65.3962 | 4.2712 |
| 2.2756 | 50.0 | 20950 | 2.2613 | 1.0 | 66.5602 | 4.2831 |
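The Data Size column appears to double each epoch until it reaches 1.0 at epoch 8 (0.0078 is about 1/128, 0.0156 about 1/64, and so on). The sketch below reproduces that pattern. The function `data_fraction` is hypothetical, inferred from the table alone, and is not the author's training code.

```python
def data_fraction(epoch: int, full_epoch: int = 8) -> float:
    """Fraction of the training data used at a given epoch.

    Hypothetical schedule inferred from the results table: the fraction
    doubles every epoch and is capped at 1.0 from `full_epoch` onward.
    """
    if epoch <= 0:
        return 0.0  # epoch 0 is the evaluation before any training
    return min(1.0, 2.0 ** (epoch - full_epoch))


for epoch in range(0, 10):
    print(epoch, data_fraction(epoch))
```

Under this reading, the first seven epochs act as a short warm-up on growing subsets, which is consistent with the epoch runtime rising from about 6 to about 63 over the same range.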
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/53a0590d75db293170dd4d8b60a5d42f
Base model
google/mt5-small