a93eba97dfea3ff692f3ee62a5a4873a
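This model is a fine-tuned version of google/umt5-small on the it-ru (Italian-Russian) configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set (these match the final epoch, 50, in the training results table):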
- Loss: 1.8385
- Data Size: 1.0
- Epoch Runtime: 71.3281
- Bleu: 9.6363
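For readers who want to try the checkpoint, the following is a minimal inference sketch, not a documented recipe. It assumes the model is published on the Hugging Face Hub as `contemmcm/a93eba97dfea3ff692f3ee62a5a4873a`, that `transformers` and `torch` are installed, that the translation direction is Italian to Russian (suggested by the `it-ru` label but not stated on the card), and that the model takes raw source text with no task prefix. The input format used during fine-tuning is not documented, so the last point in particular is a guess. The helper name `translate` and the generation settings are illustrative.

```python
MODEL_ID = "contemmcm/a93eba97dfea3ff692f3ee62a5a4873a"  # assumed Hub repository ID


def translate(texts, model_id=MODEL_ID, max_new_tokens=128, num_beams=4):
    """Translate a list of Italian sentences to Russian (hypothetical helper)."""
    # Imported lazily so this module loads even without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: inputs are plain source sentences without a task prefix.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `translate(["Buongiorno, come stai?"])` would download the weights on first use and return a list containing one decoded string. A BLEU score below 10 generally indicates low translation quality, so the output is best treated as a rough draft.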
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
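The batch-size and step figures in this card are related by simple arithmetic, sketched below. The per-device batch size, device count, and epoch count come from the list above; the 447 steps per epoch is read off the Step column of the training results table. The sketch assumes no gradient accumulation, since none is listed.

```python
# Values copied from the hyperparameter list and the training results table.
train_batch_size = 8      # per-device batch size
num_devices = 4
num_epochs = 50
steps_per_epoch = 447     # the Step column advances by 447 per epoch

# Assumption: no gradient accumulation, since none is listed.
total_train_batch_size = train_batch_size * num_devices
total_steps = steps_per_epoch * num_epochs

# Upper bound on examples seen per full epoch (the last batch may be partial).
max_examples_per_epoch = steps_per_epoch * total_train_batch_size

print(total_train_batch_size)   # 32
print(total_steps)              # 22350
print(max_examples_per_epoch)   # 14304
```

The first two results agree with the reported total_train_batch_size of 32 and the final step count of 22350.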
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 17.1086 | 0 | 6.4889 | 0.2764 |
| No log | 1.0 | 447 | 16.7062 | 0.0078 | 8.1216 | 0.2387 |
| 0.3244 | 2.0 | 894 | 15.9956 | 0.0156 | 7.7222 | 0.2189 |
| 0.4401 | 3.0 | 1341 | 14.9017 | 0.0312 | 8.9240 | 0.2174 |
| 0.6801 | 4.0 | 1788 | 11.0771 | 0.0625 | 11.1158 | 0.2432 |
| 1.0294 | 5.0 | 2235 | 7.5604 | 0.125 | 15.3990 | 0.6508 |
| 8.6763 | 6.0 | 2682 | 4.8568 | 0.25 | 23.1476 | 1.4250 |
| 5.4588 | 7.0 | 3129 | 3.7579 | 0.5 | 38.1228 | 4.0730 |
| 4.2962 | 8.0 | 3576 | 2.9575 | 1.0 | 70.9136 | 3.4979 |
| 3.7957 | 9.0 | 4023 | 2.6555 | 1.0 | 69.6879 | 4.5745 |
| 3.5378 | 10.0 | 4470 | 2.5277 | 1.0 | 70.0899 | 5.2329 |
| 3.3401 | 11.0 | 4917 | 2.4344 | 1.0 | 70.4763 | 5.6474 |
| 3.2244 | 12.0 | 5364 | 2.3658 | 1.0 | 70.6551 | 5.9999 |
| 3.0924 | 13.0 | 5811 | 2.3067 | 1.0 | 70.2377 | 6.2892 |
| 2.9826 | 14.0 | 6258 | 2.2672 | 1.0 | 71.6834 | 6.5737 |
| 2.9535 | 15.0 | 6705 | 2.2242 | 1.0 | 70.8144 | 6.7462 |
| 2.8689 | 16.0 | 7152 | 2.1933 | 1.0 | 72.8303 | 6.9514 |
| 2.8202 | 17.0 | 7599 | 2.1639 | 1.0 | 70.4897 | 7.1483 |
| 2.7643 | 18.0 | 8046 | 2.1387 | 1.0 | 71.7494 | 7.2870 |
| 2.7199 | 19.0 | 8493 | 2.1163 | 1.0 | 70.9859 | 7.3945 |
| 2.6807 | 20.0 | 8940 | 2.0955 | 1.0 | 73.8687 | 7.5529 |
| 2.6525 | 21.0 | 9387 | 2.0845 | 1.0 | 71.6488 | 7.6848 |
| 2.5591 | 22.0 | 9834 | 2.0597 | 1.0 | 71.9993 | 7.8096 |
| 2.5191 | 23.0 | 10281 | 2.0422 | 1.0 | 70.3985 | 7.9218 |
| 2.4782 | 24.0 | 10728 | 2.0310 | 1.0 | 71.3682 | 8.0161 |
| 2.4537 | 25.0 | 11175 | 2.0124 | 1.0 | 70.6184 | 8.1291 |
| 2.4013 | 26.0 | 11622 | 2.0065 | 1.0 | 71.0686 | 8.1886 |
| 2.444 | 27.0 | 12069 | 1.9869 | 1.0 | 70.3656 | 8.3094 |
| 2.3569 | 28.0 | 12516 | 1.9811 | 1.0 | 71.5175 | 8.3481 |
| 2.303 | 29.0 | 12963 | 1.9685 | 1.0 | 70.8783 | 8.4590 |
| 2.2919 | 30.0 | 13410 | 1.9608 | 1.0 | 72.1665 | 8.4929 |
| 2.283 | 31.0 | 13857 | 1.9447 | 1.0 | 70.0651 | 8.6057 |
| 2.2257 | 32.0 | 14304 | 1.9400 | 1.0 | 71.6964 | 8.6556 |
| 2.2569 | 33.0 | 14751 | 1.9354 | 1.0 | 71.7775 | 8.7164 |
| 2.2044 | 34.0 | 15198 | 1.9189 | 1.0 | 73.0560 | 8.8147 |
| 2.168 | 35.0 | 15645 | 1.9167 | 1.0 | 71.1399 | 8.8828 |
| 2.1329 | 36.0 | 16092 | 1.9045 | 1.0 | 71.0777 | 8.9717 |
| 2.0839 | 37.0 | 16539 | 1.9046 | 1.0 | 71.6059 | 8.9904 |
| 2.1337 | 38.0 | 16986 | 1.8902 | 1.0 | 70.9227 | 9.0529 |
| 2.0959 | 39.0 | 17433 | 1.8842 | 1.0 | 70.6274 | 9.1009 |
| 2.0324 | 40.0 | 17880 | 1.8749 | 1.0 | 70.5578 | 9.1943 |
| 2.0207 | 41.0 | 18327 | 1.8735 | 1.0 | 70.8847 | 9.2158 |
| 1.9929 | 42.0 | 18774 | 1.8672 | 1.0 | 71.3545 | 9.3092 |
| 2.029 | 43.0 | 19221 | 1.8667 | 1.0 | 71.9536 | 9.3463 |
| 1.9606 | 44.0 | 19668 | 1.8621 | 1.0 | 70.7463 | 9.3929 |
| 1.9386 | 45.0 | 20115 | 1.8553 | 1.0 | 70.4038 | 9.4295 |
| 1.9425 | 46.0 | 20562 | 1.8481 | 1.0 | 70.7218 | 9.4897 |
| 1.963 | 47.0 | 21009 | 1.8429 | 1.0 | 71.7010 | 9.5780 |
| 1.9085 | 48.0 | 21456 | 1.8461 | 1.0 | 70.6225 | 9.5692 |
| 1.9333 | 49.0 | 21903 | 1.8318 | 1.0 | 70.7106 | 9.6188 |
| 1.8609 | 50.0 | 22350 | 1.8385 | 1.0 | 71.3281 | 9.6363 |
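The Data Size column in the first rows looks like a doubling ramp: each logged value matches a power of two between 1/128 and 1, after which the column stays at 1.0. This reading is inferred from the logged numbers and is not stated on the card. A short sketch that checks the column against that assumption (the function name `data_size_schedule` is hypothetical):

```python
def data_size_schedule(num_doublings=7):
    """Return fractions 0, 1/2**n, ..., 1/2, 1 (assumed doubling ramp)."""
    ramp = [2.0 ** -k for k in range(num_doublings, -1, -1)]
    return [0.0] + ramp


# Values logged in the Data Size column for epochs 0 through 8.
logged = [0, 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0]

schedule = data_size_schedule()
for epoch, (expected, seen) in enumerate(zip(schedule, logged)):
    # The table is rounded to four decimals, so compare with a tolerance.
    assert abs(expected - seen) < 1e-4, (epoch, expected, seen)
    print(f"epoch {epoch}: {expected:.7f} (logged {seen})")
```

If this reading is right, the rows before epoch 8 were trained on a fraction of the data, so their losses and BLEU scores are not directly comparable with the later rows.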
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/a93eba97dfea3ff692f3ee62a5a4873a
- Base model: google/umt5-small