acbacdfa50c20b11ca4810175f066222
This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [en-es] dataset. It achieves the following results on the evaluation set:
- Loss: 2.2036
- Data Size: 1.0
- Epoch Runtime: 369.2799
- Bleu: 7.6026
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
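The totals in the list above follow from the per-device values and the device count. The sketch below reproduces that arithmetic and collects the settings under the keyword names used by Hugging Face `TrainingArguments`. The dictionary is an assumption about how such a run could be configured, not the author's actual training script, and `final_step` is taken from the last row of the results table. The per-epoch example count is nominal: it assumes every optimizer step consumes one full global batch.

```python
# Values copied from the hyperparameter list; nothing here is measured.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4
num_epochs = 50
final_step = 116800  # last entry in the Step column of the results table

# Global batch size = per-device batch size * number of devices.
total_train_batch_size = per_device_train_batch_size * num_devices
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Nominal optimizer steps per epoch and examples seen per full epoch.
steps_per_epoch = final_step // num_epochs
examples_per_epoch = steps_per_epoch * total_train_batch_size

# Assumed mapping onto TrainingArguments keyword names (illustrative only).
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": per_device_train_batch_size,
    "per_device_eval_batch_size": per_device_eval_batch_size,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": num_epochs,
}

print(total_train_batch_size, total_eval_batch_size)
print(steps_per_epoch, examples_per_epoch)
```

With a constant scheduler the learning rate stays at 5e-05 for all 50 epochs, so no warmup or decay settings appear in the dictionary.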
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 16.1008 | 0 | 30.7191 | 0.1376 |
| No log | 1 | 2336 | 15.4111 | 0.0078 | 33.9916 | 0.1382 |
| 0.2628 | 2 | 4672 | 11.6741 | 0.0156 | 36.9915 | 0.1409 |
| 0.3107 | 3 | 7008 | 7.5072 | 0.0312 | 42.3120 | 0.2862 |
| 6.9196 | 4 | 9344 | 4.6997 | 0.0625 | 53.1233 | 1.4737 |
| 5.3873 | 5 | 11680 | 3.9288 | 0.125 | 73.4840 | 2.1529 |
| 4.487 | 6 | 14016 | 3.4148 | 0.25 | 116.9824 | 2.5966 |
| 4.0544 | 7 | 16352 | 3.1414 | 0.5 | 199.2163 | 3.2885 |
| 3.6842 | 8.0 | 18688 | 2.9427 | 1.0 | 364.7632 | 4.0592 |
| 3.4452 | 9.0 | 21024 | 2.8255 | 1.0 | 364.2480 | 4.5240 |
| 3.3093 | 10.0 | 23360 | 2.7579 | 1.0 | 363.0636 | 4.8231 |
| 3.2422 | 11.0 | 25696 | 2.7009 | 1.0 | 364.7663 | 5.0897 |
| 3.1326 | 12.0 | 28032 | 2.6510 | 1.0 | 364.8785 | 5.3149 |
| 3.0436 | 13.0 | 30368 | 2.6195 | 1.0 | 362.5315 | 5.4792 |
| 3.0106 | 14.0 | 32704 | 2.5788 | 1.0 | 362.0513 | 5.6407 |
| 2.9365 | 15.0 | 35040 | 2.5521 | 1.0 | 362.4158 | 5.7836 |
| 2.9626 | 16.0 | 37376 | 2.5179 | 1.0 | 362.7865 | 5.9337 |
| 2.8353 | 17.0 | 39712 | 2.4982 | 1.0 | 365.3876 | 6.0555 |
| 2.8446 | 18.0 | 42048 | 2.4835 | 1.0 | 364.7898 | 6.1210 |
| 2.8111 | 19.0 | 44384 | 2.4641 | 1.0 | 371.6591 | 6.2001 |
| 2.7674 | 20.0 | 46720 | 2.4445 | 1.0 | 383.2359 | 6.2968 |
| 2.7096 | 21.0 | 49056 | 2.4256 | 1.0 | 385.3468 | 6.3968 |
| 2.6922 | 22.0 | 51392 | 2.4156 | 1.0 | 385.5504 | 6.4567 |
| 2.652 | 23.0 | 53728 | 2.3956 | 1.0 | 384.0397 | 6.5348 |
| 2.5982 | 24.0 | 56064 | 2.3923 | 1.0 | 381.7050 | 6.5862 |
| 2.6054 | 25.0 | 58400 | 2.3686 | 1.0 | 386.3455 | 6.6673 |
| 2.5553 | 26.0 | 60736 | 2.3560 | 1.0 | 383.9699 | 6.7479 |
| 2.5704 | 27.0 | 63072 | 2.3517 | 1.0 | 379.6998 | 6.7864 |
| 2.5291 | 28.0 | 65408 | 2.3349 | 1.0 | 385.1407 | 6.8487 |
| 2.495 | 29.0 | 67744 | 2.3311 | 1.0 | 380.7480 | 6.8744 |
| 2.5026 | 30.0 | 70080 | 2.3220 | 1.0 | 367.5770 | 6.9267 |
| 2.4608 | 31.0 | 72416 | 2.3145 | 1.0 | 363.9416 | 6.9729 |
| 2.4994 | 32.0 | 74752 | 2.2948 | 1.0 | 366.9948 | 7.0238 |
| 2.4301 | 33.0 | 77088 | 2.2908 | 1.0 | 366.0870 | 7.0714 |
| 2.4414 | 34.0 | 79424 | 2.2933 | 1.0 | 366.1123 | 7.1139 |
| 2.4087 | 35.0 | 81760 | 2.2787 | 1.0 | 364.9032 | 7.1458 |
| 2.4174 | 36.0 | 84096 | 2.2691 | 1.0 | 364.1023 | 7.1781 |
| 2.3635 | 37.0 | 86432 | 2.2636 | 1.0 | 362.4927 | 7.2447 |
| 2.3803 | 38.0 | 88768 | 2.2673 | 1.0 | 365.5410 | 7.2437 |
| 2.3771 | 39.0 | 91104 | 2.2526 | 1.0 | 366.7453 | 7.2992 |
| 2.365 | 40.0 | 93440 | 2.2541 | 1.0 | 364.3133 | 7.3154 |
| 2.339 | 41.0 | 95776 | 2.2431 | 1.0 | 366.3519 | 7.3661 |
| 2.3013 | 42.0 | 98112 | 2.2383 | 1.0 | 370.7902 | 7.3864 |
| 2.2845 | 43.0 | 100448 | 2.2416 | 1.0 | 368.4315 | 7.4176 |
| 2.2383 | 44.0 | 102784 | 2.2310 | 1.0 | 365.7991 | 7.4575 |
| 2.2968 | 45.0 | 105120 | 2.2163 | 1.0 | 368.7937 | 7.4876 |
| 2.2331 | 46.0 | 107456 | 2.2186 | 1.0 | 376.4598 | 7.5047 |
| 2.2592 | 47.0 | 109792 | 2.2192 | 1.0 | 368.2389 | 7.5199 |
| 2.2147 | 48.0 | 112128 | 2.2200 | 1.0 | 370.3951 | 7.5408 |
| 2.2242 | 49.0 | 114464 | 2.2059 | 1.0 | 373.3496 | 7.5755 |
| 2.2101 | 50.0 | 116800 | 2.2036 | 1.0 | 369.2799 | 7.6026 |
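The Data Size column appears to double each epoch until it reaches 1.0 at epoch 8, which would also explain why Epoch Runtime grows over the first eight rows and then levels off. The sketch below checks that reading against the reported values. The function `data_fraction` and its `full_epoch` parameter are hypothetical names introduced for illustration; the doubling rule is inferred from the table and is not documented by the model card.

```python
def data_fraction(epoch: int, full_epoch: int = 8) -> float:
    """Assumed schedule: 0 before training, then doubling up to 1.0."""
    if epoch <= 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - full_epoch))


# Data Size values copied from the first rows of the results table.
reported = {
    0: 0.0, 1: 0.0078, 2: 0.0156, 3: 0.0312, 4: 0.0625,
    5: 0.125, 6: 0.25, 7: 0.5, 8: 1.0, 9: 1.0,
}

# The table rounds to four decimals, so compare with a small tolerance.
matches = all(
    abs(data_fraction(epoch) - value) < 1e-4
    for epoch, value in reported.items()
)
print(matches)
```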
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1