# c591ac3fb982261b29782cd25fe3c5a2
This model is a fine-tuned version of google/umt5-small on the `es-fr` subset of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 1.7078
- Data Size: 1.0
- Epoch Runtime: 216.9740
- Bleu: 12.6009
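Assuming the checkpoint is published under the repository id in the title and that the model translates in the es→fr direction (neither direction nor hub id is confirmed by the card), a minimal inference sketch with the Transformers translation pipeline might look like:

```python
from typing import Callable

def translate(texts, translator: Callable, max_length: int = 64):
    """Translate a batch of sentences.

    `translator` is any callable with the Hugging Face translation-pipeline
    interface: it returns a list of {"translation_text": ...} dicts.
    """
    outputs = translator(texts, max_length=max_length)
    return [o["translation_text"] for o in outputs]

if __name__ == "__main__":
    from transformers import pipeline

    # Hypothetical hub id; substitute the actual repository name.
    pipe = pipeline(
        "translation",
        model="contemmcm/c591ac3fb982261b29782cd25fe3c5a2",
        src_lang="es",
        tgt_lang="fr",
    )
    print(translate(["La vida es bella."], pipe))
```

The `translate` helper is a hypothetical wrapper added here for illustration; any callable with the pipeline's return shape works in its place.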
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
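The hyperparameters above map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a sketch, not the original training script: the output directory name and the `predict_with_generate` flag (needed for BLEU evaluation) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-small-opus-books-es-fr",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total batch size 32
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumed, required for BLEU
)
```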
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 14.9661 | 0 | 18.4328 | 0.1698 |
| No log | 1 | 1407 | 13.8678 | 0.0078 | 21.2159 | 0.1976 |
| No log | 2 | 2814 | 12.1745 | 0.0156 | 21.7496 | 0.1463 |
| 0.3826 | 3 | 4221 | 10.0626 | 0.0312 | 25.6574 | 0.1737 |
| 9.4091 | 4 | 5628 | 6.0042 | 0.0625 | 31.2974 | 0.4278 |
| 5.7989 | 5 | 7035 | 3.7447 | 0.125 | 43.6643 | 4.9009 |
| 4.5277 | 6 | 8442 | 3.1458 | 0.25 | 69.2407 | 3.8997 |
| 3.7289 | 7 | 9849 | 2.6678 | 0.5 | 117.4917 | 5.3978 |
| 3.2112 | 8.0 | 11256 | 2.4129 | 1.0 | 214.7005 | 6.6841 |
| 2.9436 | 9.0 | 12663 | 2.2953 | 1.0 | 216.1568 | 7.5042 |
| 2.8647 | 10.0 | 14070 | 2.2094 | 1.0 | 213.5292 | 8.1059 |
| 2.6919 | 11.0 | 15477 | 2.1563 | 1.0 | 212.4440 | 8.6020 |
| 2.6117 | 12.0 | 16884 | 2.1075 | 1.0 | 214.2269 | 8.9490 |
| 2.5448 | 13.0 | 18291 | 2.0759 | 1.0 | 213.5605 | 9.2504 |
| 2.5115 | 14.0 | 19698 | 2.0463 | 1.0 | 214.8060 | 9.4776 |
| 2.4257 | 15.0 | 21105 | 2.0095 | 1.0 | 214.7406 | 9.7150 |
| 2.3938 | 16.0 | 22512 | 1.9898 | 1.0 | 215.9306 | 9.8699 |
| 2.3467 | 17.0 | 23919 | 1.9705 | 1.0 | 215.7179 | 10.0840 |
| 2.298 | 18.0 | 25326 | 1.9575 | 1.0 | 215.5572 | 10.2474 |
| 2.2712 | 19.0 | 26733 | 1.9382 | 1.0 | 213.5135 | 10.3838 |
| 2.2264 | 20.0 | 28140 | 1.9135 | 1.0 | 214.5938 | 10.5263 |
| 2.1897 | 21.0 | 29547 | 1.8935 | 1.0 | 216.1186 | 10.6721 |
| 2.167 | 22.0 | 30954 | 1.8883 | 1.0 | 212.1273 | 10.7909 |
| 2.1503 | 23.0 | 32361 | 1.8746 | 1.0 | 217.0957 | 10.8991 |
| 2.0907 | 24.0 | 33768 | 1.8560 | 1.0 | 215.7370 | 11.0596 |
| 2.1052 | 25.0 | 35175 | 1.8475 | 1.0 | 216.2399 | 11.1471 |
| 2.0652 | 26.0 | 36582 | 1.8431 | 1.0 | 214.3941 | 11.2386 |
| 2.0244 | 27.0 | 37989 | 1.8248 | 1.0 | 216.7905 | 11.3224 |
| 2.0077 | 28.0 | 39396 | 1.8150 | 1.0 | 215.1501 | 11.4013 |
| 2.0417 | 29.0 | 40803 | 1.8087 | 1.0 | 214.4889 | 11.4483 |
| 2.008 | 30.0 | 42210 | 1.8023 | 1.0 | 213.1100 | 11.5161 |
| 1.9606 | 31.0 | 43617 | 1.7926 | 1.0 | 214.1516 | 11.6453 |
| 1.9298 | 32.0 | 45024 | 1.7977 | 1.0 | 214.0810 | 11.6786 |
| 1.938 | 33.0 | 46431 | 1.7829 | 1.0 | 223.3107 | 11.7729 |
| 1.8941 | 34.0 | 47838 | 1.7701 | 1.0 | 223.4652 | 11.8105 |
| 1.9073 | 35.0 | 49245 | 1.7751 | 1.0 | 221.6524 | 11.8768 |
| 1.8586 | 36.0 | 50652 | 1.7664 | 1.0 | 223.5310 | 11.9486 |
| 1.8678 | 37.0 | 52059 | 1.7554 | 1.0 | 217.4213 | 12.0003 |
| 1.8227 | 38.0 | 53466 | 1.7456 | 1.0 | 217.4962 | 12.0672 |
| 1.7791 | 39.0 | 54873 | 1.7452 | 1.0 | 220.3917 | 12.0985 |
| 1.8189 | 40.0 | 56280 | 1.7439 | 1.0 | 217.9907 | 12.1608 |
| 1.8328 | 41.0 | 57687 | 1.7462 | 1.0 | 217.8165 | 12.2203 |
| 1.8378 | 42.0 | 59094 | 1.7370 | 1.0 | 217.3545 | 12.3155 |
| 1.7768 | 43.0 | 60501 | 1.7386 | 1.0 | 219.9349 | 12.3127 |
| 1.752 | 44.0 | 61908 | 1.7253 | 1.0 | 219.1841 | 12.3925 |
| 1.7366 | 45.0 | 63315 | 1.7222 | 1.0 | 216.8871 | 12.4186 |
| 1.6971 | 46.0 | 64722 | 1.7219 | 1.0 | 216.8452 | 12.4796 |
| 1.7429 | 47.0 | 66129 | 1.7136 | 1.0 | 219.9515 | 12.5299 |
| 1.6928 | 48.0 | 67536 | 1.7111 | 1.0 | 217.4749 | 12.5543 |
| 1.6711 | 49.0 | 68943 | 1.7052 | 1.0 | 218.2828 | 12.5817 |
| 1.6935 | 50.0 | 70350 | 1.7078 | 1.0 | 216.9740 | 12.6009 |
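The Data Size column suggests a curriculum in which the training fraction doubles each epoch, starting from roughly 1/128 of the data, until the full dataset is reached at epoch 8. This is an inference from the logged values, not a documented training detail; a sketch of that schedule:

```python
def data_fraction(epoch: int) -> float:
    """Fraction of the training set used at a given epoch (epoch >= 1),
    inferred from the Data Size column: the fraction doubles each epoch,
    starting at 1/128, and saturates at 1.0 from epoch 8 onward."""
    return min(1.0, 2.0 ** (epoch - 8))

# First epochs match the logged values 0.0078, 0.0156, 0.0312, ...
schedule = [round(data_fraction(e), 4) for e in range(1, 10)]
print(schedule)  # [0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0, 1.0]
```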
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1