# 0e61d7a8bf16a3497486736747a532fa
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-fr] dataset. It achieves the following results on the evaluation set:
- Loss: 1.3967
- Data Size: 1.0
- Epoch Runtime: 730.0780
- Bleu: 13.8607
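The Bleu figure above is a corpus-level translation-quality score. As a rough illustration of what the metric measures (this is a minimal sentence-level sketch in plain Python with add-one smoothing, not the exact scorer used during evaluation, which is typically sacrebleu):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU-4 sketch: smoothed modified n-gram precision
    (uniform weights) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c_ngrams = ngrams(cand, n)
        r_ngrams = ngrams(ref, n)
        # clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference
        overlap = sum(min(c, r_ngrams[g]) for g, c in c_ngrams.items())
        total = max(sum(c_ngrams.values()), 1)
        # add-one smoothing so one empty n-gram order doesn't zero the score
        log_prec += math.log((overlap + 1) / (total + 1))
    # brevity penalty: punish candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))
    return bp * math.exp(log_prec / max_n)
```

An identical candidate and reference score 1.0; disjoint token sequences score near zero.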
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.0050 | 0 | 57.2280 | 0.0913 |
| No log | 1 | 3177 | 9.7755 | 0.0078 | 62.6163 | 0.0603 |
| 0.2428 | 2 | 6354 | 6.4851 | 0.0156 | 69.6654 | 0.4655 |
| 6.1305 | 3 | 9531 | 3.7118 | 0.0312 | 80.4218 | 4.9143 |
| 3.8491 | 4 | 12708 | 2.4897 | 0.0625 | 100.3120 | 7.8902 |
| 3.0563 | 5 | 15885 | 2.2146 | 0.125 | 143.0110 | 6.6310 |
| 2.6465 | 6 | 19062 | 2.0177 | 0.25 | 226.6017 | 7.7631 |
| 2.3886 | 7 | 22239 | 1.8728 | 0.5 | 394.2835 | 8.8893 |
| 2.1613 | 8.0 | 25416 | 1.7347 | 1.0 | 725.5487 | 10.0712 |
| 1.9867 | 9.0 | 28593 | 1.6550 | 1.0 | 725.4036 | 10.7525 |
| 1.9215 | 10.0 | 31770 | 1.6002 | 1.0 | 728.3040 | 11.1860 |
| 1.8169 | 11.0 | 34947 | 1.5578 | 1.0 | 730.4529 | 11.5433 |
| 1.7411 | 12.0 | 38124 | 1.5310 | 1.0 | 729.5908 | 11.8498 |
| 1.6666 | 13.0 | 41301 | 1.5063 | 1.0 | 726.8605 | 12.0989 |
| 1.6302 | 14.0 | 44478 | 1.4838 | 1.0 | 726.3137 | 12.2911 |
| 1.5987 | 15.0 | 47655 | 1.4699 | 1.0 | 723.5461 | 12.4585 |
| 1.52 | 16.0 | 50832 | 1.4574 | 1.0 | 723.2799 | 12.6016 |
| 1.5011 | 17.0 | 54009 | 1.4438 | 1.0 | 723.0908 | 12.7511 |
| 1.4794 | 18.0 | 57186 | 1.4335 | 1.0 | 723.4092 | 12.8899 |
| 1.4367 | 19.0 | 60363 | 1.4250 | 1.0 | 722.8614 | 12.9570 |
| 1.3848 | 20.0 | 63540 | 1.4155 | 1.0 | 724.5252 | 13.0640 |
| 1.3722 | 21.0 | 66717 | 1.4153 | 1.0 | 728.9433 | 13.1557 |
| 1.3477 | 22.0 | 69894 | 1.4033 | 1.0 | 729.3175 | 13.2482 |
| 1.3196 | 23.0 | 73071 | 1.4030 | 1.0 | 723.1824 | 13.3230 |
| 1.3292 | 24.0 | 76248 | 1.3959 | 1.0 | 729.4897 | 13.3777 |
| 1.2922 | 25.0 | 79425 | 1.3921 | 1.0 | 726.9376 | 13.4284 |
| 1.2686 | 26.0 | 82602 | 1.3853 | 1.0 | 729.3467 | 13.5032 |
| 1.2393 | 27.0 | 85779 | 1.3907 | 1.0 | 726.2296 | 13.5276 |
| 1.2307 | 28.0 | 88956 | 1.3850 | 1.0 | 727.6741 | 13.6117 |
| 1.2041 | 29.0 | 92133 | 1.3881 | 1.0 | 726.9224 | 13.6822 |
| 1.1862 | 30.0 | 95310 | 1.3891 | 1.0 | 726.4434 | 13.6620 |
| 1.1582 | 31.0 | 98487 | 1.3991 | 1.0 | 726.7570 | 13.7280 |
| 1.1476 | 32.0 | 101664 | 1.3815 | 1.0 | 727.6579 | 13.7293 |
| 1.1168 | 33.0 | 104841 | 1.3921 | 1.0 | 725.2406 | 13.7680 |
| 1.1249 | 34.0 | 108018 | 1.3928 | 1.0 | 725.8319 | 13.8611 |
| 1.0888 | 35.0 | 111195 | 1.3979 | 1.0 | 728.4666 | 13.8371 |
| 1.0546 | 36.0 | 114372 | 1.3967 | 1.0 | 730.0780 | 13.8607 |
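The Data Size column suggests a doubling curriculum: training starts on roughly 1/128 of the dataset at epoch 1 and the fraction doubles each epoch until the full dataset is reached at epoch 8. A sketch of that schedule, assuming exact powers of two (the card does not state the schedule explicitly):

```python
def data_fraction(epoch, full_epoch=8):
    """Fraction of the training set used at a given epoch, assuming a
    doubling schedule that reaches the full dataset at `full_epoch`."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - full_epoch))

# Reproduces the Data Size column: 0.0078, 0.0156, ..., 0.5, 1.0, 1.0, ...
print([round(data_fraction(e), 4) for e in range(1, 10)])
```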
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1