94fdaedfc9b48f0ed0b5bf29cb185c4e
This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [fi-fr] dataset. It achieves the following results on the evaluation set:
- Loss: 2.9088
- Data Size: 1.0
- Epoch Runtime: 16.3464
- Bleu: 2.6969
Model description
More information needed
Intended uses & limitations
More information needed
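Until this section is filled in, the following is a minimal sketch of how a checkpoint like this one could be loaded for inference with the Transformers seq2seq classes. The repository id is taken from this page. The helper name `translate`, the `max_new_tokens` value, and the assumption that plain source text with no task prefix is the expected input are all hypothetical, since the card does not document the preprocessing or the translation direction used during fine-tuning.

```python
# Repository id as shown on this model page.
MODEL_ID = "contemmcm/94fdaedfc9b48f0ed0b5bf29cb185c4e"


def translate(text: str, max_new_tokens: int = 128) -> str:
    """Hypothetical helper: run one sentence through the fine-tuned model.

    Assumption: the model takes raw source text with no task prefix.
    The card does not state the input format used during training.
    """
    # Imported lazily so that defining the helper does not require
    # transformers or torch to be installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Given the reported BLEU of 2.6969, output quality should be expected to be low, so this is better suited to inspection than to production use.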
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
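The relationship between the per-device and total batch sizes listed above can be checked with a few lines of arithmetic. This is a sketch that assumes no gradient accumulation (the card lists none); the variable names are illustrative, and `final_step` is the last Step value in the training results table.

```python
# Values copied from the hyperparameter list and the results table.
per_device_train_batch_size = 8
num_devices = 4
num_epochs = 50
final_step = 4400  # last "Step" entry in the training results

# Assumption: no gradient accumulation, so the global batch is simply
# the per-device batch size times the number of devices.
total_train_batch_size = per_device_train_batch_size * num_devices

# Optimizer steps logged per epoch, derived from the final step count.
steps_per_epoch = final_step // num_epochs

print(total_train_batch_size)  # 32
print(steps_per_epoch)         # 88
```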
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 17.5427 | 0 | 1.9032 | 0.0327 |
| No log | 1 | 88 | 17.6538 | 0.0078 | 2.3328 | 0.0366 |
| No log | 2 | 176 | 17.6828 | 0.0156 | 2.6445 | 0.0330 |
| No log | 3 | 264 | 17.1684 | 0.0312 | 3.2106 | 0.0415 |
| No log | 4 | 352 | 16.5233 | 0.0625 | 4.1401 | 0.0369 |
| No log | 5 | 440 | 15.7354 | 0.125 | 4.9029 | 0.0416 |
| 1.4026 | 6 | 528 | 13.0467 | 0.25 | 6.2121 | 0.0460 |
| 5.2705 | 7 | 616 | 9.4958 | 0.5 | 9.2868 | 0.0785 |
| 9.3402 | 8.0 | 704 | 6.3986 | 1.0 | 15.7509 | 0.1740 |
| 7.883 | 9.0 | 792 | 4.8116 | 1.0 | 15.0764 | 0.2808 |
| 6.3317 | 10.0 | 880 | 4.4173 | 1.0 | 15.4257 | 0.7106 |
| 5.6835 | 11.0 | 968 | 4.2185 | 1.0 | 15.6292 | 0.6120 |
| 5.2724 | 12.0 | 1056 | 4.0331 | 1.0 | 16.0407 | 0.6033 |
| 5.1159 | 13.0 | 1144 | 3.8686 | 1.0 | 15.7916 | 0.6826 |
| 4.8563 | 14.0 | 1232 | 3.6990 | 1.0 | 16.0089 | 0.8737 |
| 4.6141 | 15.0 | 1320 | 3.5557 | 1.0 | 16.7219 | 1.0237 |
| 4.4255 | 16.0 | 1408 | 3.4427 | 1.0 | 15.0475 | 1.2239 |
| 4.3427 | 17.0 | 1496 | 3.3700 | 1.0 | 16.3657 | 1.2493 |
| 4.213 | 18.0 | 1584 | 3.3003 | 1.0 | 16.0512 | 1.3956 |
| 4.1149 | 19.0 | 1672 | 3.2423 | 1.0 | 15.8517 | 1.5251 |
| 4.0094 | 20.0 | 1760 | 3.1971 | 1.0 | 15.9138 | 1.6057 |
| 3.9619 | 21.0 | 1848 | 3.1592 | 1.0 | 16.2662 | 1.6730 |
| 3.9209 | 22.0 | 1936 | 3.1399 | 1.0 | 16.9278 | 1.7254 |
| 3.8336 | 23.0 | 2024 | 3.1136 | 1.0 | 15.0629 | 1.8268 |
| 3.79 | 24.0 | 2112 | 3.0924 | 1.0 | 15.4468 | 1.9491 |
| 3.7128 | 25.0 | 2200 | 3.0796 | 1.0 | 16.4523 | 1.9698 |
| 3.6885 | 26.0 | 2288 | 3.0701 | 1.0 | 15.4163 | 2.0019 |
| 3.612 | 27.0 | 2376 | 3.0550 | 1.0 | 16.1792 | 2.0334 |
| 3.5716 | 28.0 | 2464 | 3.0425 | 1.0 | 15.9046 | 2.0828 |
| 3.5489 | 29.0 | 2552 | 3.0267 | 1.0 | 15.9448 | 2.1437 |
| 3.508 | 30.0 | 2640 | 3.0195 | 1.0 | 14.9253 | 2.1404 |
| 3.45 | 31.0 | 2728 | 3.0112 | 1.0 | 15.3964 | 2.1229 |
| 3.4475 | 32.0 | 2816 | 3.0026 | 1.0 | 15.4170 | 2.1722 |
| 3.4265 | 33.0 | 2904 | 2.9929 | 1.0 | 15.7752 | 2.2442 |
| 3.3774 | 34.0 | 2992 | 2.9869 | 1.0 | 15.7856 | 2.2355 |
| 3.3686 | 35.0 | 3080 | 2.9780 | 1.0 | 16.1228 | 2.2419 |
| 3.3166 | 36.0 | 3168 | 2.9766 | 1.0 | 16.1321 | 2.2914 |
| 3.3171 | 37.0 | 3256 | 2.9657 | 1.0 | 14.8927 | 2.3810 |
| 3.2881 | 38.0 | 3344 | 2.9582 | 1.0 | 15.4943 | 2.3324 |
| 3.2738 | 39.0 | 3432 | 2.9541 | 1.0 | 16.0475 | 2.4149 |
| 3.2355 | 40.0 | 3520 | 2.9469 | 1.0 | 15.4458 | 2.5267 |
| 3.1962 | 41.0 | 3608 | 2.9466 | 1.0 | 15.3254 | 2.5223 |
| 3.1724 | 42.0 | 3696 | 2.9407 | 1.0 | 15.8035 | 2.5439 |
| 3.1306 | 43.0 | 3784 | 2.9319 | 1.0 | 15.7676 | 2.5258 |
| 3.1571 | 44.0 | 3872 | 2.9286 | 1.0 | 15.8733 | 2.5348 |
| 3.1056 | 45.0 | 3960 | 2.9272 | 1.0 | 15.0118 | 2.5468 |
| 3.0972 | 46.0 | 4048 | 2.9254 | 1.0 | 15.4880 | 2.6190 |
| 3.0718 | 47.0 | 4136 | 2.9178 | 1.0 | 15.5234 | 2.6083 |
| 3.0491 | 48.0 | 4224 | 2.9140 | 1.0 | 15.7977 | 2.6580 |
| 3.041 | 49.0 | 4312 | 2.9099 | 1.0 | 16.3834 | 2.6792 |
| 3.0405 | 50.0 | 4400 | 2.9088 | 1.0 | 16.3464 | 2.6969 |
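The Data Size column in the table is consistent with a fraction that doubles each epoch until it reaches 1.0 at epoch 8. The sketch below checks that reading against the reported values. The function `data_fraction` and its `full_epoch` parameter are hypothetical names for illustration; the card does not describe how the schedule was actually implemented.

```python
# "Data Size" values for epochs 0 through 8, copied from the table.
reported = [0, 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0]


def data_fraction(epoch: int, full_epoch: int = 8) -> float:
    """Hypothetical schedule: 0 at epoch 0, then doubling up to 1.0."""
    if epoch <= 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - full_epoch))


for epoch, value in enumerate(reported):
    # The table rounds to four decimals, so compare with a small tolerance.
    assert abs(data_fraction(epoch) - value) < 1e-4, (epoch, value)
```

Under this reading, epochs 0 through 7 act as a warm-up on growing subsets, which would also explain the shorter Epoch Runtime values in those rows.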
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/94fdaedfc9b48f0ed0b5bf29cb185c4e
Base model: google/umt5-small