# 495d083553ea26e0b3e72790ac20f2d9
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fr-ru] dataset. It achieves the following results on the evaluation set:
- Loss: 2.2566
- Data Size: 1.0
- Epoch Runtime: 49.6563
- BLEU: 6.0688
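The BLEU scores above and in the results table below are on the usual 0–100 scale. The exact scorer used to produce them is not documented in this card; purely as a reminder of what the metric measures, here is a minimal pure-Python sketch of single-reference, sentence-level BLEU (clipped n-gram precision with a brevity penalty), not the corpus-level implementation behind these figures:

```python
import math
from collections import Counter

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Single-reference sentence BLEU with uniform n-gram weights,
    a brevity penalty, and no smoothing (any zero precision -> 0)."""
    cand, ref = candidate.split(), reference.split()
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        matched = sum((cand_ngrams & ref_ngrams).values())  # clipped counts
        total = max(sum(cand_ngrams.values()), 1)
        if matched == 0:
            return 0.0
        log_precision += math.log(matched / total) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return 100.0 * bp * math.exp(log_precision)
```

Real evaluations add corpus-level statistics, smoothing, and standardized tokenization (e.g. sacrebleu), all of which this sketch omits.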
## Model description
More information needed
## Intended uses & limitations
More information needed
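No usage details are documented, but since this is a umt5-base seq2seq checkpoint fine-tuned for fr→ru, a minimal inference sketch can be inferred. The repo id is the one this card belongs to; whether a task prefix was applied during preprocessing is not documented, so the source sentence is passed as-is (an assumption), and `num_beams`/`max_new_tokens` are illustrative choices:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/495d083553ea26e0b3e72790ac20f2d9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# French input, passed without any task prefix (assumption: none was
# used during fine-tuning; this is not stated in the card).
inputs = tokenizer("Bonjour, comment allez-vous ?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
```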
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
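The derived totals in the list follow directly from the per-device settings and can be sanity-checked with plain arithmetic; the steps-per-epoch figure below is read off the results table (204 optimizer steps per epoch), so the implied full training-set size is only approximate:

```python
per_device_train_batch = 8   # train_batch_size
per_device_eval_batch = 8    # eval_batch_size
num_devices = 4              # distributed_type: multi-GPU, 4 devices

total_train_batch = per_device_train_batch * num_devices  # total_train_batch_size
total_eval_batch = per_device_eval_batch * num_devices    # total_eval_batch_size

# The results table advances 204 steps per epoch, so at data size 1.0 the
# sampled training set is roughly 204 * 32 examples.
steps_per_epoch = 204
approx_train_examples = steps_per_epoch * total_train_batch

print(total_train_batch, total_eval_batch, approx_train_examples)  # -> 32 32 6528
```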
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.6452 | 0 | 4.4647 | 0.1107 |
| No log | 1 | 204 | 11.4048 | 0.0078 | 5.7337 | 0.1174 |
| No log | 2 | 408 | 11.0628 | 0.0156 | 6.4351 | 0.1286 |
| No log | 3 | 612 | 10.9644 | 0.0312 | 7.8922 | 0.0921 |
| No log | 4 | 816 | 10.7670 | 0.0625 | 9.8150 | 0.1224 |
| No log | 5 | 1020 | 9.8876 | 0.125 | 12.7489 | 0.1472 |
| 1.187 | 6 | 1224 | 7.7013 | 0.25 | 16.7270 | 0.2060 |
| 7.6984 | 7 | 1428 | 4.7565 | 0.5 | 29.1805 | 0.9289 |
| 4.6198 | 8 | 1632 | 3.0309 | 1.0 | 49.1980 | 2.6826 |
| 3.783 | 9 | 1836 | 2.7047 | 1.0 | 50.0262 | 3.2899 |
| 3.5023 | 10 | 2040 | 2.5827 | 1.0 | 49.3992 | 3.8769 |
| 3.2628 | 11 | 2244 | 2.5031 | 1.0 | 49.0810 | 4.1747 |
| 3.0815 | 12 | 2448 | 2.4465 | 1.0 | 48.8125 | 4.3110 |
| 2.9673 | 13 | 2652 | 2.3992 | 1.0 | 49.0398 | 4.5926 |
| 2.8597 | 14 | 2856 | 2.3832 | 1.0 | 48.0334 | 4.6512 |
| 2.7586 | 15 | 3060 | 2.3553 | 1.0 | 49.4713 | 4.8019 |
| 2.6723 | 16 | 3264 | 2.3256 | 1.0 | 48.0683 | 4.9708 |
| 2.614 | 17 | 3468 | 2.3105 | 1.0 | 49.5206 | 5.0827 |
| 2.5792 | 18 | 3672 | 2.2991 | 1.0 | 48.5818 | 5.1917 |
| 2.4923 | 19 | 3876 | 2.2872 | 1.0 | 50.5722 | 5.2880 |
| 2.4196 | 20 | 4080 | 2.2745 | 1.0 | 48.4871 | 5.4170 |
| 2.3663 | 21 | 4284 | 2.2646 | 1.0 | 49.0477 | 5.5240 |
| 2.2997 | 22 | 4488 | 2.2606 | 1.0 | 51.1142 | 5.5374 |
| 2.266 | 23 | 4692 | 2.2522 | 1.0 | 48.3450 | 5.5826 |
| 2.2254 | 24 | 4896 | 2.2511 | 1.0 | 49.7791 | 5.6360 |
| 2.1871 | 25 | 5100 | 2.2488 | 1.0 | 49.8734 | 5.6304 |
| 2.14 | 26 | 5304 | 2.2435 | 1.0 | 50.2850 | 5.6992 |
| 2.0762 | 27 | 5508 | 2.2396 | 1.0 | 50.0892 | 5.7920 |
| 2.0515 | 28 | 5712 | 2.2392 | 1.0 | 50.0697 | 5.8732 |
| 1.981 | 29 | 5916 | 2.2455 | 1.0 | 49.3189 | 5.8811 |
| 1.9533 | 30 | 6120 | 2.2473 | 1.0 | 50.7425 | 5.9021 |
| 1.954 | 31 | 6324 | 2.2450 | 1.0 | 48.4094 | 6.0477 |
| 1.8985 | 32 | 6528 | 2.2566 | 1.0 | 49.6563 | 6.0688 |
### Framework versions

- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/495d083553ea26e0b3e72790ac20f2d9:

- Base model: google/umt5-base