# 2b0c94697808ec3de2d5f1b05bf05849
This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [de-pt] dataset. It achieves the following results on the evaluation set:
- Loss: 3.1001
- Data Size: 1.0
- Epoch Runtime: 6.2271
- Bleu: 4.1387
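Assuming the checkpoint is published under this card's repository id (an assumption; substitute a local path or the actual Hub id if it differs), inference could look like the following sketch:

```python
# Hedged sketch: load the fine-tuned checkpoint and translate one sentence.
# The repo id below is assumed from this card's name, not confirmed.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "contemmcm/2b0c94697808ec3de2d5f1b05bf05849"  # assumed repo id


def translate(text: str, max_new_tokens: int = 64) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# print(translate("Der Hund schläft im Garten."))
```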
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 16.2689 | 0 | 1.1554 | 0.4356 |
| No log | 1 | 27 | 15.9789 | 0.0078 | 1.4234 | 0.3459 |
| No log | 2 | 54 | 15.8901 | 0.0156 | 1.8919 | 0.3310 |
| No log | 3 | 81 | 15.7364 | 0.0312 | 2.2802 | 0.5092 |
| No log | 4 | 108 | 15.5191 | 0.0625 | 2.2502 | 0.3992 |
| No log | 5 | 135 | 15.1460 | 0.125 | 2.5648 | 0.3839 |
| No log | 6 | 162 | 14.4777 | 0.25 | 3.2894 | 0.3199 |
| No log | 7 | 189 | 13.1401 | 0.5 | 4.6359 | 0.3334 |
| 3.644 | 8.0 | 216 | 11.0998 | 1.0 | 7.0281 | 0.3472 |
| 3.644 | 9.0 | 243 | 9.3550 | 1.0 | 6.8696 | 0.3811 |
| 13.0011 | 10.0 | 270 | 8.0799 | 1.0 | 6.7935 | 0.3923 |
| 13.0011 | 11.0 | 297 | 7.1932 | 1.0 | 7.0847 | 0.2627 |
| 9.796 | 12.0 | 324 | 6.5080 | 1.0 | 7.0138 | 0.3121 |
| 7.9769 | 13.0 | 351 | 5.7633 | 1.0 | 7.6384 | 0.6407 |
| 7.9769 | 14.0 | 378 | 5.0105 | 1.0 | 7.3221 | 0.9045 |
| 6.9403 | 15.0 | 405 | 4.7115 | 1.0 | 5.4665 | 1.2322 |
| 6.9403 | 16.0 | 432 | 4.5795 | 1.0 | 5.9862 | 2.2563 |
| 6.2452 | 17.0 | 459 | 4.4623 | 1.0 | 6.2203 | 2.7302 |
| 6.2452 | 18.0 | 486 | 4.3420 | 1.0 | 6.2215 | 3.2068 |
| 5.7946 | 19.0 | 513 | 4.2593 | 1.0 | 6.3992 | 3.2921 |
| 5.7946 | 20.0 | 540 | 4.1774 | 1.0 | 7.2450 | 3.2180 |
| 5.491 | 21.0 | 567 | 4.0912 | 1.0 | 6.7042 | 3.3344 |
| 5.491 | 22.0 | 594 | 3.9983 | 1.0 | 6.7033 | 1.6987 |
| 5.2207 | 23.0 | 621 | 3.9346 | 1.0 | 7.0602 | 1.3817 |
| 5.2207 | 24.0 | 648 | 3.8750 | 1.0 | 7.1286 | 1.3053 |
| 5.0132 | 25.0 | 675 | 3.8021 | 1.0 | 7.0340 | 1.3828 |
| 4.8139 | 26.0 | 702 | 3.7475 | 1.0 | 7.4569 | 1.4485 |
| 4.8139 | 27.0 | 729 | 3.6965 | 1.0 | 7.4954 | 1.5240 |
| 4.6791 | 28.0 | 756 | 3.6504 | 1.0 | 7.3945 | 1.3944 |
| 4.6791 | 29.0 | 783 | 3.5977 | 1.0 | 7.9859 | 1.1471 |
| 4.524 | 30.0 | 810 | 3.5524 | 1.0 | 5.7725 | 1.0695 |
| 4.524 | 31.0 | 837 | 3.5003 | 1.0 | 6.2132 | 1.0906 |
| 4.3751 | 32.0 | 864 | 3.4662 | 1.0 | 6.2070 | 1.1368 |
| 4.3751 | 33.0 | 891 | 3.4301 | 1.0 | 6.0684 | 1.1827 |
| 4.2701 | 34.0 | 918 | 3.3905 | 1.0 | 6.2441 | 1.2379 |
| 4.2701 | 35.0 | 945 | 3.3597 | 1.0 | 6.5993 | 1.2587 |
| 4.1623 | 36.0 | 972 | 3.3262 | 1.0 | 6.9220 | 1.4276 |
| 4.1623 | 37.0 | 999 | 3.3008 | 1.0 | 6.9956 | 4.1191 |
| 4.0752 | 38.0 | 1026 | 3.2733 | 1.0 | 6.9088 | 4.6422 |
| 3.9817 | 39.0 | 1053 | 3.2551 | 1.0 | 7.2268 | 4.2248 |
| 3.9817 | 40.0 | 1080 | 3.2284 | 1.0 | 7.5810 | 3.8563 |
| 3.9155 | 41.0 | 1107 | 3.2186 | 1.0 | 7.8967 | 3.9903 |
| 3.9155 | 42.0 | 1134 | 3.2020 | 1.0 | 7.5043 | 3.8938 |
| 3.8444 | 43.0 | 1161 | 3.1908 | 1.0 | 7.5884 | 3.9440 |
| 3.8444 | 44.0 | 1188 | 3.1702 | 1.0 | 8.4088 | 3.9587 |
| 3.7853 | 45.0 | 1215 | 3.1521 | 1.0 | 5.7945 | 4.0313 |
| 3.7853 | 46.0 | 1242 | 3.1432 | 1.0 | 5.7534 | 4.0525 |
| 3.7083 | 47.0 | 1269 | 3.1282 | 1.0 | 5.7461 | 4.0782 |
| 3.7083 | 48.0 | 1296 | 3.1237 | 1.0 | 5.8734 | 4.1569 |
| 3.6513 | 49.0 | 1323 | 3.1040 | 1.0 | 6.1703 | 4.1645 |
| 3.5786 | 50.0 | 1350 | 3.1001 | 1.0 | 6.2271 | 4.1387 |
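For reference, the Bleu column measures corpus-level BLEU, which can be sketched with only the standard library as below. The training pipeline's own metric (commonly sacrebleu) tokenizes and smooths differently, so exact numbers will not match.

```python
# Minimal corpus-level BLEU: clipped n-gram precisions up to 4-grams,
# geometric mean, and a brevity penalty. Illustration only.
import math
from collections import Counter


def _ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    log_precisions = []
    for n in range(1, max_n + 1):
        matched, total = 0, 0
        for hyp, ref in zip(hypotheses, references):
            hyp_counts = _ngrams(hyp.split(), n)
            ref_counts = _ngrams(ref.split(), n)
            matched += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total += sum(hyp_counts.values())
        if matched == 0:
            return 0.0  # no smoothing: any empty precision zeroes the score
        log_precisions.append(math.log(matched / total))
    hyp_len = sum(len(h.split()) for h in hypotheses)
    ref_len = sum(len(r.split()) for r in references)
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / max(hyp_len, 1))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```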
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1