# 5161859226dc3ff6ffdf0fbcb1427ed5
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [es-no] dataset. It achieves the following results on the evaluation set:
- Loss: 2.6158
- Data Size: 1.0
- Epoch Runtime: 25.0782
- Bleu: 5.5614
## Model description
More information needed
## Intended uses & limitations
More information needed
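Since this section is unfilled, the following is only a minimal usage sketch. The repo id is taken from the card's model tree; the `translation_es_to_no` task alias is an assumption about how the checkpoint was fine-tuned (es→no, per the dataset pair), not something the card confirms.

```python
# Hedged usage sketch: translating Spanish to Norwegian with this checkpoint.
# MODEL_ID comes from the card; the task framing is an assumption.
MODEL_ID = "contemmcm/5161859226dc3ff6ffdf0fbcb1427ed5"

if __name__ == "__main__":
    # Requires `pip install transformers sentencepiece`; downloads the model.
    from transformers import pipeline

    translator = pipeline("translation_es_to_no", model=MODEL_ID)
    print(translator("La casa está junto al mar.")[0]["translation_text"])
```

Given the modest final BLEU (~5.6), outputs should be treated as experimental rather than production-quality translations.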
## Training and evaluation data
More information needed
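The card only names the dataset, so this is a hedged sketch of how it could be loaded. The `"es-no"` config name is taken from the card's header; the split handling and row layout follow the usual `opus_books` schema, which is an assumption here.

```python
# Hedged sketch: loading the dataset named in this card.
DATASET_ID = "Helsinki-NLP/opus_books"
CONFIG = "es-no"  # language-pair config, as stated in the card header

if __name__ == "__main__":
    # Requires `pip install datasets`; downloads the data from the Hub.
    from datasets import load_dataset

    ds = load_dataset(DATASET_ID, CONFIG, split="train")
    # opus_books rows typically look like {"translation": {"es": ..., "no": ...}}
    print(ds[0]["translation"]["es"])
```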
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
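The hyperparameters above can be restated as plain Python to make the effective batch sizes explicit. Variable names mirror transformers' `Seq2SeqTrainingArguments`, but this is a sketch, not the exact training script.

```python
# Hedged reconstruction of the listed training configuration.
learning_rate = 5e-05            # constant schedule (no warmup/decay listed)
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
seed = 42
num_devices = 4                  # multi-GPU data parallelism: one process per device
num_train_epochs = 50            # the results table stops at epoch 33

# With data-parallel training and no gradient accumulation listed, the
# effective batch size is the per-device batch size times the device count.
total_train_batch_size = per_device_train_batch_size * num_devices
total_eval_batch_size = per_device_eval_batch_size * num_devices
```

This matches the total_train_batch_size and total_eval_batch_size of 32 reported above.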
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 13.4736 | 0 | 2.4774 | 0.0490 |
| No log | 1 | 89 | 13.4180 | 0.0078 | 3.0967 | 0.0449 |
| No log | 2 | 178 | 13.4598 | 0.0156 | 3.4171 | 0.0455 |
| No log | 3 | 267 | 13.5142 | 0.0312 | 4.7365 | 0.0499 |
| No log | 4 | 356 | 12.4614 | 0.0625 | 6.2108 | 0.0586 |
| No log | 5 | 445 | 12.2575 | 0.125 | 7.7312 | 0.0415 |
| 1.2122 | 6 | 534 | 11.8622 | 0.25 | 11.3834 | 0.0697 |
| 5.6492 | 7 | 623 | 10.4287 | 0.5 | 15.5494 | 0.0763 |
| 9.7418 | 8 | 712 | 5.9765 | 1.0 | 25.2768 | 0.3937 |
| 5.6579 | 9 | 801 | 3.8992 | 1.0 | 23.8734 | 3.0899 |
| 5.0175 | 10 | 890 | 3.2819 | 1.0 | 25.0241 | 2.2056 |
| 4.2889 | 11 | 979 | 3.0612 | 1.0 | 26.2374 | 3.1157 |
| 3.8859 | 12 | 1068 | 2.9289 | 1.0 | 24.4179 | 3.5846 |
| 3.6268 | 13 | 1157 | 2.8582 | 1.0 | 24.0828 | 3.9279 |
| 3.531 | 14 | 1246 | 2.8099 | 1.0 | 24.2889 | 4.1136 |
| 3.3838 | 15 | 1335 | 2.7777 | 1.0 | 23.4939 | 4.3126 |
| 3.2203 | 16 | 1424 | 2.7359 | 1.0 | 24.5260 | 4.5416 |
| 3.1463 | 17 | 1513 | 2.7232 | 1.0 | 24.1557 | 4.4583 |
| 3.037 | 18 | 1602 | 2.6943 | 1.0 | 23.9128 | 4.6938 |
| 2.9699 | 19 | 1691 | 2.6611 | 1.0 | 23.5548 | 4.7951 |
| 2.897 | 20 | 1780 | 2.6484 | 1.0 | 23.8592 | 4.9567 |
| 2.7972 | 21 | 1869 | 2.6338 | 1.0 | 24.4856 | 5.1006 |
| 2.745 | 22 | 1958 | 2.6370 | 1.0 | 24.3166 | 5.2325 |
| 2.7041 | 23 | 2047 | 2.6280 | 1.0 | 23.9898 | 5.1784 |
| 2.6238 | 24 | 2136 | 2.6149 | 1.0 | 24.5929 | 5.3658 |
| 2.5732 | 25 | 2225 | 2.6170 | 1.0 | 24.2029 | 5.4342 |
| 2.5119 | 26 | 2314 | 2.6038 | 1.0 | 24.9785 | 5.3348 |
| 2.467 | 27 | 2403 | 2.6113 | 1.0 | 25.0814 | 5.4837 |
| 2.4478 | 28 | 2492 | 2.6164 | 1.0 | 23.2993 | 5.4099 |
| 2.3915 | 29 | 2581 | 2.5982 | 1.0 | 23.9281 | 5.5693 |
| 2.333 | 30 | 2670 | 2.6042 | 1.0 | 24.8486 | 5.5377 |
| 2.2747 | 31 | 2759 | 2.6044 | 1.0 | 26.2227 | 5.5426 |
| 2.2381 | 32 | 2848 | 2.6109 | 1.0 | 23.9335 | 5.5353 |
| 2.2022 | 33 | 2937 | 2.6158 | 1.0 | 25.0782 | 5.5614 |
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
## Model tree for contemmcm/5161859226dc3ff6ffdf0fbcb1427ed5

Base model: google/umt5-base