# 520b14835eced208a78ef9e8f2f99d7a
This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [fr-it] dataset. It achieves the following results on the evaluation set:
- Loss: 2.7937
- Data Size: 1.0
- Epoch Runtime: 58.5370
- Bleu: 5.6361
## Model description
More information needed
## Intended uses & limitations
More information needed
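A minimal usage sketch for French-to-Italian translation with this checkpoint (the repository id is taken from the Hub page; the function name is illustrative, and depending on how the training inputs were formatted, a task prefix may be needed):

```python
def translate_fr_to_it(text: str) -> str:
    """Translate a French sentence to Italian with the fine-tuned checkpoint."""
    # Lazy imports so the sketch can be read without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model_id = "contemmcm/520b14835eced208a78ef9e8f2f99d7a"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Given the modest BLEU score (5.64), outputs should be treated as a baseline rather than production-quality translations.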
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
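The total batch sizes above follow from the per-device batch size multiplied by the device count; a quick check of the arithmetic:

```python
train_batch_size = 8  # per device
eval_batch_size = 8   # per device
num_devices = 4       # multi-GPU setup listed above

total_train_batch_size = train_batch_size * num_devices  # 32
total_eval_batch_size = eval_batch_size * num_devices    # 32
```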
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 17.0743 | 0 | 5.5982 | 0.2047 |
| No log | 1.0 | 367 | 16.4893 | 0.0078 | 6.0438 | 0.2076 |
| No log | 2.0 | 734 | 15.6521 | 0.0156 | 6.8942 | 0.2252 |
| No log | 3.0 | 1101 | 14.1151 | 0.0312 | 8.2651 | 0.2289 |
| No log | 4.0 | 1468 | 11.6662 | 0.0625 | 9.6650 | 0.2161 |
| 0.7599 | 5.0 | 1835 | 8.2824 | 0.125 | 12.6107 | 0.2983 |
| 8.6202 | 6.0 | 2202 | 5.2964 | 0.25 | 19.1086 | 0.7094 |
| 6.2023 | 7.0 | 2569 | 4.3212 | 0.5 | 31.9172 | 2.9934 |
| 5.0974 | 8.0 | 2936 | 3.8220 | 1.0 | 57.6145 | 2.1554 |
| 4.6036 | 9.0 | 3303 | 3.5523 | 1.0 | 57.8272 | 2.6853 |
| 4.3248 | 10.0 | 3670 | 3.4210 | 1.0 | 59.7893 | 3.0457 |
| 4.1866 | 11.0 | 4037 | 3.3420 | 1.0 | 58.6285 | 3.2571 |
| 4.0657 | 12.0 | 4404 | 3.2866 | 1.0 | 57.6160 | 3.4933 |
| 4.0059 | 13.0 | 4771 | 3.2401 | 1.0 | 58.4404 | 3.6619 |
| 3.8718 | 14.0 | 5138 | 3.2015 | 1.0 | 57.9625 | 3.8115 |
| 3.8261 | 15.0 | 5505 | 3.1692 | 1.0 | 58.4455 | 3.9516 |
| 3.7285 | 16.0 | 5872 | 3.1352 | 1.0 | 59.0941 | 4.0884 |
| 3.6943 | 17.0 | 6239 | 3.1219 | 1.0 | 58.3599 | 4.1323 |
| 3.6541 | 18.0 | 6606 | 3.0931 | 1.0 | 59.2689 | 4.2474 |
| 3.6291 | 19.0 | 6973 | 3.0716 | 1.0 | 59.4495 | 4.3364 |
| 3.5636 | 20.0 | 7340 | 3.0412 | 1.0 | 58.0654 | 4.4187 |
| 3.5061 | 21.0 | 7707 | 3.0389 | 1.0 | 57.7992 | 4.4748 |
| 3.4734 | 22.0 | 8074 | 3.0219 | 1.0 | 57.9885 | 4.5529 |
| 3.4102 | 23.0 | 8441 | 3.0044 | 1.0 | 58.9918 | 4.6240 |
| 3.3814 | 24.0 | 8808 | 2.9803 | 1.0 | 59.7393 | 4.7050 |
| 3.3919 | 25.0 | 9175 | 2.9830 | 1.0 | 58.7152 | 4.7763 |
| 3.2983 | 26.0 | 9542 | 2.9674 | 1.0 | 58.7490 | 4.8029 |
| 3.2863 | 27.0 | 9909 | 2.9622 | 1.0 | 59.5547 | 4.8354 |
| 3.2594 | 28.0 | 10276 | 2.9418 | 1.0 | 59.9988 | 4.8734 |
| 3.2504 | 29.0 | 10643 | 2.9263 | 1.0 | 59.5368 | 4.9443 |
| 3.2417 | 30.0 | 11010 | 2.9301 | 1.0 | 58.6146 | 4.9528 |
| 3.1811 | 31.0 | 11377 | 2.9141 | 1.0 | 58.3594 | 5.0152 |
| 3.1766 | 32.0 | 11744 | 2.9059 | 1.0 | 58.9723 | 5.0389 |
| 3.1313 | 33.0 | 12111 | 2.8911 | 1.0 | 58.4011 | 5.0914 |
| 3.1139 | 34.0 | 12478 | 2.8956 | 1.0 | 59.8377 | 5.1396 |
| 3.0672 | 35.0 | 12845 | 2.8871 | 1.0 | 59.5639 | 5.1947 |
| 3.0823 | 36.0 | 13212 | 2.8757 | 1.0 | 58.4524 | 5.2140 |
| 3.049 | 37.0 | 13579 | 2.8646 | 1.0 | 58.3569 | 5.2484 |
| 3.0445 | 38.0 | 13946 | 2.8729 | 1.0 | 58.3784 | 5.3178 |
| 3.0336 | 39.0 | 14313 | 2.8476 | 1.0 | 59.2942 | 5.3525 |
| 2.9965 | 40.0 | 14680 | 2.8526 | 1.0 | 58.9237 | 5.3645 |
| 2.9969 | 41.0 | 15047 | 2.8385 | 1.0 | 58.7512 | 5.3852 |
| 2.9535 | 42.0 | 15414 | 2.8423 | 1.0 | 58.8361 | 5.4229 |
| 2.932 | 43.0 | 15781 | 2.8336 | 1.0 | 58.5391 | 5.4622 |
| 2.923 | 44.0 | 16148 | 2.8279 | 1.0 | 59.8299 | 5.4951 |
| 2.9285 | 45.0 | 16515 | 2.8244 | 1.0 | 59.5493 | 5.5138 |
| 2.9258 | 46.0 | 16882 | 2.8144 | 1.0 | 58.9045 | 5.5251 |
| 2.8831 | 47.0 | 17249 | 2.8164 | 1.0 | 59.3035 | 5.5434 |
| 2.8739 | 48.0 | 17616 | 2.8114 | 1.0 | 58.5114 | 5.6106 |
| 2.8258 | 49.0 | 17983 | 2.8138 | 1.0 | 60.7856 | 5.6095 |
| 2.8659 | 50.0 | 18350 | 2.7937 | 1.0 | 58.5370 | 5.6361 |
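The Data Size column appears to follow a doubling warm-up schedule: epoch 0 is an evaluation-only pass, epoch 1 uses 1/128 of the training set, and the fraction doubles each epoch until the full dataset is reached at epoch 8. A sketch reproducing those fractions (inferred from the table, not from the training code):

```python
def data_fraction(epoch: int, start: float = 1 / 128) -> float:
    """Inferred curriculum: epoch 0 is evaluation-only, then the training
    fraction doubles each epoch until the full dataset is used."""
    if epoch == 0:
        return 0.0
    return min(1.0, start * 2 ** (epoch - 1))

# Reproduces the column values 0.0078, 0.0156, ..., 0.5, 1.0
fractions = [data_fraction(e) for e in range(9)]
```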
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1