eea42195f3d59ca4da361c5aaaccc2d5
This model is a fine-tuned version of google/mt5-small on the Helsinki-NLP/opus_books [en-it] dataset. It achieves the following results on the evaluation set:
- Loss: 2.0907
- Data Size: 1.0
- Epoch Runtime: 119.0935
- Bleu: 4.9392
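As a minimal sketch of how this checkpoint might be loaded for English-to-Italian translation with the Transformers library: the repository id below is taken from this page, but the card does not say whether a task prefix was used during fine-tuning, so the `prefix` parameter, the `build_input` and `translate` helper names, and the `max_new_tokens` value are all assumptions for illustration.

```python
# Hypothetical inference helper; not the author's documented usage.
MODEL_ID = "contemmcm/eea42195f3d59ca4da361c5aaaccc2d5"


def build_input(text: str, prefix: str = "") -> str:
    """Prepend an optional task prefix (assumption: the card does not specify one)."""
    return f"{prefix}{text.strip()}"


def translate(text: str, prefix: str = "", max_new_tokens: int = 64) -> str:
    """Translate one English sentence to Italian with the fine-tuned checkpoint."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_input(text, prefix), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the model weights on first use.
    print(translate("The book is on the table."))
```

Given the reported BLEU of 4.9392, output quality is likely to be limited, so results are best checked manually.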
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
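The listed values could map onto `Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction, not the original training script: `output_dir`, `eval_strategy`, and `predict_with_generate` are assumptions that do not appear in the list, and gradient accumulation is assumed to be 1 because 8 per device across 4 devices already gives the reported total of 32.

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments names.
HPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}
NUM_DEVICES = 4  # multi-GPU run, as listed above


def total_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Effective batch size per optimizer step (grad_accum=1 is an assumption)."""
    return per_device * num_devices * grad_accum


def build_training_args(output_dir: str = "mt5-small-opus-books-en-it"):
    """Build the arguments object; output_dir is a hypothetical path."""
    # Imported lazily so the arithmetic above works without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        eval_strategy="epoch",        # assumption: the results table has one row per epoch
        predict_with_generate=True,   # assumption: BLEU requires generated text
        **HPARAMS,
    )


if __name__ == "__main__":
    print(total_batch_size(HPARAMS["per_device_train_batch_size"], NUM_DEVICES))
```

The device count is not part of the arguments object; in a typical setup it would come from the launcher (for example `torchrun` or `accelerate launch`).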
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 26.7429 | 0 | 10.4804 | 0.0025 |
| No log | 1.0 | 808 | 23.2439 | 0.0078 | 11.5344 | 0.0026 |
| No log | 2.0 | 1616 | 16.3456 | 0.0156 | 12.8278 | 0.0035 |
| No log | 3.0 | 2424 | 11.1294 | 0.0312 | 15.0454 | 0.0080 |
| 0.5325 | 4.0 | 3232 | 7.4730 | 0.0625 | 18.0596 | 0.0126 |
| 8.1758 | 5.0 | 4040 | 5.1044 | 0.125 | 24.7052 | 0.0089 |
| 4.8679 | 6.0 | 4848 | 3.4269 | 0.25 | 38.8175 | 0.4411 |
| 4.0404 | 7.0 | 5656 | 3.1126 | 0.5 | 65.5406 | 0.9482 |
| 3.6995 | 8.0 | 6464 | 2.8786 | 1.0 | 122.8830 | 1.4482 |
| 3.4488 | 9.0 | 7272 | 2.7572 | 1.0 | 120.1845 | 1.8188 |
| 3.3188 | 10.0 | 8080 | 2.6761 | 1.0 | 119.4552 | 2.0977 |
| 3.218 | 11.0 | 8888 | 2.6154 | 1.0 | 118.6943 | 2.3441 |
| 3.0844 | 12.0 | 9696 | 2.5652 | 1.0 | 117.9412 | 2.5308 |
| 3.055 | 13.0 | 10504 | 2.5266 | 1.0 | 118.7930 | 2.7109 |
| 2.9957 | 14.0 | 11312 | 2.4910 | 1.0 | 117.7511 | 2.8970 |
| 2.9387 | 15.0 | 12120 | 2.4616 | 1.0 | 117.9348 | 3.0056 |
| 2.909 | 16.0 | 12928 | 2.4300 | 1.0 | 120.4783 | 3.1653 |
| 2.8459 | 17.0 | 13736 | 2.4048 | 1.0 | 119.8377 | 3.3012 |
| 2.7946 | 18.0 | 14544 | 2.3831 | 1.0 | 119.1125 | 3.3432 |
| 2.7558 | 19.0 | 15352 | 2.3660 | 1.0 | 118.0165 | 3.4575 |
| 2.7101 | 20.0 | 16160 | 2.3451 | 1.0 | 118.5927 | 3.5280 |
| 2.6871 | 21.0 | 16968 | 2.3291 | 1.0 | 117.7284 | 3.6822 |
| 2.6403 | 22.0 | 17776 | 2.3141 | 1.0 | 118.3463 | 3.7477 |
| 2.6353 | 23.0 | 18584 | 2.2937 | 1.0 | 118.7333 | 3.8137 |
| 2.5808 | 24.0 | 19392 | 2.2826 | 1.0 | 119.5167 | 3.8677 |
| 2.5392 | 25.0 | 20200 | 2.2667 | 1.0 | 120.4301 | 3.9240 |
| 2.5302 | 26.0 | 21008 | 2.2532 | 1.0 | 120.6806 | 3.9855 |
| 2.5152 | 27.0 | 21816 | 2.2484 | 1.0 | 118.2256 | 4.0550 |
| 2.4817 | 28.0 | 22624 | 2.2330 | 1.0 | 119.0125 | 4.0892 |
| 2.4721 | 29.0 | 23432 | 2.2280 | 1.0 | 118.4721 | 4.1392 |
| 2.4041 | 30.0 | 24240 | 2.2160 | 1.0 | 118.0716 | 4.1792 |
| 2.4295 | 31.0 | 25048 | 2.2086 | 1.0 | 119.0087 | 4.2651 |
| 2.4075 | 32.0 | 25856 | 2.1984 | 1.0 | 120.8782 | 4.2717 |
| 2.4039 | 33.0 | 26664 | 2.1881 | 1.0 | 120.7347 | 4.3501 |
| 2.3536 | 34.0 | 27472 | 2.1791 | 1.0 | 120.3007 | 4.3648 |
| 2.3221 | 35.0 | 28280 | 2.1699 | 1.0 | 119.2118 | 4.4064 |
| 2.3602 | 36.0 | 29088 | 2.1696 | 1.0 | 119.3953 | 4.4432 |
| 2.3203 | 37.0 | 29896 | 2.1568 | 1.0 | 118.4084 | 4.4994 |
| 2.301 | 38.0 | 30704 | 2.1498 | 1.0 | 118.5280 | 4.5255 |
| 2.2393 | 39.0 | 31512 | 2.1470 | 1.0 | 118.5955 | 4.5601 |
| 2.265 | 40.0 | 32320 | 2.1366 | 1.0 | 118.7773 | 4.5875 |
| 2.2429 | 41.0 | 33128 | 2.1345 | 1.0 | 119.2913 | 4.6291 |
| 2.2311 | 42.0 | 33936 | 2.1315 | 1.0 | 121.1980 | 4.6798 |
| 2.1934 | 43.0 | 34744 | 2.1214 | 1.0 | 119.8991 | 4.7206 |
| 2.1981 | 44.0 | 35552 | 2.1188 | 1.0 | 120.9426 | 4.7160 |
| 2.1768 | 45.0 | 36360 | 2.1094 | 1.0 | 119.6998 | 4.7537 |
| 2.2005 | 46.0 | 37168 | 2.1056 | 1.0 | 119.4823 | 4.8064 |
| 2.1601 | 47.0 | 37976 | 2.0986 | 1.0 | 118.5197 | 4.8277 |
| 2.1152 | 48.0 | 38784 | 2.0929 | 1.0 | 118.7602 | 4.8407 |
| 2.1164 | 49.0 | 39592 | 2.0906 | 1.0 | 119.1980 | 4.9085 |
| 2.117 | 50.0 | 40400 | 2.0907 | 1.0 | 119.0935 | 4.9392 |
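One way to read the validation loss column is as a perplexity. The sketch below assumes the logged loss is a mean per-token cross-entropy in nats, which is the usual convention for Transformers seq2seq models but is not stated in this card; the rows are copied from the table above and the `perplexity` helper is illustrative.

```python
import math

# (epoch, validation_loss, bleu) for selected rows of the table above.
ROWS = [
    (8, 2.8786, 1.4482),
    (20, 2.3451, 3.5280),
    (30, 2.2160, 4.1792),
    (40, 2.1366, 4.5875),
    (50, 2.0907, 4.9392),
]


def perplexity(loss: float) -> float:
    """exp(loss), assuming loss is mean per-token cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    for epoch, loss, bleu in ROWS:
        print(f"epoch {epoch:>2}: loss={loss:.4f}  ppl={perplexity(loss):.2f}  BLEU={bleu:.4f}")
```

Under that assumption the final loss of 2.0907 corresponds to a perplexity of roughly 8.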
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
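To compare a local environment against the versions listed above, a small check along these lines may help. It uses only the standard library; the `EXPECTED` mapping and the `compare_versions` helper are illustrative names, and the PyPI distribution names (`torch` for Pytorch) are the usual ones rather than something stated in this card.

```python
from importlib import metadata

# Versions copied from the list above, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.57.0",
    "torch": "2.8.0+cu128",
    "datasets": "4.2.0",
    "tokenizers": "0.22.1",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; it only indicates that the environment differs from the one used for training.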
Model tree for contemmcm/eea42195f3d59ca4da361c5aaaccc2d5
Base model
google/mt5-small