efa1a362d0a41e141591c0a81db31a63
This model is a fine-tuned version of google/mt5-small on the Helsinki-NLP/opus_books [fr-ru] dataset. It achieves the following results on the evaluation set:
- Loss: 2.5158
- Data Size: 1.0
- Epoch Runtime: 30.4151
- Bleu: 3.5464
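The reported loss can also be read as a perplexity. The sketch below assumes the loss is the mean per-token cross-entropy in nats. That is the usual Trainer convention, but the card does not state it.

```python
import math

# Evaluation loss reported above. Assumption: mean per-token
# cross-entropy measured in nats (not confirmed by the card).
eval_loss = 2.5158

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```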
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
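The total batch sizes listed above follow from the per-device sizes and the device count. A minimal sketch of that arithmetic, where `grad_accum_steps` is a hypothetical parameter assumed to be 1 because the card lists no gradient accumulation:

```python
def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective batch size per optimizer step across all devices.

    grad_accum_steps is an assumption (default 1); the card does not list it.
    """
    return per_device * num_devices * grad_accum_steps


# Values taken from the hyperparameter list above.
total_train = total_batch_size(per_device=8, num_devices=4)
total_eval = total_batch_size(per_device=8, num_devices=4)

print(total_train, total_eval)
```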
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 27.8653 | 0 | 3.2901 | 0.0054 |
| No log | 1 | 204 | 27.8858 | 0.0078 | 3.5448 | 0.0052 |
| No log | 2 | 408 | 25.7717 | 0.0156 | 4.5152 | 0.0062 |
| No log | 3 | 612 | 24.1957 | 0.0312 | 5.0243 | 0.0057 |
| No log | 4 | 816 | 18.9655 | 0.0625 | 5.8786 | 0.0074 |
| No log | 5 | 1020 | 12.8708 | 0.125 | 7.4486 | 0.0133 |
| 1.54 | 6 | 1224 | 8.4704 | 0.25 | 10.8623 | 0.0169 |
| 8.8986 | 7 | 1428 | 5.0593 | 0.5 | 17.2985 | 0.0230 |
| 5.0016 | 8.0 | 1632 | 3.5442 | 1.0 | 30.8860 | 0.3609 |
| 4.4604 | 9.0 | 1836 | 3.2882 | 1.0 | 31.3445 | 0.7791 |
| 4.2289 | 10.0 | 2040 | 3.1574 | 1.0 | 30.3314 | 1.0805 |
| 4.0285 | 11.0 | 2244 | 3.0705 | 1.0 | 30.9519 | 1.2622 |
| 3.8953 | 12.0 | 2448 | 3.0100 | 1.0 | 31.2718 | 1.3434 |
| 3.7873 | 13.0 | 2652 | 2.9589 | 1.0 | 30.4523 | 1.5086 |
| 3.7034 | 14.0 | 2856 | 2.9106 | 1.0 | 30.2343 | 1.6267 |
| 3.6363 | 15.0 | 3060 | 2.8808 | 1.0 | 30.6904 | 1.7854 |
| 3.5499 | 16.0 | 3264 | 2.8463 | 1.0 | 31.3114 | 1.8462 |
| 3.4958 | 17.0 | 3468 | 2.8198 | 1.0 | 30.8767 | 1.9228 |
| 3.493 | 18.0 | 3672 | 2.7970 | 1.0 | 31.2220 | 1.9446 |
| 3.4138 | 19.0 | 3876 | 2.7729 | 1.0 | 31.0873 | 2.0582 |
| 3.3651 | 20.0 | 4080 | 2.7533 | 1.0 | 31.4935 | 2.1085 |
| 3.3197 | 21.0 | 4284 | 2.7403 | 1.0 | 32.4768 | 2.1687 |
| 3.2594 | 22.0 | 4488 | 2.7233 | 1.0 | 31.7985 | 2.2121 |
| 3.2295 | 23.0 | 4692 | 2.7061 | 1.0 | 31.3834 | 2.2961 |
| 3.1994 | 24.0 | 4896 | 2.6899 | 1.0 | 30.6403 | 2.3484 |
| 3.1855 | 25.0 | 5100 | 2.6790 | 1.0 | 31.0177 | 2.3975 |
| 3.1176 | 26.0 | 5304 | 2.6684 | 1.0 | 31.9839 | 2.4435 |
| 3.0866 | 27.0 | 5508 | 2.6551 | 1.0 | 30.7518 | 2.4625 |
| 3.0626 | 28.0 | 5712 | 2.6443 | 1.0 | 30.7101 | 2.5293 |
| 3.012 | 29.0 | 5916 | 2.6326 | 1.0 | 30.4698 | 2.5482 |
| 3.0006 | 30.0 | 6120 | 2.6236 | 1.0 | 31.1037 | 2.6454 |
| 2.9874 | 31.0 | 6324 | 2.6130 | 1.0 | 30.9122 | 2.6729 |
| 2.9616 | 32.0 | 6528 | 2.6078 | 1.0 | 30.9464 | 2.7181 |
| 2.9164 | 33.0 | 6732 | 2.6001 | 1.0 | 30.4612 | 2.7440 |
| 2.897 | 34.0 | 6936 | 2.5936 | 1.0 | 31.0837 | 2.8153 |
| 2.8612 | 35.0 | 7140 | 2.5853 | 1.0 | 32.5358 | 2.8709 |
| 2.8367 | 36.0 | 7344 | 2.5835 | 1.0 | 31.5089 | 2.8604 |
| 2.8082 | 37.0 | 7548 | 2.5729 | 1.0 | 31.9901 | 2.8850 |
| 2.8487 | 38.0 | 7752 | 2.5649 | 1.0 | 30.2207 | 2.9948 |
| 2.7835 | 39.0 | 7956 | 2.5607 | 1.0 | 30.4122 | 2.9975 |
| 2.7553 | 40.0 | 8160 | 2.5620 | 1.0 | 30.3458 | 3.0631 |
| 2.7686 | 41.0 | 8364 | 2.5487 | 1.0 | 31.2387 | 3.1216 |
| 2.7392 | 42.0 | 8568 | 2.5425 | 1.0 | 30.2285 | 3.1314 |
| 2.6999 | 43.0 | 8772 | 2.5431 | 1.0 | 31.0260 | 3.2077 |
| 2.6887 | 44.0 | 8976 | 2.5345 | 1.0 | 31.8682 | 3.2224 |
| 2.67 | 45.0 | 9180 | 2.5335 | 1.0 | 31.6668 | 3.2431 |
| 2.6897 | 46.0 | 9384 | 2.5211 | 1.0 | 30.2110 | 3.2613 |
| 2.6501 | 47.0 | 9588 | 2.5234 | 1.0 | 30.1906 | 3.3322 |
| 2.6273 | 48.0 | 9792 | 2.5212 | 1.0 | 31.1244 | 3.4368 |
| 2.6031 | 49.0 | 9996 | 2.5173 | 1.0 | 31.4978 | 3.4740 |
| 2.5725 | 50.0 | 10200 | 2.5158 | 1.0 | 30.4151 | 3.5464 |
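The Data Size column appears to double each epoch until it reaches 1.0 at epoch 8, after which the full dataset is used. The sketch below reproduces those values under that assumption. The schedule is inferred from the logged numbers, not documented by the card, and `data_fraction` and `ramp_epochs` are hypothetical names.

```python
def data_fraction(epoch: int, ramp_epochs: int = 8) -> float:
    """Fraction of the training data used at a given epoch.

    Assumed schedule (inferred from the table): 0 before training,
    then doubling each epoch until the full dataset is reached
    at `ramp_epochs`.
    """
    if epoch <= 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - ramp_epochs))


# Print the assumed fraction for the ramp-up epochs and one epoch beyond.
for epoch in range(0, 10):
    print(epoch, round(data_fraction(epoch), 4))
```

Under this reading, the first epoch that trains on the full dataset is epoch 8, which is also where the epoch runtime settles at roughly 30 to 32.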
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/efa1a362d0a41e141591c0a81db31a63
Base model
google/mt5-small