d431bd02a55bbd85866f1939f367ddca
This model is a fine-tuned version of google-t5/t5-small on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 2.3415
- Data Size: 1.0
- Epoch Runtime: 6.6882
- Bleu: 3.0177
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
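The reported totals follow from the per-device values: 8 examples per device across 4 devices gives 32. The sketch below checks that arithmetic and shows one way the listed values could map onto keyword arguments for `transformers.Seq2SeqTrainingArguments`. The mapping, the `output_dir` placeholder, and the assumption of no gradient accumulation (none is listed) are illustrative and not taken from the original training script.

```python
# Hyperparameters as reported in the list above.
reported = {
    "learning_rate": 5e-05,
    "train_batch_size": 8,   # per device
    "eval_batch_size": 8,    # per device
    "seed": 42,
    "num_devices": 4,
    "lr_scheduler_type": "constant",
    "num_epochs": 50,
}

# Global batch size = per-device batch size * number of devices
# (assuming gradient_accumulation_steps == 1, since none is reported).
total_train_batch_size = reported["train_batch_size"] * reported["num_devices"]
total_eval_batch_size = reported["eval_batch_size"] * reported["num_devices"]

# Hypothetical keyword arguments for transformers.Seq2SeqTrainingArguments.
# "output_dir" is a placeholder, not a path from the original run.
training_kwargs = {
    "output_dir": "t5-small-opus-books",
    "learning_rate": reported["learning_rate"],
    "per_device_train_batch_size": reported["train_batch_size"],
    "per_device_eval_batch_size": reported["eval_batch_size"],
    "seed": reported["seed"],
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": reported["lr_scheduler_type"],
    "num_train_epochs": reported["num_epochs"],
}

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```

With the library installed, `Seq2SeqTrainingArguments(**training_kwargs)` would build the configuration object; the multi-GPU setup itself comes from the launcher rather than from these arguments.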
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 4.6221 | 0 | 1.1473 | 0.1100 |
| No log | 1 | 31 | 4.6064 | 0.0078 | 1.4740 | 0.1104 |
| No log | 2 | 62 | 4.5703 | 0.0156 | 1.4328 | 0.1134 |
| No log | 3 | 93 | 4.5005 | 0.0312 | 1.4964 | 0.1159 |
| No log | 4 | 124 | 4.4422 | 0.0625 | 1.5981 | 0.0962 |
| No log | 5 | 155 | 4.2707 | 0.125 | 2.1272 | 0.1219 |
| No log | 6 | 186 | 4.0733 | 0.25 | 2.6881 | 0.1684 |
| 0.7013 | 7 | 217 | 3.8242 | 0.5 | 3.7759 | 0.2088 |
| 0.7013 | 8.0 | 248 | 3.5560 | 1.0 | 6.4531 | 0.1758 |
| 2.8141 | 9.0 | 279 | 3.3816 | 1.0 | 6.2527 | 0.2138 |
| 3.74 | 10.0 | 310 | 3.2521 | 1.0 | 6.9367 | 0.2372 |
| 3.74 | 11.0 | 341 | 3.1519 | 1.0 | 6.7399 | 0.9490 |
| 3.5558 | 12.0 | 372 | 3.0793 | 1.0 | 6.1199 | 1.0051 |
| 3.4043 | 13.0 | 403 | 3.0140 | 1.0 | 6.2391 | 1.2950 |
| 3.4043 | 14.0 | 434 | 2.9645 | 1.0 | 7.1239 | 1.2945 |
| 3.3021 | 15.0 | 465 | 2.9170 | 1.0 | 6.6573 | 1.4390 |
| 3.3021 | 16.0 | 496 | 2.8817 | 1.0 | 6.3254 | 1.5021 |
| 3.2109 | 17.0 | 527 | 2.8456 | 1.0 | 6.2254 | 1.5389 |
| 3.1266 | 18.0 | 558 | 2.8140 | 1.0 | 7.1893 | 1.6168 |
| 3.1266 | 19.0 | 589 | 2.7836 | 1.0 | 6.9004 | 1.7308 |
| 3.0775 | 20.0 | 620 | 2.7528 | 1.0 | 6.6448 | 1.7620 |
| 3.0026 | 21.0 | 651 | 2.7238 | 1.0 | 6.4260 | 1.8896 |
| 3.0026 | 22.0 | 682 | 2.7012 | 1.0 | 5.1268 | 2.0290 |
| 2.9563 | 23.0 | 713 | 2.6767 | 1.0 | 5.4232 | 2.1725 |
| 2.9563 | 24.0 | 744 | 2.6560 | 1.0 | 5.7317 | 2.2300 |
| 2.891 | 25.0 | 775 | 2.6368 | 1.0 | 5.3370 | 2.2402 |
| 2.8587 | 26.0 | 806 | 2.6135 | 1.0 | 5.3657 | 2.2324 |
| 2.8587 | 27.0 | 837 | 2.5934 | 1.0 | 5.2370 | 2.3208 |
| 2.8015 | 28.0 | 868 | 2.5734 | 1.0 | 5.3508 | 2.3066 |
| 2.8015 | 29.0 | 899 | 2.5603 | 1.0 | 6.2737 | 2.4315 |
| 2.7561 | 30.0 | 930 | 2.5431 | 1.0 | 6.4207 | 2.4338 |
| 2.7188 | 31.0 | 961 | 2.5317 | 1.0 | 6.1300 | 2.4899 |
| 2.7188 | 32.0 | 992 | 2.5151 | 1.0 | 5.5575 | 2.5777 |
| 2.6827 | 33.0 | 1023 | 2.5060 | 1.0 | 5.7487 | 2.5721 |
| 2.6416 | 34.0 | 1054 | 2.4929 | 1.0 | 5.5721 | 2.6315 |
| 2.6416 | 35.0 | 1085 | 2.4814 | 1.0 | 5.7915 | 2.6630 |
| 2.6138 | 36.0 | 1116 | 2.4674 | 1.0 | 6.1883 | 2.7386 |
| 2.6138 | 37.0 | 1147 | 2.4581 | 1.0 | 5.3096 | 2.6823 |
| 2.5766 | 38.0 | 1178 | 2.4477 | 1.0 | 5.3723 | 2.7470 |
| 2.5413 | 39.0 | 1209 | 2.4335 | 1.0 | 5.3362 | 2.7325 |
| 2.5413 | 40.0 | 1240 | 2.4254 | 1.0 | 6.3199 | 2.8268 |
| 2.5103 | 41.0 | 1271 | 2.4123 | 1.0 | 6.5643 | 2.8315 |
| 2.4832 | 42.0 | 1302 | 2.4081 | 1.0 | 6.0125 | 2.9006 |
| 2.4832 | 43.0 | 1333 | 2.3951 | 1.0 | 6.0619 | 2.9568 |
| 2.4585 | 44.0 | 1364 | 2.3889 | 1.0 | 5.9724 | 3.0480 |
| 2.4585 | 45.0 | 1395 | 2.3860 | 1.0 | 6.7165 | 3.0197 |
| 2.4241 | 46.0 | 1426 | 2.3754 | 1.0 | 6.4182 | 3.0169 |
| 2.3965 | 47.0 | 1457 | 2.3647 | 1.0 | 6.5957 | 3.0111 |
| 2.3965 | 48.0 | 1488 | 2.3574 | 1.0 | 5.8117 | 2.9843 |
| 2.3654 | 49.0 | 1519 | 2.3501 | 1.0 | 5.4900 | 3.0144 |
| 2.3593 | 50.0 | 1550 | 2.3415 | 1.0 | 6.6882 | 3.0177 |
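The table shows that validation loss and BLEU do not peak at the same checkpoint: the loss is lowest at epoch 50, while the highest BLEU (3.0480) occurs at epoch 44. A minimal sketch for pulling that out of the table, using only the last seven rows copied from above; the `parse_rows` helper and its field names are hypothetical.

```python
# Rows for epochs 44-50, copied from the training results table.
rows = """
| 2.4585 | 44.0 | 1364 | 2.3889 | 1.0 | 5.9724 | 3.0480 |
| 2.4585 | 45.0 | 1395 | 2.3860 | 1.0 | 6.7165 | 3.0197 |
| 2.4241 | 46.0 | 1426 | 2.3754 | 1.0 | 6.4182 | 3.0169 |
| 2.3965 | 47.0 | 1457 | 2.3647 | 1.0 | 6.5957 | 3.0111 |
| 2.3965 | 48.0 | 1488 | 2.3574 | 1.0 | 5.8117 | 2.9843 |
| 2.3654 | 49.0 | 1519 | 2.3501 | 1.0 | 5.4900 | 3.0144 |
| 2.3593 | 50.0 | 1550 | 2.3415 | 1.0 | 6.6882 | 3.0177 |
"""


def parse_rows(text):
    """Turn pipe-delimited table rows into a list of dicts (hypothetical helper)."""
    records = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        records.append({
            "epoch": float(cells[1]),
            "step": int(cells[2]),
            "val_loss": float(cells[3]),
            "bleu": float(cells[6]),
        })
    return records


records = parse_rows(rows)
best_bleu = max(records, key=lambda r: r["bleu"])
best_loss = min(records, key=lambda r: r["val_loss"])

# Optimizer steps per epoch, from the final row: 1550 steps over 50 epochs.
steps_per_epoch = records[-1]["step"] / records[-1]["epoch"]

print(best_bleu["epoch"], best_bleu["bleu"])      # 44.0 3.048
print(best_loss["epoch"], best_loss["val_loss"])  # 50.0 2.3415
print(steps_per_epoch)                            # 31.0
```

The headline metrics at the top of the card correspond to the final (epoch 50) row, not to the best-BLEU row.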
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1