de73f10b4bc545baba6a0ac70767ea89
This model is a fine-tuned version of google-t5/t5-small on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 1.3712
- Data Size: 1.0
- Epoch Runtime: 132.1769
- Bleu: 6.0814
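For a rough sense of scale, the reported evaluation loss can be converted to a perplexity. The sketch below assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for Trainer-reported losses but is not stated in this card. The helper name `perplexity_from_loss` is illustrative only.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


# Assumption: both values are mean token-level cross-entropies in nats.
final_eval_loss = 1.3712    # evaluation loss reported for the final checkpoint
initial_eval_loss = 4.6458  # validation loss logged at step 0 in the results table

final_ppl = perplexity_from_loss(final_eval_loss)
initial_ppl = perplexity_from_loss(initial_eval_loss)

print(f"final perplexity:   {final_ppl:.2f}")
print(f"initial perplexity: {initial_ppl:.2f}")
```

Under that assumption, the final perplexity lands a little below 4, down from roughly 100 before fine-tuning.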
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
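The total batch sizes listed above follow from the per-device batch sizes and the device count. A minimal sketch of that arithmetic, assuming no gradient accumulation (the card lists none); the helper `total_batch_size` and the `hparams` dictionary are hypothetical names used only for illustration:

```python
def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective batch size across all devices for one optimizer step."""
    return per_device * num_devices * grad_accum_steps


# Values copied from the hyperparameter list in this card.
hparams = {
    "train_batch_size": 8,
    "eval_batch_size": 8,
    "num_devices": 4,
}

# Assumption: gradient accumulation is 1, since the card does not mention it.
total_train = total_batch_size(hparams["train_batch_size"], hparams["num_devices"])
total_eval = total_batch_size(hparams["eval_batch_size"], hparams["num_devices"])

print(total_train, total_eval)
```

Both values come out to 32, matching the `total_train_batch_size` and `total_eval_batch_size` entries.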
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 4.6458 | 0 | 11.7817 | 0.1613 |
| No log | 1 | 1000 | 4.2997 | 0.0078 | 13.9839 | 0.2020 |
| No log | 2 | 2000 | 4.0161 | 0.0156 | 14.3451 | 0.2473 |
| No log | 3 | 3000 | 3.8014 | 0.0312 | 16.3804 | 0.2429 |
| 0.1455 | 4 | 4000 | 3.6216 | 0.0625 | 22.6181 | 0.3368 |
| 3.7679 | 5 | 5000 | 3.4273 | 0.125 | 30.8931 | 0.4199 |
| 0.2175 | 6 | 6000 | 3.1929 | 0.25 | 43.9417 | 0.5916 |
| 0.291 | 7 | 7000 | 2.9170 | 0.5 | 77.3792 | 1.1880 |
| 2.8856 | 8.0 | 8000 | 2.6146 | 1.0 | 143.7546 | 1.6428 |
| 2.6855 | 9.0 | 9000 | 2.4309 | 1.0 | 143.5087 | 1.9918 |
| 2.5364 | 10.0 | 10000 | 2.2938 | 1.0 | 144.2288 | 2.2682 |
| 2.4361 | 11.0 | 11000 | 2.1900 | 1.0 | 149.5659 | 2.5739 |
| 2.3464 | 12.0 | 12000 | 2.1049 | 1.0 | 142.6188 | 2.8246 |
| 2.2631 | 13.0 | 13000 | 2.0329 | 1.0 | 153.2539 | 3.0344 |
| 2.2136 | 14.0 | 14000 | 1.9732 | 1.0 | 155.4876 | 3.2309 |
| 2.1284 | 15.0 | 15000 | 1.9219 | 1.0 | 153.2130 | 3.3932 |
| 2.0955 | 16.0 | 16000 | 1.8782 | 1.0 | 138.6666 | 3.5720 |
| 2.0283 | 17.0 | 17000 | 1.8365 | 1.0 | 164.6555 | 3.7042 |
| 2.0186 | 18.0 | 18000 | 1.8006 | 1.0 | 138.7472 | 3.8593 |
| 1.969 | 19.0 | 19000 | 1.7677 | 1.0 | 139.7041 | 3.9740 |
| 1.9141 | 20.0 | 20000 | 1.7369 | 1.0 | 152.2548 | 4.1293 |
| 1.9024 | 21.0 | 21000 | 1.7068 | 1.0 | 133.5899 | 4.2375 |
| 1.8537 | 22.0 | 22000 | 1.6850 | 1.0 | 132.4430 | 4.3756 |
| 1.8305 | 23.0 | 23000 | 1.6603 | 1.0 | 134.8845 | 4.4725 |
| 1.7969 | 24.0 | 24000 | 1.6410 | 1.0 | 164.0537 | 4.5768 |
| 1.8096 | 25.0 | 25000 | 1.6214 | 1.0 | 137.1758 | 4.6656 |
| 1.7553 | 26.0 | 26000 | 1.6026 | 1.0 | 162.5463 | 4.7777 |
| 1.7342 | 27.0 | 27000 | 1.5850 | 1.0 | 132.8044 | 4.8680 |
| 1.6983 | 28.0 | 28000 | 1.5707 | 1.0 | 136.7322 | 4.9598 |
| 1.7111 | 29.0 | 29000 | 1.5545 | 1.0 | 146.3394 | 5.0253 |
| 1.6679 | 30.0 | 30000 | 1.5409 | 1.0 | 145.4068 | 5.0856 |
| 1.6672 | 31.0 | 31000 | 1.5253 | 1.0 | 149.6627 | 5.1691 |
| 1.6531 | 32.0 | 32000 | 1.5168 | 1.0 | 137.6426 | 5.2505 |
| 1.5978 | 33.0 | 33000 | 1.5047 | 1.0 | 138.8876 | 5.2911 |
| 1.5973 | 34.0 | 34000 | 1.4932 | 1.0 | 145.4915 | 5.3297 |
| 1.5642 | 35.0 | 35000 | 1.4814 | 1.0 | 128.1568 | 5.4291 |
| 1.5677 | 36.0 | 36000 | 1.4746 | 1.0 | 133.6715 | 5.4750 |
| 1.5557 | 37.0 | 37000 | 1.4607 | 1.0 | 128.2057 | 5.5561 |
| 1.5574 | 38.0 | 38000 | 1.4525 | 1.0 | 124.6804 | 5.5801 |
| 1.5186 | 39.0 | 39000 | 1.4487 | 1.0 | 127.0441 | 5.6308 |
| 1.5115 | 40.0 | 40000 | 1.4393 | 1.0 | 134.0554 | 5.6973 |
| 1.5055 | 41.0 | 41000 | 1.4278 | 1.0 | 128.4823 | 5.7426 |
| 1.4933 | 42.0 | 42000 | 1.4191 | 1.0 | 130.1083 | 5.7839 |
| 1.4835 | 43.0 | 43000 | 1.4152 | 1.0 | 127.8246 | 5.8283 |
| 1.4572 | 44.0 | 44000 | 1.4068 | 1.0 | 128.7732 | 5.8464 |
| 1.4554 | 45.0 | 45000 | 1.4028 | 1.0 | 133.7048 | 5.9022 |
| 1.4694 | 46.0 | 46000 | 1.3929 | 1.0 | 136.7006 | 5.9434 |
| 1.4448 | 47.0 | 47000 | 1.3852 | 1.0 | 131.0279 | 5.9637 |
| 1.4448 | 48.0 | 48000 | 1.3817 | 1.0 | 130.6594 | 6.0077 |
| 1.402 | 49.0 | 49000 | 1.3777 | 1.0 | 133.9181 | 6.0371 |
| 1.4073 | 50.0 | 50000 | 1.3712 | 1.0 | 132.1769 | 6.0814 |
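The Data Size column appears to double each epoch until the full dataset is in use at epoch 8. The sketch below reproduces the logged values under that assumption. The schedule formula and the function name `data_fraction` are inferred from the table and are not documented by the training script.

```python
def data_fraction(epoch: int, full_epoch: int = 8) -> float:
    """Fraction of the training data used at a given epoch.

    Assumed schedule: nothing at epoch 0, then doubling each epoch
    until the full dataset is reached at `full_epoch`.
    """
    if epoch <= 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - full_epoch))


# Data Size values logged in the training results table (epochs 0 to 8).
logged = {
    0: 0.0,
    1: 0.0078,
    2: 0.0156,
    3: 0.0312,
    4: 0.0625,
    5: 0.125,
    6: 0.25,
    7: 0.5,
    8: 1.0,
}

for epoch, value in logged.items():
    predicted = data_fraction(epoch)
    # Compare with a tolerance because the table rounds to four decimals.
    assert abs(predicted - value) < 1e-4, (epoch, predicted, value)

print("assumed doubling schedule matches the logged Data Size values")
```

If this reading is right, the first eight rows are warm-up evaluations on growing subsets, which would also explain why Epoch Runtime climbs until epoch 8 and then levels off.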
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1