# decafd18d2dee1b3cd3d3a28a6635305
This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 0.9814
- Data Size: 1.0
- Epoch Runtime: 98.1355 s
- BLEU: 13.3563
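
A minimal inference sketch follows. The repo id is this card's repository (contemmcm/decafd18d2dee1b3cd3d3a28a6635305); the translation direction is an assumption, since the card does not say which opus_books language pair the model was trained on.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/decafd18d2dee1b3cd3d3a28a6635305"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints expect a task prefix; "translate English to French: " matches
# t5-base's pretraining tasks but is a guess for this particular fine-tune.
inputs = tokenizer("translate English to French: The book is on the table.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```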
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
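
Pending those details, here is a hedged sketch of how the Helsinki-NLP/opus_books data could be loaded; the en-fr configuration and the 90/10 split are illustrative assumptions, not documented choices.

```python
from datasets import load_dataset

# The card does not state the language pair; "en-fr" is illustrative.
books = load_dataset("Helsinki-NLP/opus_books", "en-fr")

# opus_books ships only a "train" split, so an evaluation split
# has to be carved out (the 10% size here is an assumption).
books = books["train"].train_test_split(test_size=0.1, seed=42)
print(books["train"][0]["translation"])  # {'en': '...', 'fr': '...'}
```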
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
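
The sketch below mirrors those values with `Seq2SeqTrainingArguments`; it is a reconstruction, not the author's actual script, and the output directory name is hypothetical. The per-device batch size of 8 across 4 GPUs accounts for the total batch size of 32; the data-size ramp visible in the results table is not reproduced here.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # generate during eval so BLEU can be computed
)
```

These arguments would be passed to a `Seq2SeqTrainer` together with a tokenized dataset and a `DataCollatorForSeq2Seq`, and launched with e.g. `torchrun --nproc_per_node=4` to match the 4-GPU setup.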
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size (fraction) | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.5607 | 0 | 14.5991 | 0.1842 |
| No log | 1 | 437 | 2.4747 | 0.0078 | 16.6316 | 0.2266 |
| No log | 2 | 874 | 2.3841 | 0.0156 | 12.4275 | 0.4988 |
| No log | 3 | 1311 | 2.2488 | 0.0312 | 17.1922 | 0.5118 |
| No log | 4 | 1748 | 2.1029 | 0.0625 | 19.2542 | 0.7667 |
| 2.2109 | 5 | 2185 | 1.9879 | 0.125 | 26.4880 | 1.2838 |
| 2.1006 | 6 | 2622 | 1.8642 | 0.25 | 42.0241 | 2.6669 |
| 1.9291 | 7 | 3059 | 1.7206 | 0.5 | 70.2491 | 4.0708 |
| 1.7381 | 8 | 3496 | 1.5636 | 1.0 | 149.8675 | 5.5590 |
| 1.6181 | 9 | 3933 | 1.4499 | 1.0 | 130.7932 | 6.6572 |
| 1.5056 | 10 | 4370 | 1.3723 | 1.0 | 145.4794 | 7.4242 |
| 1.447 | 11 | 4807 | 1.3099 | 1.0 | 145.7462 | 8.0778 |
| 1.383 | 12 | 5244 | 1.2594 | 1.0 | 108.3039 | 8.8326 |
| 1.3299 | 13 | 5681 | 1.2202 | 1.0 | 115.9984 | 9.1728 |
| 1.2613 | 14 | 6118 | 1.1907 | 1.0 | 104.9783 | 9.7938 |
| 1.2273 | 15 | 6555 | 1.1621 | 1.0 | 102.6652 | 9.7652 |
| 1.1678 | 16 | 6992 | 1.1362 | 1.0 | 100.6030 | 10.2138 |
| 1.1441 | 17 | 7429 | 1.1101 | 1.0 | 105.9486 | 10.3005 |
| 1.1065 | 18 | 7866 | 1.0993 | 1.0 | 106.0890 | 11.0051 |
| 1.0658 | 19 | 8303 | 1.0737 | 1.0 | 104.2994 | 10.9763 |
| 1.0488 | 20 | 8740 | 1.0597 | 1.0 | 101.6413 | 11.4335 |
| 1.0181 | 21 | 9177 | 1.0459 | 1.0 | 101.3497 | 11.7183 |
| 0.9851 | 22 | 9614 | 1.0368 | 1.0 | 101.2914 | 11.7005 |
| 0.963 | 23 | 10051 | 1.0302 | 1.0 | 111.7042 | 11.8514 |
| 0.9359 | 24 | 10488 | 1.0192 | 1.0 | 111.2087 | 12.1365 |
| 0.9233 | 25 | 10925 | 1.0095 | 1.0 | 105.1604 | 12.3197 |
| 0.9022 | 26 | 11362 | 1.0089 | 1.0 | 104.0405 | 12.3111 |
| 0.8816 | 27 | 11799 | 1.0057 | 1.0 | 104.8092 | 12.5789 |
| 0.8525 | 28 | 12236 | 0.9977 | 1.0 | 106.6144 | 12.4490 |
| 0.8293 | 29 | 12673 | 0.9884 | 1.0 | 98.7457 | 12.7247 |
| 0.8088 | 30 | 13110 | 0.9838 | 1.0 | 100.4866 | 12.8502 |
| 0.8072 | 31 | 13547 | 0.9825 | 1.0 | 98.6900 | 12.8563 |
| 0.7987 | 32 | 13984 | 0.9789 | 1.0 | 99.3213 | 12.8370 |
| 0.7723 | 33 | 14421 | 0.9761 | 1.0 | 98.1782 | 12.9031 |
| 0.7595 | 34 | 14858 | 0.9773 | 1.0 | 97.5901 | 13.0862 |
| 0.7414 | 35 | 15295 | 0.9782 | 1.0 | 102.2905 | 13.1639 |
| 0.7138 | 36 | 15732 | 0.9729 | 1.0 | 96.9078 | 13.5037 |
| 0.7123 | 37 | 16169 | 0.9727 | 1.0 | 97.9281 | 13.1368 |
| 0.6947 | 38 | 16606 | 0.9816 | 1.0 | 101.6442 | 13.5031 |
| 0.6734 | 39 | 17043 | 0.9750 | 1.0 | 99.8666 | 13.1697 |
| 0.664 | 40 | 17480 | 0.9799 | 1.0 | 98.9188 | 13.2457 |
| 0.6697 | 41 | 17917 | 0.9814 | 1.0 | 98.1355 | 13.3563 |
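
The BLEU values appear to be on the usual 0-100 sacreBLEU scale. How they were computed is not documented; a typical setup with the `evaluate` library, shown here as an assumption, would be:

```python
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Le livre est sur la table."]   # hypothetical model output
references = [["Le livre est sur la table."]]  # one list of references per prediction
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # BLEU on the 0-100 scale
```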
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1