# f5d3ae17b1930dbf87f1b7283f9ed30c
This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 1.2291
- Data Size: 1.0
- Epoch Runtime: 15.0556
- Bleu: 14.7465
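The BLEU score above is reported on a 0–100 scale. Model cards in this family typically compute it with `sacrebleu`; the hand-rolled sketch below is only illustrative of what corpus-level BLEU-4 measures (clipped n-gram precision with a brevity penalty), not the exact tokenization or smoothing used here.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Counter of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with uniform weights and brevity penalty.

    Illustrative only -- real evaluations use sacrebleu, which also
    handles tokenization and smoothing.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            matches[n - 1] += sum((h & r).values())  # clipped n-gram matches
            totals[n - 1] += max(len(hyp) - n + 1, 0)
    if min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_precision)

# A perfect hypothesis scores 100; partial overlap scores in between.
hyp = [["the", "cat", "sat", "on", "the", "mat"]]
print(corpus_bleu(hyp, hyp))
```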
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
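The total batch sizes follow from the per-device sizes and the device count. A minimal sanity check, assuming `gradient_accumulation_steps=1` (the card does not list it):

```python
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4                   # multi-GPU, as listed above
gradient_accumulation_steps = 1   # assumed: not stated in the card

total_train_batch_size = (per_device_train_batch_size
                          * num_devices
                          * gradient_accumulation_steps)
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # both 32, matching the values above
```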
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.9686 | 0 | 1.7897 | 1.7611 |
| No log | 1 | 35 | 2.9335 | 0.0078 | 2.3796 | 1.8142 |
| No log | 2 | 70 | 2.8721 | 0.0156 | 2.0602 | 1.9906 |
| No log | 3 | 105 | 2.7471 | 0.0312 | 2.8569 | 2.2344 |
| No log | 4 | 140 | 2.6188 | 0.0625 | 3.2800 | 2.4158 |
| No log | 5 | 175 | 2.4610 | 0.125 | 4.1849 | 2.7923 |
| No log | 6 | 210 | 2.2644 | 0.25 | 5.1989 | 3.8902 |
| No log | 7 | 245 | 2.0473 | 0.5 | 8.8507 | 4.9557 |
| 0.4917 | 8.0 | 280 | 1.8653 | 1.0 | 14.1202 | 6.9420 |
| 2.2497 | 9.0 | 315 | 1.7499 | 1.0 | 13.9608 | 7.6192 |
| 2.0342 | 10.0 | 350 | 1.6686 | 1.0 | 12.5722 | 8.9947 |
| 2.0342 | 11.0 | 385 | 1.5980 | 1.0 | 13.6303 | 9.2455 |
| 1.8555 | 12.0 | 420 | 1.5371 | 1.0 | 13.5858 | 10.1462 |
| 1.7107 | 13.0 | 455 | 1.4986 | 1.0 | 10.9217 | 10.5372 |
| 1.7107 | 14.0 | 490 | 1.4559 | 1.0 | 10.9373 | 9.8687 |
| 1.5995 | 15.0 | 525 | 1.4161 | 1.0 | 10.3431 | 10.7604 |
| 1.53 | 16.0 | 560 | 1.3893 | 1.0 | 10.7269 | 11.3155 |
| 1.53 | 17.0 | 595 | 1.3668 | 1.0 | 12.1763 | 11.5311 |
| 1.4222 | 18.0 | 630 | 1.3408 | 1.0 | 11.1285 | 11.4685 |
| 1.3365 | 19.0 | 665 | 1.3257 | 1.0 | 13.1371 | 12.1346 |
| 1.283 | 20.0 | 700 | 1.3018 | 1.0 | 13.3854 | 12.1569 |
| 1.283 | 21.0 | 735 | 1.2920 | 1.0 | 14.0139 | 12.5417 |
| 1.2155 | 22.0 | 770 | 1.2809 | 1.0 | 13.5094 | 12.6887 |
| 1.1635 | 23.0 | 805 | 1.2769 | 1.0 | 11.9000 | 13.0757 |
| 1.1635 | 24.0 | 840 | 1.2598 | 1.0 | 14.3637 | 13.3410 |
| 1.0967 | 25.0 | 875 | 1.2532 | 1.0 | 14.3873 | 13.5607 |
| 1.0632 | 26.0 | 910 | 1.2479 | 1.0 | 14.0800 | 13.8649 |
| 1.0632 | 27.0 | 945 | 1.2404 | 1.0 | 14.7500 | 13.7716 |
| 1.0084 | 28.0 | 980 | 1.2475 | 1.0 | 13.1077 | 13.9105 |
| 0.9661 | 29.0 | 1015 | 1.2323 | 1.0 | 15.0026 | 13.9146 |
| 0.9381 | 30.0 | 1050 | 1.2244 | 1.0 | 14.5107 | 14.0155 |
| 0.9381 | 31.0 | 1085 | 1.2323 | 1.0 | 14.6993 | 14.1804 |
| 0.8814 | 32.0 | 1120 | 1.2260 | 1.0 | 13.6416 | 14.1333 |
| 0.8562 | 33.0 | 1155 | 1.2231 | 1.0 | 13.8906 | 14.8269 |
| 0.8562 | 34.0 | 1190 | 1.2253 | 1.0 | 15.0576 | 14.2483 |
| 0.8088 | 35.0 | 1225 | 1.2243 | 1.0 | 13.6468 | 14.5773 |
| 0.7953 | 36.0 | 1260 | 1.2268 | 1.0 | 13.1516 | 14.5130 |
| 0.7953 | 37.0 | 1295 | 1.2291 | 1.0 | 15.0556 | 14.7465 |
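The Data Size column shows a data-scaling curriculum: training warms up on 1/128 of the dataset at epoch 1 and doubles the fraction each epoch until reaching the full dataset at epoch 8. This is a reconstruction of the schedule from the table, not the actual training script:

```python
def data_fraction(epoch: int, start: float = 1 / 128) -> float:
    """Hypothetical reconstruction of the observed 'Data Size' schedule:
    start at 1/128 of the data in epoch 1, double each epoch, cap at 1.0."""
    if epoch == 0:
        return 0.0
    return min(1.0, start * 2 ** (epoch - 1))

schedule = [round(data_fraction(e), 4) for e in range(9)]
print(schedule)
# Reproduces the column above:
# [0.0, 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0]
```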
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1