# a1a6aff4ce2d90f17a750fdcaf097651
This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [de-pt] dataset. It achieves the following results on the evaluation set:
- Loss: 1.9747
- Data Size: 1.0
- Epoch Runtime: 10.4799
- Bleu: 5.6590
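For context, the BLEU values here are apparently on the 0–100 scale (they rise above 1 during training), so 5.66 indicates translations that still share relatively few n-grams with the references. The core of the metric can be sketched in plain Python; this is an illustrative, unsmoothed, single-reference version of corpus BLEU (modified n-gram precision plus a brevity penalty), not necessarily the exact scorer used to produce the numbers above:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus BLEU on a 0-1 scale: geometric mean of modified n-gram
    precisions times a brevity penalty (one reference per hypothesis,
    no smoothing). Multiply by 100 for the scale used in this card."""
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_ng, ref_ng = ngrams(hyp, n), ngrams(ref, n)
            matches[n - 1] += sum(min(c, ref_ng[g]) for g, c in hyp_ng.items())
            totals[n - 1] += sum(hyp_ng.values())
    if min(matches) == 0:   # any order with zero matches zeroes the score
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_prec)
```

In practice a smoothed, standardized implementation such as sacreBLEU is what makes scores comparable across papers and model cards.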
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
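The total batch sizes above are simply the per-device sizes multiplied by the device count (no gradient accumulation is listed). Under that assumption, the 27 optimizer steps per full-data epoch in the results table imply a training set of at most 27 × 32 = 864 examples, since the last batch of an epoch may be partial. A quick sanity check:

```python
import math

train_batch_size = 8          # per device
num_devices = 4
total_train_batch_size = train_batch_size * num_devices  # 32, as listed above

# 27 optimizer steps per epoch at data size 1.0 bounds the training set:
steps_per_epoch = 27
max_examples = steps_per_epoch * total_train_batch_size  # 864 at most
assert math.ceil(max_examples / total_train_batch_size) == steps_per_epoch
```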
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 3.5795 | 0 | 1.5110 | 0.8069 |
| No log | 1 | 27 | 3.5288 | 0.0078 | 2.0713 | 0.8193 |
| No log | 2 | 54 | 3.4226 | 0.0156 | 1.8171 | 0.8765 |
| No log | 3 | 81 | 3.3137 | 0.0312 | 2.1961 | 0.8726 |
| No log | 4 | 108 | 3.2241 | 0.0625 | 2.6187 | 0.9015 |
| No log | 5 | 135 | 3.0794 | 0.125 | 3.5629 | 0.9034 |
| No log | 6 | 162 | 2.8764 | 0.25 | 4.5319 | 1.6239 |
| No log | 7 | 189 | 2.6789 | 0.5 | 7.1745 | 1.8214 |
| 0.6673 | 8.0 | 216 | 2.5209 | 1.0 | 11.8647 | 2.1250 |
| 0.6673 | 9.0 | 243 | 2.4311 | 1.0 | 12.2110 | 2.9570 |
| 2.7346 | 10.0 | 270 | 2.3574 | 1.0 | 11.9172 | 3.0805 |
| 2.7346 | 11.0 | 297 | 2.3070 | 1.0 | 13.7312 | 3.3673 |
| 2.5374 | 12.0 | 324 | 2.2623 | 1.0 | 12.1588 | 3.6163 |
| 2.3816 | 13.0 | 351 | 2.2264 | 1.0 | 12.0713 | 3.4099 |
| 2.3816 | 14.0 | 378 | 2.1950 | 1.0 | 10.7154 | 3.7960 |
| 2.2706 | 15.0 | 405 | 2.1645 | 1.0 | 9.9418 | 4.1388 |
| 2.2706 | 16.0 | 432 | 2.1396 | 1.0 | 11.5808 | 4.2158 |
| 2.1735 | 17.0 | 459 | 2.1212 | 1.0 | 12.5215 | 4.3524 |
| 2.1735 | 18.0 | 486 | 2.1062 | 1.0 | 12.6577 | 4.3225 |
| 2.0687 | 19.0 | 513 | 2.0859 | 1.0 | 13.0955 | 4.5394 |
| 2.0687 | 20.0 | 540 | 2.0751 | 1.0 | 14.9231 | 4.4761 |
| 1.9922 | 21.0 | 567 | 2.0550 | 1.0 | 11.4518 | 4.8162 |
| 1.9922 | 22.0 | 594 | 2.0431 | 1.0 | 13.6035 | 4.8632 |
| 1.9275 | 23.0 | 621 | 2.0416 | 1.0 | 11.6460 | 4.6598 |
| 1.9275 | 24.0 | 648 | 2.0323 | 1.0 | 12.5673 | 4.7531 |
| 1.8617 | 25.0 | 675 | 2.0227 | 1.0 | 9.9415 | 4.9116 |
| 1.8078 | 26.0 | 702 | 2.0161 | 1.0 | 9.5311 | 4.7579 |
| 1.8078 | 27.0 | 729 | 2.0159 | 1.0 | 11.9414 | 4.7400 |
| 1.7452 | 28.0 | 756 | 1.9923 | 1.0 | 11.1788 | 5.1620 |
| 1.7452 | 29.0 | 783 | 1.9901 | 1.0 | 11.7848 | 5.1908 |
| 1.6985 | 30.0 | 810 | 1.9933 | 1.0 | 11.6818 | 5.2161 |
| 1.6985 | 31.0 | 837 | 1.9735 | 1.0 | 11.0877 | 5.2446 |
| 1.6343 | 32.0 | 864 | 1.9759 | 1.0 | 12.3549 | 5.2742 |
| 1.6343 | 33.0 | 891 | 1.9756 | 1.0 | 13.1103 | 5.0968 |
| 1.5958 | 34.0 | 918 | 1.9697 | 1.0 | 11.9184 | 5.3222 |
| 1.5958 | 35.0 | 945 | 1.9672 | 1.0 | 12.3234 | 5.3265 |
| 1.5497 | 36.0 | 972 | 1.9727 | 1.0 | 12.7994 | 5.2308 |
| 1.5497 | 37.0 | 999 | 1.9713 | 1.0 | 10.8372 | 5.2037 |
| 1.5086 | 38.0 | 1026 | 1.9620 | 1.0 | 11.8187 | 5.2829 |
| 1.4591 | 39.0 | 1053 | 1.9762 | 1.0 | 12.3488 | 5.2213 |
| 1.4591 | 40.0 | 1080 | 1.9759 | 1.0 | 12.8681 | 5.4821 |
| 1.4181 | 41.0 | 1107 | 1.9682 | 1.0 | 11.7268 | 5.5189 |
| 1.4181 | 42.0 | 1134 | 1.9747 | 1.0 | 10.4799 | 5.6590 |
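One detail the table makes visible: the checkpoint with the lowest validation loss (epoch 38, 1.9620) is not the one with the best BLEU (epoch 42, 5.6590, the final row and the headline result above). A small selection sketch over the last few rows, with values copied from the table:

```python
# (epoch, validation_loss, bleu) triples copied from the final rows above
rows = [
    (38, 1.9620, 5.2829),
    (39, 1.9762, 5.2213),
    (40, 1.9759, 5.4821),
    (41, 1.9682, 5.5189),
    (42, 1.9747, 5.6590),
]

best_by_loss = min(rows, key=lambda r: r[1])   # lowest validation loss
best_by_bleu = max(rows, key=lambda r: r[2])   # highest BLEU

print(best_by_loss[0], best_by_bleu[0])
```

Which criterion to checkpoint on is a design choice; for translation, BLEU (or a held-out human evaluation) is usually the more meaningful signal than raw cross-entropy loss.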
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1