# cdbd9519200627023626eaf4f5a9c2ec
This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [de-en] dataset. It achieves the following results on the evaluation set:
- Loss: 1.8382
- Data size: 1.0 (fraction of the training set used)
- Epoch runtime: 296.1652 s
- BLEU: 10.3689
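A minimal inference sketch follows. The repo id is taken from this card's page; the `"translate German to English: "` task prefix is the standard T5 convention and is an assumption here, since the card does not state the prefix used during fine-tuning.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/cdbd9519200627023626eaf4f5a9c2ec"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 models expect a task prefix; the standard de->en one is assumed here.
text = "translate German to English: Das Buch liegt auf dem Tisch."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```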
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
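No preprocessing details are given. A typical sketch for Helsinki-NLP/opus_books de-en pairs with the t5-base tokenizer is shown below; the task prefix and the 128-token cap are assumptions, not something the card documents.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
prefix = "translate German to English: "  # assumed T5 task prefix

def preprocess(batch):
    # opus_books rows look like {"translation": {"de": "...", "en": "..."}}
    inputs = [prefix + ex["de"] for ex in batch["translation"]]
    targets = [ex["en"] for ex in batch["translation"]]
    model_inputs = tokenizer(inputs, max_length=128, truncation=True)
    labels = tokenizer(text_target=targets, max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```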
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 3.2834 | 0 | 32.3240 | 3.1665 |
| No log | 1 | 1286 | 2.8676 | 0.0078 | 31.2946 | 4.9302 |
| 0.061 | 2 | 2572 | 2.6478 | 0.0156 | 39.1063 | 6.0021 |
| 0.0781 | 3 | 3858 | 2.5222 | 0.0312 | 40.5774 | 6.1777 |
| 0.1162 | 4 | 5144 | 2.4145 | 0.0625 | 44.6000 | 5.9555 |
| 2.5374 | 5 | 6430 | 2.3278 | 0.125 | 71.0763 | 6.6924 |
| 2.4421 | 6 | 7716 | 2.2364 | 0.25 | 114.0874 | 6.9143 |
| 2.3398 | 7 | 9002 | 2.1406 | 0.5 | 203.1443 | 7.6527 |
| 2.2267 | 8.0 | 10288 | 2.0375 | 1.0 | 320.0082 | 8.2874 |
| 2.1245 | 9.0 | 11574 | 1.9762 | 1.0 | 327.2068 | 8.7267 |
| 2.0126 | 10.0 | 12860 | 1.9342 | 1.0 | 318.2429 | 9.0232 |
| 1.9288 | 11.0 | 14146 | 1.9076 | 1.0 | 313.7659 | 9.4612 |
| 1.9147 | 12.0 | 15432 | 1.8856 | 1.0 | 306.3946 | 9.5535 |
| 1.8118 | 13.0 | 16718 | 1.8670 | 1.0 | 301.9939 | 9.6642 |
| 1.7869 | 14.0 | 18004 | 1.8577 | 1.0 | 303.7155 | 9.7512 |
| 1.7684 | 15.0 | 19290 | 1.8451 | 1.0 | 311.2393 | 9.8912 |
| 1.7035 | 16.0 | 20576 | 1.8364 | 1.0 | 301.0892 | 10.0150 |
| 1.6895 | 17.0 | 21862 | 1.8324 | 1.0 | 312.8599 | 10.0120 |
| 1.6357 | 18.0 | 23148 | 1.8328 | 1.0 | 298.5094 | 10.0325 |
| 1.615 | 19.0 | 24434 | 1.8274 | 1.0 | 303.1685 | 10.1694 |
| 1.5417 | 20.0 | 25720 | 1.8263 | 1.0 | 301.3516 | 10.1965 |
| 1.5421 | 21.0 | 27006 | 1.8260 | 1.0 | 308.1764 | 10.1196 |
| 1.5266 | 22.0 | 28292 | 1.8275 | 1.0 | 300.0012 | 10.2606 |
| 1.4999 | 23.0 | 29578 | 1.8345 | 1.0 | 303.9673 | 10.3856 |
| 1.4494 | 24.0 | 30864 | 1.8391 | 1.0 | 291.5877 | 10.4315 |
| 1.4225 | 25.0 | 32150 | 1.8382 | 1.0 | 296.1652 | 10.3689 |
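The Data Size column appears to follow a doubling curriculum: roughly 1/128 of the training set at epoch 1, doubling each epoch until the full set is used from epoch 8 onward. A small sketch of that inferred schedule (an assumption read off the table, not documented by the card):

```python
def data_fraction(epoch, start=1 / 128):
    """Inferred data-size schedule: 0 at epoch 0, then doubling from `start`,
    capped at the full dataset (1.0) from epoch 8 onward."""
    if epoch == 0:
        return 0.0
    return min(1.0, start * 2 ** (epoch - 1))

print([round(data_fraction(e), 4) for e in range(9)])
# → [0.0, 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0]
```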
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1