# 112fec9ed42e51060f900d8f894eb446
This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [de-pt] dataset. It achieves the following results on the evaluation set:
- Loss: 1.6703
- Data Size: 1.0
- Epoch Runtime: 21.2101
- Bleu: 7.4410
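Below is a minimal inference sketch for trying the checkpoint as a German-to-Portuguese translator. The repo id is taken from this page, but the T5 task prefix and the generation settings are assumptions, since the card does not document how inputs were formatted during fine-tuning.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/112fec9ed42e51060f900d8f894eb446"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints are usually prompted with a task prefix; the exact prefix used
# during fine-tuning is not documented here, so this one is a guess.
text = "translate German to Portuguese: Das Buch liegt auf dem Tisch."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```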
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
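Per the summary above, training used the Helsinki-NLP/opus_books dataset with the de-pt language pair. A minimal loading sketch with the datasets library, assuming the standard configuration name (opus_books ships only a train split, so the evaluation set reported below was presumably carved out separately):

```python
from datasets import load_dataset

# de-pt configuration of opus_books, as named in the summary above.
raw = load_dataset("Helsinki-NLP/opus_books", "de-pt")
print(raw["train"][0]["translation"])  # e.g. {'de': '...', 'pt': '...'}
```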
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
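For reference, here is a sketch of Seq2SeqTrainingArguments that reproduces the values listed above. The output directory and the predict_with_generate flag are assumptions not stated in the card.

```python
from transformers import Seq2SeqTrainingArguments

# Per-device batch size 8 on 4 GPUs gives the total train/eval batch size of 32.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-large-opus-books-de-pt",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # needed so BLEU can be computed at eval time
)
```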
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.9596 | 0 | 1.7629 | 1.0230 |
| No log | 1 | 27 | 2.8965 | 0.0078 | 2.3567 | 1.1778 |
| No log | 2 | 54 | 2.7601 | 0.0156 | 3.8914 | 1.4456 |
| No log | 3 | 81 | 2.6535 | 0.0312 | 5.1172 | 1.7848 |
| No log | 4 | 108 | 2.5314 | 0.0625 | 6.3784 | 1.8423 |
| No log | 5 | 135 | 2.3265 | 0.125 | 8.5023 | 2.5052 |
| No log | 6 | 162 | 2.1458 | 0.25 | 11.4743 | 3.1025 |
| No log | 7 | 189 | 2.0311 | 0.5 | 15.1548 | 3.1577 |
| 0.5013 | 8.0 | 216 | 1.9043 | 1.0 | 20.2497 | 4.2677 |
| 0.5013 | 9.0 | 243 | 1.8183 | 1.0 | 18.0056 | 5.2331 |
| 1.9772 | 10.0 | 270 | 1.7607 | 1.0 | 18.4100 | 5.7080 |
| 1.9772 | 11.0 | 297 | 1.7316 | 1.0 | 19.6296 | 5.9091 |
| 1.7436 | 12.0 | 324 | 1.7052 | 1.0 | 18.8718 | 6.4142 |
| 1.5745 | 13.0 | 351 | 1.6868 | 1.0 | 19.1276 | 6.2425 |
| 1.5745 | 14.0 | 378 | 1.6810 | 1.0 | 19.4471 | 6.6954 |
| 1.4445 | 15.0 | 405 | 1.6673 | 1.0 | 19.4834 | 6.7889 |
| 1.4445 | 16.0 | 432 | 1.6606 | 1.0 | 20.6860 | 6.6549 |
| 1.3433 | 17.0 | 459 | 1.6665 | 1.0 | 22.0664 | 6.6621 |
| 1.3433 | 18.0 | 486 | 1.6586 | 1.0 | 20.9229 | 6.8744 |
| 1.2301 | 19.0 | 513 | 1.6651 | 1.0 | 21.5009 | 7.0757 |
| 1.2301 | 20.0 | 540 | 1.6635 | 1.0 | 21.4612 | 6.7474 |
| 1.1481 | 21.0 | 567 | 1.6693 | 1.0 | 20.9774 | 7.3536 |
| 1.1481 | 22.0 | 594 | 1.6703 | 1.0 | 21.2101 | 7.4410 |
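The BLEU column above is the kind of score typically produced by a compute_metrics hook built on the evaluate sacrebleu wrapper; the sketch below shows that common recipe, not the exact script used for this run.

```python
import evaluate
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-large")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace -100 (ignored label positions) so the labels can be decoded.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=[p.strip() for p in decoded_preds],
        references=[[l.strip()] for l in decoded_labels],
    )
    return {"bleu": result["score"]}
```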
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1