# a55f5de94c289243ea7f5b22be930596
This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [it-sv] dataset. It achieves the following results on the evaluation set:
- Loss: 2.0482
- Data Size: 1.0
- Epoch Runtime: 38.5429
- BLEU: 2.8624
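Since the model follows the standard T5 seq2seq interface, it can be loaded with `AutoTokenizer` and `AutoModelForSeq2SeqLM`. The snippet below is a minimal sketch: the repository id comes from this card, while the `translate Italian to Swedish:` task prefix and the example sentence are assumptions based on T5's usual translation prompting.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repository id taken from this model card
model_id = "contemmcm/a55f5de94c289243ea7f5b22be930596"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The task prefix is an assumption, following T5's usual translation prompt style
text = "translate Italian to Swedish: Il gatto dorme sul divano."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```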
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
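As a rough illustration only, the hyperparameters above correspond to a `Seq2SeqTrainingArguments` setup along these lines; the per-device batch sizes assume the 4-GPU configuration listed, and the `output_dir` name is hypothetical.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-large-opus-books-it-sv",  # hypothetical directory name
    learning_rate=5e-05,
    per_device_train_batch_size=8,           # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,            # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,              # needed to compute BLEU at evaluation time
)
```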
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 4.1119 | 0 | 3.1193 | 0.2144 |
| No log | 1 | 74 | 3.9589 | 0.0078 | 3.7150 | 0.2346 |
| No log | 2 | 148 | 3.3321 | 0.0156 | 5.2219 | 0.3324 |
| 0.1352 | 3 | 222 | 3.1186 | 0.0312 | 7.9675 | 0.4616 |
| 0.1352 | 4 | 296 | 2.9632 | 0.0625 | 9.8806 | 0.7536 |
| 0.2201 | 5 | 370 | 2.8418 | 0.125 | 12.8331 | 0.9133 |
| 0.2201 | 6 | 444 | 2.7179 | 0.25 | 16.7080 | 1.0366 |
| 0.6174 | 7 | 518 | 2.5883 | 0.5 | 22.9964 | 0.7716 |
| 1.7466 | 8 | 592 | 2.4355 | 1.0 | 38.6028 | 1.1599 |
| 2.4978 | 9 | 666 | 2.3423 | 1.0 | 37.9237 | 1.3622 |
| 2.4076 | 10 | 740 | 2.2758 | 1.0 | 38.4024 | 1.5715 |
| 2.2888 | 11 | 814 | 2.2214 | 1.0 | 38.4786 | 1.7239 |
| 2.2112 | 12 | 888 | 2.1839 | 1.0 | 38.3051 | 1.8597 |
| 2.101 | 13 | 962 | 2.1455 | 1.0 | 36.1867 | 1.8599 |
| 2.0638 | 14 | 1036 | 2.1136 | 1.0 | 37.0464 | 2.0712 |
| 1.9577 | 15 | 1110 | 2.0953 | 1.0 | 39.8808 | 2.1943 |
| 1.9117 | 16 | 1184 | 2.0838 | 1.0 | 36.4446 | 2.1814 |
| 1.8403 | 17 | 1258 | 2.0642 | 1.0 | 39.0805 | 2.2443 |
| 1.7974 | 18 | 1332 | 2.0671 | 1.0 | 38.9135 | 2.3334 |
| 1.7339 | 19 | 1406 | 2.0612 | 1.0 | 37.1128 | 2.4690 |
| 1.6825 | 20 | 1480 | 2.0455 | 1.0 | 41.0984 | 2.4958 |
| 1.6272 | 21 | 1554 | 2.0384 | 1.0 | 38.0693 | 2.5956 |
| 1.5797 | 22 | 1628 | 2.0517 | 1.0 | 38.6636 | 2.6271 |
| 1.5208 | 23 | 1702 | 2.0343 | 1.0 | 37.4619 | 2.6566 |
| 1.4857 | 24 | 1776 | 2.0297 | 1.0 | 37.8110 | 2.6490 |
| 1.4415 | 25 | 1850 | 2.0412 | 1.0 | 37.4855 | 2.7629 |
| 1.4072 | 26 | 1924 | 2.0330 | 1.0 | 39.4528 | 2.7936 |
| 1.3842 | 27 | 1998 | 2.0452 | 1.0 | 39.0299 | 2.8494 |
| 1.3323 | 28 | 2072 | 2.0482 | 1.0 | 38.5429 | 2.8624 |
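The BLEU column above is computed on the evaluation split after each epoch. A minimal way to reproduce this kind of corpus-level score is the `evaluate` library's `sacrebleu` metric; the predictions and references below are placeholder examples, not data from the actual evaluation set.

```python
import evaluate

# Placeholder data; in practice, predictions come from model.generate() on the
# evaluation split and references are the corresponding Swedish target sentences.
predictions = ["katten sover på soffan"]
references = [["katten sover på soffan"]]

sacrebleu = evaluate.load("sacrebleu")
result = sacrebleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```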
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1