# 69e88b75427c5b57da0bc9096c70bc6b
This model is a fine-tuned version of google/long-t5-tglobal-xl on the Helsinki-NLP/opus_books [fi-no] dataset. It achieves the following results on the evaluation set:
- Loss: 1.9816
- Data Size: 1.0 (fraction of the training set used)
- Epoch Runtime: 62.9782 seconds
- Bleu: 1.1876
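
Since the card does not yet include a usage snippet, a minimal inference sketch follows. The repo id contemmcm/69e88b75427c5b57da0bc9096c70bc6b is the one this card is published under; feeding the raw Finnish sentence without a task prefix is an assumption, as the exact input format used during fine-tuning is not documented here.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id taken from this card; the prefix-free input format is an assumption.
model_id = "contemmcm/69e88b75427c5b57da0bc9096c70bc6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Translate a Finnish sentence to Norwegian.
inputs = tokenizer("Hyvää huomenta!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```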
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
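
Pending fuller documentation, here is a minimal sketch of loading the named corpus with the Datasets library; the 90/10 train/eval split and the seed are assumptions, since the card does not state how the evaluation set was produced.

```python
from datasets import load_dataset

# Finnish-Norwegian pairs from OPUS Books (config name taken from this card).
dataset = load_dataset("Helsinki-NLP/opus_books", "fi-no")

# Assumption: a held-out evaluation split; the actual split is undocumented.
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
train_data, eval_data = splits["train"], splits["test"]
print(train_data[0]["translation"])  # {'fi': '...', 'no': '...'}
```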
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
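
As a reproducibility aid, the sketch below maps the values above onto Seq2SeqTrainingArguments. The output directory is a placeholder, the per-device batch sizes follow from the 4-device setup, and the torchrun launch line is an assumption about how the multi-GPU run was started.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-tglobal-xl-fi-no",  # hypothetical placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,   # x4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
# Assumed launch for the multi-GPU distributed setup:
#   torchrun --nproc_per_node=4 train.py
```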
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 3.5679 | 0 | 4.3721 | 0.0584 |
| No log | 1 | 85 | 3.0375 | 0.0078 | 5.3873 | 0.0375 |
| No log | 2 | 170 | 2.7687 | 0.0156 | 9.2871 | 0.0840 |
| No log | 3 | 255 | 2.6347 | 0.0312 | 15.3733 | 0.1054 |
| No log | 4 | 340 | 2.5443 | 0.0625 | 20.3571 | 0.1102 |
| 0.1993 | 5 | 425 | 2.4398 | 0.125 | 23.7270 | 0.2683 |
| 0.1993 | 6 | 510 | 2.3298 | 0.25 | 27.7623 | 0.3713 |
| 0.8408 | 7 | 595 | 2.2112 | 0.5 | 41.8405 | 0.5285 |
| 2.4407 | 8 | 680 | 2.1158 | 1.0 | 63.3788 | 0.6352 |
| 2.2352 | 9 | 765 | 2.0510 | 1.0 | 62.6046 | 0.7461 |
| 2.1121 | 10 | 850 | 2.0068 | 1.0 | 58.5871 | 0.8303 |
| 2.0228 | 11 | 935 | 1.9786 | 1.0 | 61.3216 | 0.8817 |
| 1.9092 | 12 | 1020 | 1.9576 | 1.0 | 56.9923 | 1.0020 |
| 1.8284 | 13 | 1105 | 1.9445 | 1.0 | 60.1908 | 1.0769 |
| 1.7448 | 14 | 1190 | 1.9276 | 1.0 | 62.9548 | 1.0535 |
| 1.6732 | 15 | 1275 | 1.9263 | 1.0 | 58.6772 | 1.1010 |
| 1.5876 | 16 | 1360 | 1.9323 | 1.0 | 62.0639 | 1.1050 |
| 1.4991 | 17 | 1445 | 1.9479 | 1.0 | 59.8392 | 1.1192 |
| 1.4189 | 18 | 1530 | 1.9636 | 1.0 | 59.2265 | 1.1803 |
| 1.3422 | 19 | 1615 | 1.9816 | 1.0 | 62.9782 | 1.1876 |
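
For reference, BLEU scores like those in the table can be computed with the evaluate library; sacrebleu is an assumption, as the card does not name the BLEU implementation or tokenization used.

```python
import evaluate

# Assumption: a sacrebleu-style corpus BLEU (scores on a 0-100 scale).
bleu = evaluate.load("sacrebleu")

predictions = ["God morgen!"]    # decoded model outputs (Norwegian)
references = [["God morgen!"]]   # one list of reference strings per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```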
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1