# 038d5148187e070b975e9031342ef73a
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [fr-pt] dataset. It achieves the following results on the evaluation set:
- Loss: 4.5345
- Data Size: 1.0
- Epoch Runtime: 20.1158
- Bleu: 0.7389
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
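The card does not describe the data pipeline, but the Helsinki-NLP/opus_books `fr-pt` split stores each example as a `translation` dict keyed by language code. A minimal preprocessing sketch follows; the task prefix and the `source`/`target` field names are assumptions for illustration, not taken from this card:

```python
# Sketch: turn an opus_books-style record into source/target strings
# for a T5-family seq2seq model. The task prefix is an assumption;
# this card does not state which prefix (if any) was used.

def preprocess(example, prefix="translate French to Portuguese: "):
    """Map one opus_books record to prefixed source and plain target text."""
    pair = example["translation"]
    return {
        "source": prefix + pair["fr"],
        "target": pair["pt"],
    }

record = {"translation": {"fr": "Bonjour le monde", "pt": "Olá mundo"}}
out = preprocess(record)
print(out["source"])  # prefixed French source sentence
print(out["target"])  # Portuguese target sentence
```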
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
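The per-device and total batch sizes above are mutually consistent: with 4 devices, the effective batch size is the per-device size times the device count. A quick arithmetic check (`gradient_accumulation_steps=1` is an assumption, since the card lists no accumulation):

```python
# Check that the reported total batch sizes follow from the
# per-device settings; gradient_accumulation_steps=1 is assumed
# because no accumulation is listed on the card.
train_batch_size = 8
eval_batch_size = 8
num_devices = 4
gradient_accumulation_steps = 1  # assumption

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 32, matching the card
print(total_eval_batch_size)   # 32, matching the card
```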
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 236.7255 | 0 | 1.7678 | 0.0204 |
| No log | 1.0 | 31 | 219.1461 | 0.0078 | 2.4638 | 0.0188 |
| No log | 2.0 | 62 | 201.9752 | 0.0156 | 3.1682 | 0.0124 |
| No log | 3.0 | 93 | 187.6383 | 0.0312 | 4.4594 | 0.0122 |
| No log | 4.0 | 124 | 163.7096 | 0.0625 | 6.4885 | 0.0215 |
| No log | 5.0 | 155 | 130.6493 | 0.125 | 8.8677 | 0.0198 |
| No log | 6.0 | 186 | 79.7999 | 0.25 | 11.0911 | 0.0040 |
| 19.2232 | 7.0 | 217 | 37.5892 | 0.5 | 15.1334 | 0.0052 |
| 19.2232 | 8.0 | 248 | 17.3204 | 1.0 | 21.9644 | 0.0283 |
| 27.1063 | 9.0 | 279 | 14.9082 | 1.0 | 20.2933 | 0.2114 |
| 23.9019 | 10.0 | 310 | 12.7926 | 1.0 | 20.2024 | 0.2739 |
| 23.9019 | 11.0 | 341 | 11.3982 | 1.0 | 19.4193 | 0.2784 |
| 18.8143 | 12.0 | 372 | 10.3658 | 1.0 | 19.5572 | 0.0558 |
| 16.1903 | 13.0 | 403 | 9.2900 | 1.0 | 20.0680 | 0.1722 |
| 16.1903 | 14.0 | 434 | 9.6809 | 1.0 | 20.0169 | 0.0760 |
| 14.6433 | 15.0 | 465 | 8.9208 | 1.0 | 19.4845 | 0.0727 |
| 14.6433 | 16.0 | 496 | 8.0822 | 1.0 | 19.7651 | 0.0400 |
| 13.4211 | 17.0 | 527 | 8.3107 | 1.0 | 19.3867 | 0.0532 |
| 12.4357 | 18.0 | 558 | 8.0146 | 1.0 | 19.9804 | 0.0798 |
| 12.4357 | 19.0 | 589 | 7.4550 | 1.0 | 20.1282 | 0.2350 |
| 11.5437 | 20.0 | 620 | 7.0633 | 1.0 | 20.3499 | 0.3080 |
| 10.7571 | 21.0 | 651 | 6.6771 | 1.0 | 19.6223 | 0.2713 |
| 10.7571 | 22.0 | 682 | 6.4946 | 1.0 | 19.9972 | 0.3405 |
| 10.1704 | 23.0 | 713 | 6.5607 | 1.0 | 19.7047 | 0.3599 |
| 10.1704 | 24.0 | 744 | 6.3228 | 1.0 | 20.5115 | 0.4006 |
| 9.5748 | 25.0 | 775 | 6.2065 | 1.0 | 20.2679 | 0.4567 |
| 9.1474 | 26.0 | 806 | 6.1813 | 1.0 | 19.4314 | 0.2706 |
| 9.1474 | 27.0 | 837 | 6.0840 | 1.0 | 19.2637 | 0.3095 |
| 8.7856 | 28.0 | 868 | 5.8688 | 1.0 | 19.8345 | 0.4495 |
| 8.7856 | 29.0 | 899 | 5.6269 | 1.0 | 19.7177 | 0.5039 |
| 8.3737 | 30.0 | 930 | 5.6042 | 1.0 | 19.7356 | 0.5152 |
| 8.0599 | 31.0 | 961 | 5.6617 | 1.0 | 19.7495 | 0.4316 |
| 8.0599 | 32.0 | 992 | 5.7713 | 1.0 | 19.9040 | 0.5404 |
| 7.809 | 33.0 | 1023 | 5.4166 | 1.0 | 20.1333 | 0.5372 |
| 7.4953 | 34.0 | 1054 | 5.3529 | 1.0 | 19.5804 | 0.5465 |
| 7.4953 | 35.0 | 1085 | 5.4388 | 1.0 | 19.7150 | 0.5503 |
| 7.1911 | 36.0 | 1116 | 5.0335 | 1.0 | 19.7163 | 0.5790 |
| 7.1911 | 37.0 | 1147 | 5.1159 | 1.0 | 19.9428 | 0.6223 |
| 6.9716 | 38.0 | 1178 | 5.0591 | 1.0 | 20.3290 | 0.4831 |
| 6.7211 | 39.0 | 1209 | 4.8556 | 1.0 | 20.0206 | 0.6963 |
| 6.7211 | 40.0 | 1240 | 5.0075 | 1.0 | 20.1885 | 0.6027 |
| 6.5236 | 41.0 | 1271 | 4.7169 | 1.0 | 21.1392 | 0.7565 |
| 6.3467 | 42.0 | 1302 | 4.7961 | 1.0 | 20.7119 | 0.6435 |
| 6.3467 | 43.0 | 1333 | 4.6043 | 1.0 | 19.7915 | 0.6827 |
| 6.1671 | 44.0 | 1364 | 4.5811 | 1.0 | 19.8226 | 0.7857 |
| 6.1671 | 45.0 | 1395 | 4.7868 | 1.0 | 19.8787 | 0.6117 |
| 6.0261 | 46.0 | 1426 | 4.5726 | 1.0 | 20.7483 | 0.6840 |
| 5.8656 | 47.0 | 1457 | 4.4508 | 1.0 | 20.1728 | 0.7430 |
| 5.8656 | 48.0 | 1488 | 4.4236 | 1.0 | 19.9059 | 0.8194 |
| 5.6751 | 49.0 | 1519 | 4.4156 | 1.0 | 20.3834 | 0.7308 |
| 5.5485 | 50.0 | 1550 | 4.5345 | 1.0 | 20.1158 | 0.7389 |
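The Bleu column appears to be on a 0–1 scale (sacreBLEU conventionally reports 0–100), so the final score of 0.7389 should be read accordingly. As a reference point for how the metric works, here is a minimal sentence-level BLEU with uniform n-gram weights and a brevity penalty; it illustrates the metric in general and is not the exact scorer used for this card:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU on a 0-1 scale: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    cand = candidate.split()
    ref = reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c_counts = Counter(ngrams(cand, n))
        r_counts = Counter(ngrams(ref, n))
        overlap = sum(min(c, r_counts[g]) for g, c in c_counts.items())
        total = max(sum(c_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # any empty n-gram overlap zeroes the geometric mean
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
```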
## Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
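To approximate this environment, the listed versions can be pinned with pip. Note the card's `2.8.0+cu128` PyTorch build comes from the CUDA-specific wheel index, so the plain `torch==2.8.0` pin below is an approximation:

```shell
# Pin the framework versions listed above; the +cu128 PyTorch build
# needs the CUDA wheel index, so plain torch==2.8.0 is used here.
pip install "transformers==4.57.0" "torch==2.8.0" "datasets==4.2.0" "tokenizers==0.22.1"
```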