2ceaea9e502b528e9538edb1cb9746aa
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [it-sv] dataset. It achieves the following results on the evaluation set:
- Loss: 3.1414
- Data Size: 1.0
- Epoch Runtime: 38.1692
- Bleu: 0.2150
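The figures above match the final row (epoch 50) of the training results table, which is not the row with the highest BLEU. As a minimal sketch, assuming only the last five rows of that table and using illustrative variable names, the best epoch under each metric can be compared like this:

```python
# (epoch, validation_loss, bleu) copied from the last five rows of the
# training results table in this card.
rows = [
    (46, 3.3593, 0.1094),
    (47, 3.2341, 0.1838),
    (48, 3.1679, 0.2302),
    (49, 3.1729, 0.1509),
    (50, 3.1414, 0.2150),
]

# Lower validation loss is better; higher BLEU is better.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_bleu = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest BLEU:", best_by_bleu)
```

On these rows the lowest validation loss falls on epoch 50 and the highest BLEU on epoch 48, so the choice of selection metric matters if an intermediate checkpoint is available.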
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
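The batch-size figures in this list are related by simple arithmetic. The sketch below reproduces them, assuming plain data-parallel training with no gradient accumulation (none is listed above); the variable names are illustrative and `final_step` is taken from the last row of the training results table.

```python
# Values copied from the hyperparameter list in this card.
train_batch_size = 8   # per-device batch size
num_devices = 4        # GPUs in the multi-GPU run
num_epochs = 50
final_step = 3700      # last value in the Step column of the results table

# Effective (global) batch size, assuming no gradient accumulation.
total_train_batch_size = train_batch_size * num_devices

# Steps logged per epoch, as implied by the Step column.
steps_per_epoch = final_step // num_epochs

print(total_train_batch_size)  # 32
print(steps_per_epoch)         # 74
```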
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 221.5325 | 0 | 3.0682 | 0.0017 |
| No log | 1 | 74 | 198.9404 | 0.0078 | 4.5653 | 0.0030 |
| No log | 2 | 148 | 168.5888 | 0.0156 | 5.5722 | 0.0038 |
| 6.7009 | 3 | 222 | 143.3611 | 0.0312 | 7.2210 | 0.0039 |
| 6.7009 | 4 | 296 | 113.6552 | 0.0625 | 9.3702 | 0.0033 |
| 9.5048 | 5 | 370 | 64.5368 | 0.125 | 12.2335 | 0.0015 |
| 9.5048 | 6 | 444 | 28.8386 | 0.25 | 16.2880 | 0.0032 |
| 13.9289 | 7 | 518 | 16.8321 | 0.5 | 23.9353 | 0.0021 |
| 16.6125 | 8.0 | 592 | 11.4163 | 1.0 | 39.7476 | 0.0030 |
| 16.3322 | 9.0 | 666 | 9.6635 | 1.0 | 37.4544 | 0.0030 |
| 14.8335 | 10.0 | 740 | 9.1268 | 1.0 | 37.3580 | 0.0034 |
| 12.4568 | 11.0 | 814 | 8.0397 | 1.0 | 37.2919 | 0.0093 |
| 11.6423 | 12.0 | 888 | 7.3618 | 1.0 | 37.3146 | 0.0065 |
| 10.6014 | 13.0 | 962 | 7.2995 | 1.0 | 37.3232 | 0.0111 |
| 10.1262 | 14.0 | 1036 | 6.5569 | 1.0 | 37.7075 | 0.0083 |
| 9.2616 | 15.0 | 1110 | 6.1059 | 1.0 | 37.4836 | 0.0434 |
| 8.889 | 16.0 | 1184 | 5.7652 | 1.0 | 37.4712 | 0.0298 |
| 8.3487 | 17.0 | 1258 | 5.4665 | 1.0 | 37.5588 | 0.0884 |
| 8.0774 | 18.0 | 1332 | 5.2253 | 1.0 | 37.0616 | 0.0377 |
| 7.5902 | 19.0 | 1406 | 5.3379 | 1.0 | 37.2366 | 0.0450 |
| 7.404 | 20.0 | 1480 | 5.1511 | 1.0 | 37.1286 | 0.0331 |
| 7.0 | 21.0 | 1554 | 4.8344 | 1.0 | 37.0656 | 0.0798 |
| 6.8283 | 22.0 | 1628 | 4.8185 | 1.0 | 37.7522 | 0.0704 |
| 6.5355 | 23.0 | 1702 | 4.6294 | 1.0 | 37.0545 | 0.0906 |
| 6.4027 | 24.0 | 1776 | 4.4492 | 1.0 | 36.9281 | 0.0587 |
| 6.0659 | 25.0 | 1850 | 4.3216 | 1.0 | 37.2367 | 0.1275 |
| 5.9587 | 26.0 | 1924 | 4.2734 | 1.0 | 37.1520 | 0.1169 |
| 5.8082 | 27.0 | 1998 | 4.1049 | 1.0 | 37.4815 | 0.1064 |
| 5.6503 | 28.0 | 2072 | 4.0866 | 1.0 | 37.3501 | 0.1118 |
| 5.4901 | 29.0 | 2146 | 4.0389 | 1.0 | 37.5989 | 0.0751 |
| 5.342 | 30.0 | 2220 | 4.0093 | 1.0 | 37.1668 | 0.1192 |
| 5.2011 | 31.0 | 2294 | 3.8163 | 1.0 | 37.9941 | 0.1822 |
| 5.1137 | 32.0 | 2368 | 3.7344 | 1.0 | 38.1650 | 0.1568 |
| 5.0031 | 33.0 | 2442 | 3.8307 | 1.0 | 37.3691 | 0.0590 |
| 4.8702 | 34.0 | 2516 | 3.6911 | 1.0 | 37.8558 | 0.1390 |
| 4.8229 | 35.0 | 2590 | 3.7435 | 1.0 | 37.3599 | 0.1256 |
| 4.6841 | 36.0 | 2664 | 3.6212 | 1.0 | 38.0017 | 0.0937 |
| 4.607 | 37.0 | 2738 | 3.5849 | 1.0 | 37.0052 | 0.0807 |
| 4.4539 | 38.0 | 2812 | 3.6441 | 1.0 | 36.9756 | 0.0517 |
| 4.446 | 39.0 | 2886 | 3.5075 | 1.0 | 38.1396 | 0.0910 |
| 4.3089 | 40.0 | 2960 | 3.5159 | 1.0 | 36.8978 | 0.1166 |
| 4.2739 | 41.0 | 3034 | 3.3696 | 1.0 | 37.1386 | 0.1391 |
| 4.1601 | 42.0 | 3108 | 3.4174 | 1.0 | 38.0813 | 0.0957 |
| 4.0882 | 43.0 | 3182 | 3.3701 | 1.0 | 37.5271 | 0.1303 |
| 4.0148 | 44.0 | 3256 | 3.2427 | 1.0 | 37.6244 | 0.1631 |
| 3.9753 | 45.0 | 3330 | 3.2444 | 1.0 | 37.7280 | 0.1849 |
| 3.9001 | 46.0 | 3404 | 3.3593 | 1.0 | 37.2999 | 0.1094 |
| 3.8799 | 47.0 | 3478 | 3.2341 | 1.0 | 37.1080 | 0.1838 |
| 3.7997 | 48.0 | 3552 | 3.1679 | 1.0 | 37.5732 | 0.2302 |
| 3.7542 | 49.0 | 3626 | 3.1729 | 1.0 | 37.7116 | 0.1509 |
| 3.686 | 50.0 | 3700 | 3.1414 | 1.0 | 38.1692 | 0.2150 |
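The Data Size column appears to double each epoch until it reaches 1.0 at epoch 8. The card does not document this schedule, so the sketch below is only an inferred reconstruction: the function `data_fraction` and its `warmup_epochs` parameter are hypothetical names, not part of the training code.

```python
def data_fraction(epoch: int, warmup_epochs: int = 8) -> float:
    """Hypothetical schedule: 0 at epoch 0, then doubling up to a cap of 1.0."""
    if epoch <= 0:
        return 0.0
    # 2 ** (epoch - warmup_epochs) gives 1/128 at epoch 1 and 1.0 at epoch 8.
    return min(1.0, 2.0 ** (epoch - warmup_epochs))


# Data Size values reported in the table for epochs 0 through 9.
reported = [0, 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0, 1.0]

for epoch, value in enumerate(reported):
    # Compare the inferred schedule with the reported (rounded) value.
    print(epoch, value, data_fraction(epoch))
```

If the inference holds, the jump in Epoch Runtime from roughly 24 to roughly 38 between epochs 7 and 8 coincides with the first pass over the full data.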
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/2ceaea9e502b528e9538edb1cb9746aa
Base model
google/long-t5-local-large