862ec78bac8109fd96f6d604da3f9089
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [en-sv] dataset. It achieves the following results on the evaluation set:
- Loss: 3.0722
- Data Size: 1.0
- Epoch Runtime: 39.7266
- Bleu: 0.5534
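A minimal inference sketch for trying the checkpoint, assuming it is published under the `contemmcm/862ec78bac8109fd96f6d604da3f9089` repository named on this card and that the standard `transformers` seq2seq API applies:

```python
def translate(text, model_name="contemmcm/862ec78bac8109fd96f6d604da3f9089"):
    """Translate an English sentence to Swedish with the fine-tuned LongT5 model.

    The model id is taken from this card's repository name; adjust it if the
    checkpoint lives elsewhere.
    """
    # Lazy import so the function can be defined without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(translate("The book lay open on the table."))
```

Note that with a BLEU of 0.55 on this evaluation set, translation quality should be expected to be rough.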
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
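The derived totals above follow from the per-device settings; a quick arithmetic check, using the 77 optimizer steps per full-data epoch visible in the training table below:

```python
# Effective batch size under multi-GPU data parallelism:
train_batch_size = 8   # per device
num_devices = 4
total_train_batch_size = train_batch_size * num_devices
print(total_train_batch_size)  # 32, matching the value reported above

# With 77 optimizer steps per full-data epoch, the training split holds
# roughly 77 * 32 = 2464 examples (the last batch may be partial).
steps_per_epoch = 77
approx_train_examples = steps_per_epoch * total_train_batch_size
print(approx_train_examples)  # 2464
```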
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 223.9917 | 0 | 3.2371 | 0.0024 |
| No log | 1 | 77 | 202.5292 | 0.0078 | 3.8995 | 0.0043 |
| No log | 2 | 154 | 178.5526 | 0.0156 | 5.0033 | 0.0026 |
| No log | 3 | 231 | 154.7477 | 0.0312 | 6.6561 | 0.0027 |
| No log | 4 | 308 | 122.7218 | 0.0625 | 8.8027 | 0.0024 |
| No log | 5 | 385 | 71.8897 | 0.125 | 12.3707 | 0.0040 |
| 11.0887 | 6 | 462 | 29.8310 | 0.25 | 16.7527 | 0.0041 |
| 15.6486 | 7 | 539 | 15.4986 | 0.5 | 25.3639 | 0.0030 |
| 20.3585 | 8.0 | 616 | 11.0047 | 1.0 | 40.4686 | 0.0045 |
| 17.1318 | 9.0 | 693 | 9.3429 | 1.0 | 40.0204 | 0.0144 |
| 13.9398 | 10.0 | 770 | 8.7452 | 1.0 | 40.0600 | 0.0107 |
| 12.9449 | 11.0 | 847 | 7.9409 | 1.0 | 41.2014 | 0.0120 |
| 11.4343 | 12.0 | 924 | 7.5845 | 1.0 | 40.5638 | 0.0138 |
| 10.4966 | 13.0 | 1001 | 6.5332 | 1.0 | 40.3904 | 0.0194 |
| 10.1049 | 14.0 | 1078 | 6.4489 | 1.0 | 40.2994 | 0.0356 |
| 9.3497 | 15.0 | 1155 | 6.2815 | 1.0 | 39.4442 | 0.0545 |
| 9.0683 | 16.0 | 1232 | 5.8145 | 1.0 | 39.6385 | 0.0658 |
| 8.4766 | 17.0 | 1309 | 5.8225 | 1.0 | 39.6333 | 0.0533 |
| 8.2237 | 18.0 | 1386 | 5.3779 | 1.0 | 39.1000 | 0.1326 |
| 7.806 | 19.0 | 1463 | 5.2641 | 1.0 | 39.4449 | 0.1065 |
| 7.5591 | 20.0 | 1540 | 5.1220 | 1.0 | 39.8956 | 0.0928 |
| 7.1678 | 21.0 | 1617 | 5.0954 | 1.0 | 38.9583 | 0.0739 |
| 7.0647 | 22.0 | 1694 | 4.8270 | 1.0 | 39.4165 | 0.0900 |
| 6.6655 | 23.0 | 1771 | 4.7228 | 1.0 | 38.8910 | 0.1248 |
| 6.4936 | 24.0 | 1848 | 4.5843 | 1.0 | 39.0794 | 0.1653 |
| 6.2763 | 25.0 | 1925 | 4.6011 | 1.0 | 38.8424 | 0.2135 |
| 6.0459 | 26.0 | 2002 | 4.3207 | 1.0 | 39.4661 | 0.2475 |
| 5.8962 | 27.0 | 2079 | 4.1732 | 1.0 | 38.6549 | 0.2606 |
| 5.6867 | 28.0 | 2156 | 4.2663 | 1.0 | 39.7735 | 0.2386 |
| 5.5789 | 29.0 | 2233 | 4.2129 | 1.0 | 38.3106 | 0.1862 |
| 5.3694 | 30.0 | 2310 | 4.0500 | 1.0 | 39.1182 | 0.3258 |
| 5.2994 | 31.0 | 2387 | 3.8977 | 1.0 | 39.5137 | 0.2466 |
| 5.097 | 32.0 | 2464 | 3.7638 | 1.0 | 39.5681 | 0.3649 |
| 5.0502 | 33.0 | 2541 | 3.6888 | 1.0 | 38.6806 | 0.4253 |
| 4.9306 | 34.0 | 2618 | 3.8535 | 1.0 | 39.7200 | 0.2963 |
| 4.7802 | 35.0 | 2695 | 3.6710 | 1.0 | 38.4306 | 0.4275 |
| 4.6324 | 36.0 | 2772 | 3.6098 | 1.0 | 38.7097 | 0.3102 |
| 4.6086 | 37.0 | 2849 | 3.6183 | 1.0 | 38.4791 | 0.3835 |
| 4.4612 | 38.0 | 2926 | 3.4896 | 1.0 | 39.1358 | 0.4617 |
| 4.3899 | 39.0 | 3003 | 3.4306 | 1.0 | 39.7051 | 0.4606 |
| 4.3095 | 40.0 | 3080 | 3.4122 | 1.0 | 38.4065 | 0.4714 |
| 4.1906 | 41.0 | 3157 | 3.3982 | 1.0 | 39.5659 | 0.3783 |
| 4.1157 | 42.0 | 3234 | 3.3071 | 1.0 | 39.3906 | 0.5062 |
| 4.0427 | 43.0 | 3311 | 3.3112 | 1.0 | 39.0938 | 0.4346 |
| 3.9858 | 44.0 | 3388 | 3.3214 | 1.0 | 39.8291 | 0.4165 |
| 3.894 | 45.0 | 3465 | 3.3178 | 1.0 | 39.3775 | 0.4522 |
| 3.8483 | 46.0 | 3542 | 3.2357 | 1.0 | 40.2369 | 0.4297 |
| 3.7456 | 47.0 | 3619 | 3.1341 | 1.0 | 39.2569 | 0.5572 |
| 3.7164 | 48.0 | 3696 | 3.0997 | 1.0 | 39.4336 | 0.6396 |
| 3.6318 | 49.0 | 3773 | 3.0787 | 1.0 | 40.4189 | 0.5809 |
| 3.5901 | 50.0 | 3850 | 3.0722 | 1.0 | 39.7266 | 0.5534 |
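The Data Size column follows a doubling curriculum: training starts on roughly 1/128 of the data and doubles the fraction each epoch until the full set is used from epoch 8 onward. A sketch of that schedule (the closed form is an assumption inferred from the column, not taken from the training code):

```python
def data_fraction(epoch):
    """Fraction of the training set used at a given epoch (inferred schedule)."""
    if epoch == 0:
        return 0.0                       # epoch 0 is the pre-training baseline row
    return min(1.0, 2.0 ** (epoch - 8)) # 1/128, 1/64, ..., 1/2, then 1.0

print([round(data_fraction(e), 4) for e in range(9)])
# matches the Data Size column: [0.0, 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0]
```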
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1