6d24a5eb768a198ae73b54fa54072735
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [es-nl] dataset. It achieves the following results on the evaluation set:
- Loss: 1.9430
- Data Size: 1.0
- Epoch Runtime: 349.3459
- Bleu: 1.4024
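The Data Size value appears to be the fraction of the training set used in a given epoch. In the Training results table it is 0 at epoch 0, then roughly doubles each epoch from 0.0078 (about 1/128) until it reaches 1.0 at epoch 8, where it stays. The sketch below reproduces that pattern. The doubling rule, the `full_data_epoch` parameter, and the function name `data_fraction` are inferred from the table for illustration; the card does not document them.

```python
def data_fraction(epoch: int, full_data_epoch: int = 8) -> float:
    """Fraction of the training data used at a given epoch.

    Assumption: the fraction doubles every epoch and reaches 1.0 at
    `full_data_epoch`, as the Data Size column of the table suggests.
    """
    if epoch <= 0:
        return 0.0  # epoch 0 is the evaluation before any training
    return min(1.0, 2.0 ** (epoch - full_data_epoch))


# Print the inferred schedule for the first ten epochs.
for epoch in range(0, 10):
    print(epoch, data_fraction(epoch))
```

Under this assumption, only epochs 8 through 50 train on the full dataset, which is consistent with the Epoch Runtime column levelling off at about 350 from epoch 8 onward.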
Model description
More information needed
Intended uses & limitations
More information needed
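Until this section is filled in, the following is a minimal sketch of how the checkpoint could be loaded and run with the Transformers library. It makes several assumptions: the repository ID is the one shown in the model tree entry on this page, the translation direction is Spanish to Dutch, no task prefix is prepended to the input, and `max_new_tokens=128` is an arbitrary choice. The function name `translate` is hypothetical.

```python
# Repository ID as listed in the model tree entry of this card.
MODEL_ID = "contemmcm/6d24a5eb768a198ae73b54fa54072735"


def translate(text: str, max_new_tokens: int = 128) -> str:
    """Translate one input string with the fine-tuned checkpoint.

    Assumptions: Spanish input, Dutch output, no task prefix.
    """
    # Imported here so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize, generate, and decode the first (only) output sequence.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("Era una noche oscura.")` downloads the weights on first use. If the model was trained with a task prefix, the input string would need that prefix as well.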
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
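The derived values in this list can be checked with a little arithmetic. The sketch below recomputes the total batch size from the per-device batch size and device count, and the final step count from the 806-step interval between consecutive rows of the Training results table. It assumes no gradient accumulation, since none is listed above.

```python
# Values copied from the hyperparameter list and the results table.
train_batch_size = 8    # per-device batch size
num_devices = 4
num_epochs = 50
steps_per_epoch = 806   # step difference between consecutive table rows

# Assumption: no gradient accumulation (none is listed in the card).
total_train_batch_size = train_batch_size * num_devices
total_steps = steps_per_epoch * num_epochs

# Upper bound on examples per full epoch; the last batch may be partial.
max_examples_per_epoch = steps_per_epoch * total_train_batch_size

print(total_train_batch_size, total_steps, max_examples_per_epoch)
```

The results agree with the reported `total_train_batch_size` of 32 and with the final step count of 40300 in the table.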
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 218.2100 | 0 | 25.0722 | 0.0029 |
| No log | 1 | 806 | 135.8632 | 0.0078 | 28.4186 | 0.0026 |
| No log | 2 | 1612 | 72.6377 | 0.0156 | 32.3281 | 0.0012 |
| No log | 3 | 2418 | 25.0309 | 0.0312 | 38.3946 | 0.0007 |
| 2.2782 | 4 | 3224 | 14.2780 | 0.0625 | 48.6758 | 0.0033 |
| 18.3293 | 5 | 4030 | 10.1719 | 0.125 | 69.3327 | 0.0069 |
| 12.7345 | 6 | 4836 | 7.7438 | 0.25 | 108.8321 | 0.0237 |
| 8.7676 | 7 | 5642 | 5.8010 | 0.5 | 189.2463 | 0.0223 |
| 5.9844 | 8.0 | 6448 | 4.0580 | 1.0 | 348.8722 | 0.0502 |
| 4.6451 | 9.0 | 7254 | 3.4634 | 1.0 | 352.2134 | 0.0946 |
| 4.0376 | 10.0 | 8060 | 3.0997 | 1.0 | 353.1616 | 0.1436 |
| 3.6333 | 11.0 | 8866 | 2.8809 | 1.0 | 350.2432 | 0.1828 |
| 3.4008 | 12.0 | 9672 | 2.7696 | 1.0 | 351.4368 | 0.1934 |
| 3.2335 | 13.0 | 10478 | 2.6920 | 1.0 | 350.9128 | 0.2562 |
| 3.0756 | 14.0 | 11284 | 2.6157 | 1.0 | 352.3415 | 0.2855 |
| 2.9579 | 15.0 | 12090 | 2.5355 | 1.0 | 351.0503 | 0.3491 |
| 2.8798 | 16.0 | 12896 | 2.5249 | 1.0 | 351.3586 | 0.3560 |
| 2.7897 | 17.0 | 13702 | 2.4580 | 1.0 | 350.1344 | 0.4317 |
| 2.7262 | 18.0 | 14508 | 2.4061 | 1.0 | 351.1663 | 0.4623 |
| 2.6962 | 19.0 | 15314 | 2.3797 | 1.0 | 349.7022 | 0.5310 |
| 2.6126 | 20.0 | 16120 | 2.3499 | 1.0 | 351.0222 | 0.5647 |
| 2.5806 | 21.0 | 16926 | 2.3145 | 1.0 | 351.7137 | 0.5577 |
| 2.5476 | 22.0 | 17732 | 2.2874 | 1.0 | 351.7715 | 0.5803 |
| 2.5206 | 23.0 | 18538 | 2.2583 | 1.0 | 353.4597 | 0.6327 |
| 2.4558 | 24.0 | 19344 | 2.2414 | 1.0 | 353.0918 | 0.6507 |
| 2.4214 | 25.0 | 20150 | 2.2215 | 1.0 | 352.0976 | 0.6475 |
| 2.3758 | 26.0 | 20956 | 2.1898 | 1.0 | 348.2238 | 0.7510 |
| 2.3366 | 27.0 | 21762 | 2.1735 | 1.0 | 349.3065 | 0.7297 |
| 2.3167 | 28.0 | 22568 | 2.1639 | 1.0 | 349.5059 | 0.7622 |
| 2.2714 | 29.0 | 23374 | 2.1394 | 1.0 | 351.9686 | 0.7805 |
| 2.2556 | 30.0 | 24180 | 2.1266 | 1.0 | 352.5284 | 0.8544 |
| 2.2305 | 31.0 | 24986 | 2.1105 | 1.0 | 355.9961 | 0.8285 |
| 2.1928 | 32.0 | 25792 | 2.1064 | 1.0 | 355.1299 | 0.8994 |
| 2.1717 | 33.0 | 26598 | 2.0856 | 1.0 | 351.6526 | 0.9018 |
| 2.1266 | 34.0 | 27404 | 2.0632 | 1.0 | 354.7243 | 0.9453 |
| 2.1371 | 35.0 | 28210 | 2.0498 | 1.0 | 352.0551 | 0.9594 |
| 2.0909 | 36.0 | 29016 | 2.0523 | 1.0 | 350.2846 | 0.9577 |
| 2.0623 | 37.0 | 29822 | 2.0360 | 1.0 | 351.6120 | 1.0107 |
| 2.0601 | 38.0 | 30628 | 2.0212 | 1.0 | 352.5911 | 1.0315 |
| 2.0236 | 39.0 | 31434 | 2.0179 | 1.0 | 349.9444 | 1.0745 |
| 1.9903 | 40.0 | 32240 | 1.9996 | 1.0 | 351.4016 | 1.1236 |
| 1.9811 | 41.0 | 33046 | 1.9928 | 1.0 | 351.0448 | 1.1127 |
| 1.9425 | 42.0 | 33852 | 1.9838 | 1.0 | 351.3833 | 1.1870 |
| 1.9200 | 43.0 | 34658 | 1.9741 | 1.0 | 349.1373 | 1.2144 |
| 1.8854 | 44.0 | 35464 | 1.9738 | 1.0 | 352.3969 | 1.2406 |
| 1.8909 | 45.0 | 36270 | 1.9711 | 1.0 | 353.0046 | 1.2970 |
| 1.8646 | 46.0 | 37076 | 1.9539 | 1.0 | 349.8124 | 1.3181 |
| 1.8158 | 47.0 | 37882 | 1.9544 | 1.0 | 349.7610 | 1.3287 |
| 1.8049 | 48.0 | 38688 | 1.9410 | 1.0 | 346.4040 | 1.3900 |
| 1.7911 | 49.0 | 39494 | 1.9471 | 1.0 | 347.3523 | 1.4239 |
| 1.7596 | 50.0 | 40300 | 1.9430 | 1.0 | 349.3459 | 1.4024 |
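The headline results at the top of this card come from the final epoch, which is not the best row in the table on either metric. As a small illustration, the sketch below picks the best epoch by validation loss and by BLEU from the last five rows, which contain the table-wide minimum loss and maximum BLEU. The tuple layout and variable names are chosen here for illustration.

```python
# (epoch, validation_loss, bleu) for the last five rows of the table.
rows = [
    (46, 1.9539, 1.3181),
    (47, 1.9544, 1.3287),
    (48, 1.9410, 1.3900),
    (49, 1.9471, 1.4239),
    (50, 1.9430, 1.4024),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_bleu = max(rows, key=lambda r: r[2])  # highest BLEU
final = rows[-1]

print("best loss:", best_by_loss)
print("best BLEU:", best_by_bleu)
print("final:", final)
```

Epoch 48 has the lowest validation loss (1.9410) and epoch 49 the highest BLEU (1.4239), while the reported final epoch sits slightly behind both.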
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/6d24a5eb768a198ae73b54fa54072735
Base model
google/long-t5-local-large