# 549aaa3ce14cc70304a30ca95b87074c
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [en-fi] dataset. It achieves the following results on the evaluation set:
- Loss: 2.9341
- Data Size: 1.0
- Epoch Runtime: 46.6996 s
- BLEU: 0.9318
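The BLEU figure above is the metric reported by the evaluation loop; its scale depends on the implementation used (sacrebleu, for instance, reports scores on a 0–100 scale). As a rough, standard-library-only illustration of what BLEU measures, here is a simplified sentence-level sketch with uniform n-gram weights and a brevity penalty; real evaluations should use a corpus-level implementation such as sacrebleu.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    ref, hyp = reference.split(), hypothesis.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngrams(hyp, n)
        ref_ngrams = ngrams(ref, n)
        # Clipped overlap: each hypothesis n-gram counts at most as
        # often as it appears in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        if overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

An identical hypothesis and reference score 1.0; a hypothesis sharing no words with the reference scores 0.0.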
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
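For reference, the hyperparameters above roughly correspond to a `Seq2SeqTrainingArguments` configuration like the following. This is a sketch, not the card author's actual script: the output directory is hypothetical, and the multi-GPU launch, tokenizer, and data pipeline are omitted.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of a configuration matching the listed hyperparameters.
training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-en-fi",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 devices -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 devices -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed to compute BLEU during eval
)
```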
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 211.0632 | 0.0 | 3.5779 | 0.0016 |
| No log | 1.0 | 91 | 188.7779 | 0.0078 | 4.1075 | 0.0019 |
| No log | 2.0 | 182 | 169.3628 | 0.0156 | 5.8967 | 0.0022 |
| No log | 3.0 | 273 | 149.8084 | 0.0312 | 7.5024 | 0.0022 |
| No log | 4.0 | 364 | 117.3038 | 0.0625 | 9.7911 | 0.0021 |
| No log | 5.0 | 455 | 67.4181 | 0.125 | 13.6960 | 0.0019 |
| No log | 6.0 | 546 | 29.9486 | 0.25 | 18.0287 | 0.0028 |
| 9.5787 | 7.0 | 637 | 14.3971 | 0.5 | 28.2492 | 0.0080 |
| 19.5655 | 8.0 | 728 | 10.7885 | 1.0 | 47.3220 | 0.0082 |
| 14.8619 | 9.0 | 819 | 8.7578 | 1.0 | 46.3086 | 0.0080 |
| 12.7107 | 10.0 | 910 | 8.1946 | 1.0 | 46.3137 | 0.0144 |
| 11.1516 | 11.0 | 1001 | 7.0157 | 1.0 | 46.2185 | 0.0318 |
| 10.7591 | 12.0 | 1092 | 6.7510 | 1.0 | 45.4156 | 0.0190 |
| 9.8214 | 13.0 | 1183 | 6.1254 | 1.0 | 46.6275 | 0.0629 |
| 9.0841 | 14.0 | 1274 | 6.1219 | 1.0 | 45.7993 | 0.0495 |
| 8.5117 | 15.0 | 1365 | 5.4160 | 1.0 | 46.0273 | 0.0841 |
| 8.0009 | 16.0 | 1456 | 5.3270 | 1.0 | 46.2666 | 0.1435 |
| 7.7391 | 17.0 | 1547 | 5.1370 | 1.0 | 46.2303 | 0.1310 |
| 7.3406 | 18.0 | 1638 | 4.9944 | 1.0 | 46.1756 | 0.1327 |
| 6.9568 | 19.0 | 1729 | 4.6342 | 1.0 | 45.9808 | 0.2060 |
| 6.6468 | 20.0 | 1820 | 4.5277 | 1.0 | 46.7813 | 0.2704 |
| 6.3802 | 21.0 | 1911 | 4.3945 | 1.0 | 46.3165 | 0.2054 |
| 6.0991 | 22.0 | 2002 | 4.2401 | 1.0 | 46.1848 | 0.1644 |
| 5.9951 | 23.0 | 2093 | 4.0383 | 1.0 | 45.9308 | 0.3349 |
| 5.7585 | 24.0 | 2184 | 4.0444 | 1.0 | 46.1250 | 0.3045 |
| 5.5825 | 25.0 | 2275 | 4.0388 | 1.0 | 46.0497 | 0.2623 |
| 5.3925 | 26.0 | 2366 | 3.9272 | 1.0 | 45.7610 | 0.4223 |
| 5.2605 | 27.0 | 2457 | 3.7667 | 1.0 | 46.2910 | 0.3692 |
| 5.1225 | 28.0 | 2548 | 3.7613 | 1.0 | 45.6301 | 0.4512 |
| 4.9741 | 29.0 | 2639 | 3.6562 | 1.0 | 46.4286 | 0.3588 |
| 4.8417 | 30.0 | 2730 | 3.5531 | 1.0 | 46.1431 | 0.5768 |
| 4.6741 | 31.0 | 2821 | 3.6014 | 1.0 | 46.8742 | 0.4160 |
| 4.6209 | 32.0 | 2912 | 3.4647 | 1.0 | 46.7312 | 0.6189 |
| 4.4791 | 33.0 | 3003 | 3.4088 | 1.0 | 46.6921 | 0.7235 |
| 4.346 | 34.0 | 3094 | 3.3616 | 1.0 | 45.7728 | 0.5786 |
| 4.311 | 35.0 | 3185 | 3.3856 | 1.0 | 47.5342 | 0.5463 |
| 4.1986 | 36.0 | 3276 | 3.3260 | 1.0 | 46.0606 | 0.6678 |
| 4.1229 | 37.0 | 3367 | 3.2019 | 1.0 | 46.4823 | 0.7473 |
| 4.0017 | 38.0 | 3458 | 3.2359 | 1.0 | 46.6339 | 0.6610 |
| 3.9321 | 39.0 | 3549 | 3.1598 | 1.0 | 46.3126 | 0.7617 |
| 3.8785 | 40.0 | 3640 | 3.1266 | 1.0 | 45.4446 | 0.8805 |
| 3.7769 | 41.0 | 3731 | 3.0599 | 1.0 | 46.1482 | 0.7153 |
| 3.7148 | 42.0 | 3822 | 3.0918 | 1.0 | 45.4459 | 0.8746 |
| 3.6051 | 43.0 | 3913 | 3.0205 | 1.0 | 46.5096 | 0.9528 |
| 3.5761 | 44.0 | 4004 | 3.0466 | 1.0 | 46.2574 | 0.8161 |
| 3.5664 | 45.0 | 4095 | 2.9364 | 1.0 | 46.3092 | 1.0224 |
| 3.4499 | 46.0 | 4186 | 2.9338 | 1.0 | 45.6281 | 0.9838 |
| 3.3732 | 47.0 | 4277 | 2.9169 | 1.0 | 45.4441 | 0.9263 |
| 3.3295 | 48.0 | 4368 | 2.9877 | 1.0 | 46.7569 | 0.7916 |
| 3.2911 | 49.0 | 4459 | 2.9030 | 1.0 | 46.8190 | 0.7812 |
| 3.2407 | 50.0 | 4550 | 2.9341 | 1.0 | 46.6996 | 0.9318 |
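The Data Size column suggests a data-warmup curriculum: the fraction of the training set starts at roughly 1/128 and doubles every epoch until the full dataset is used from epoch 8 onward. This also explains the rising epoch runtimes through epoch 8. The card does not state the schedule explicitly; the following function is inferred from the column values and should be treated as a reconstruction.

```python
def data_fraction(epoch: int) -> float:
    """Apparent data-size warmup inferred from the results table:
    starts at 1/128 at epoch 1 and doubles each epoch, capped at 1.0
    (the full dataset) from epoch 8 onward. Not confirmed by the card."""
    if epoch == 0:
        return 0.0  # the epoch-0 row is the pre-training evaluation
    return min(1.0, 2 ** (epoch - 1) / 128)
```

For example, `data_fraction(3)` gives 0.03125, matching the 0.0312 shown (rounded) for epoch 3.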
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1