# 3a9bd0d94ea11766e0113a915c2b0f91
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [de-pt] dataset. It achieves the following results on the evaluation set:
- Loss: 4.5661
- Data Size: 1.0
- Epoch Runtime: 18.9176
- Bleu: 0.8563
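As a usage sketch (the repository id is taken from this card's model tree; the example sentence and generation settings are illustrative assumptions), the checkpoint can be loaded with the standard `transformers` seq2seq API:

```python
# Hedged sketch: translate one German sentence to Portuguese with the
# fine-tuned checkpoint. Generation settings here are assumptions, not
# values from this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


def translate(text, model, tokenizer, max_new_tokens=64):
    """Translate a single German sentence to Portuguese."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example (downloads the checkpoint on first use):
# tokenizer = AutoTokenizer.from_pretrained("contemmcm/3a9bd0d94ea11766e0113a915c2b0f91")
# model = AutoModelForSeq2SeqLM.from_pretrained("contemmcm/3a9bd0d94ea11766e0113a915c2b0f91")
# print(translate("Das Buch liegt auf dem Tisch.", model, tokenizer))
```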
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
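The total batch sizes listed above follow directly from the per-device values and the number of GPUs (with no gradient accumulation); a minimal check:

```python
# Effective batch size under multi-GPU data parallelism:
# per-device batch size x number of devices.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```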
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 240.5146 | 0 | 1.5895 | 0.0052 |
| No log | 1 | 27 | 220.2246 | 0.0078 | 2.3560 | 0.0049 |
| No log | 2 | 54 | 201.6398 | 0.0156 | 3.4008 | 0.0046 |
| No log | 3 | 81 | 185.7308 | 0.0312 | 4.7109 | 0.0043 |
| No log | 4 | 108 | 160.8329 | 0.0625 | 6.1394 | 0.0044 |
| No log | 5 | 135 | 126.0964 | 0.125 | 8.2139 | 0.0044 |
| No log | 6 | 162 | 80.8064 | 0.25 | 9.8625 | 0.0078 |
| No log | 7 | 189 | 38.3810 | 0.5 | 13.0379 | 0.0050 |
| 21.5309 | 8.0 | 216 | 20.5337 | 1.0 | 19.3391 | 0.0070 |
| 21.5309 | 9.0 | 243 | 15.6982 | 1.0 | 18.6150 | 0.0756 |
| 29.9211 | 10.0 | 270 | 13.9766 | 1.0 | 19.0162 | 0.0394 |
| 29.9211 | 11.0 | 297 | 12.7335 | 1.0 | 18.1427 | 0.0363 |
| 20.664 | 12.0 | 324 | 11.3376 | 1.0 | 18.4502 | 0.0323 |
| 17.2601 | 13.0 | 351 | 11.3032 | 1.0 | 19.0537 | 0.0499 |
| 17.2601 | 14.0 | 378 | 9.5367 | 1.0 | 18.0391 | 0.0428 |
| 15.1396 | 15.0 | 405 | 9.7215 | 1.0 | 18.4896 | 0.0435 |
| 15.1396 | 16.0 | 432 | 8.9985 | 1.0 | 18.2890 | 0.0741 |
| 13.6693 | 17.0 | 459 | 8.7178 | 1.0 | 18.5686 | 0.0452 |
| 13.6693 | 18.0 | 486 | 8.0204 | 1.0 | 18.3974 | 0.1564 |
| 12.5207 | 19.0 | 513 | 7.8316 | 1.0 | 18.2894 | 0.1316 |
| 12.5207 | 20.0 | 540 | 7.6137 | 1.0 | 18.3578 | 0.2135 |
| 11.6421 | 21.0 | 567 | 7.3559 | 1.0 | 19.0919 | 0.2169 |
| 11.6421 | 22.0 | 594 | 7.2481 | 1.0 | 18.2427 | 0.3064 |
| 10.8325 | 23.0 | 621 | 7.3813 | 1.0 | 18.1103 | 0.3937 |
| 10.8325 | 24.0 | 648 | 6.6429 | 1.0 | 18.8096 | 0.4088 |
| 10.1643 | 25.0 | 675 | 6.5005 | 1.0 | 18.8014 | 0.5785 |
| 9.6446 | 26.0 | 702 | 6.7132 | 1.0 | 19.0756 | 0.2251 |
| 9.6446 | 27.0 | 729 | 6.4120 | 1.0 | 19.3266 | 0.4350 |
| 9.1617 | 28.0 | 756 | 6.2314 | 1.0 | 18.5467 | 0.6547 |
| 9.1617 | 29.0 | 783 | 5.9144 | 1.0 | 18.3461 | 0.5681 |
| 8.6776 | 30.0 | 810 | 6.0467 | 1.0 | 18.5372 | 0.4646 |
| 8.6776 | 31.0 | 837 | 5.9735 | 1.0 | 18.6964 | 0.3894 |
| 8.3499 | 32.0 | 864 | 5.8220 | 1.0 | 19.0345 | 0.4047 |
| 8.3499 | 33.0 | 891 | 5.8745 | 1.0 | 18.9467 | 0.5463 |
| 7.9403 | 34.0 | 918 | 5.5877 | 1.0 | 19.5259 | 0.5067 |
| 7.9403 | 35.0 | 945 | 5.5054 | 1.0 | 18.3917 | 0.5121 |
| 7.68 | 36.0 | 972 | 5.3874 | 1.0 | 19.1240 | 0.6175 |
| 7.68 | 37.0 | 999 | 5.6432 | 1.0 | 18.7083 | 0.4894 |
| 7.3719 | 38.0 | 1026 | 5.4467 | 1.0 | 18.7202 | 0.6100 |
| 7.0996 | 39.0 | 1053 | 5.1762 | 1.0 | 18.9310 | 0.7550 |
| 7.0996 | 40.0 | 1080 | 5.5259 | 1.0 | 18.5685 | 0.5951 |
| 6.8975 | 41.0 | 1107 | 5.3219 | 1.0 | 18.5159 | 0.6732 |
| 6.8975 | 42.0 | 1134 | 4.9517 | 1.0 | 18.7211 | 0.5251 |
| 6.6581 | 43.0 | 1161 | 4.8695 | 1.0 | 18.6375 | 0.7514 |
| 6.6581 | 44.0 | 1188 | 5.0998 | 1.0 | 18.6603 | 0.9022 |
| 6.4351 | 45.0 | 1215 | 4.8481 | 1.0 | 18.9359 | 0.7697 |
| 6.4351 | 46.0 | 1242 | 5.0892 | 1.0 | 19.1512 | 0.5868 |
| 6.2412 | 47.0 | 1269 | 4.8191 | 1.0 | 18.7786 | 0.7112 |
| 6.2412 | 48.0 | 1296 | 4.7702 | 1.0 | 18.8725 | 0.8425 |
| 6.0237 | 49.0 | 1323 | 4.5152 | 1.0 | 18.7832 | 0.6652 |
| 5.8837 | 50.0 | 1350 | 4.5661 | 1.0 | 18.9176 | 0.8563 |
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
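To reproduce the environment, the versions above can be pinned in a `requirements.txt` (a sketch; the matching CUDA wheel for PyTorch depends on your setup, and the `+cu128` suffix is not used in a plain pin):

```
transformers==4.57.0
torch==2.8.0
datasets==4.2.0
tokenizers==0.22.1
```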