33562f6b9fce67a8c253d38dc51d877c
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [fr-it] dataset. It achieves the following results on the evaluation set:
- Loss: 2.5661
- Data Size: 1.0
- Epoch Runtime: 162.8726
- Bleu: 0.7953
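A minimal inference sketch, assuming the checkpoint loads through the standard Transformers seq2seq classes under the repository id given in this card's model tree. The task prefix and the helper names (`build_input`, `translate`) are hypothetical: the card does not document the input format used during fine-tuning, so it is worth trying with and without the prefix.

```python
# Repository id as it appears in the model tree on this card.
MODEL_ID = "contemmcm/33562f6b9fce67a8c253d38dc51d877c"

# Assumption: a T5-style task prefix. The card does not say which prompt
# format, if any, was used during fine-tuning.
ASSUMED_PREFIX = "translate French to Italian: "


def build_input(text: str, prefix: str = ASSUMED_PREFIX) -> str:
    """Prepend the (assumed) task prefix to a French source sentence."""
    return f"{prefix}{text.strip()}"


def translate(text: str, max_new_tokens: int = 128) -> str:
    """Translate one French sentence to Italian with the fine-tuned checkpoint."""
    # Imported lazily so build_input() works without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_input(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("Il pleut depuis ce matin.")` downloads the weights on first use and returns the decoded model output as a string.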
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
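The total batch sizes follow from the per-device values and the device count. The short sketch below reproduces that arithmetic and also derives the number of optimizer steps per epoch from the final step in the results table. It assumes no gradient accumulation, since the card lists none.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4
num_epochs = 50

# Assumption: no gradient accumulation, because the card does not list any.
gradient_accumulation_steps = 1

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# The results table ends at step 18350 after 50 epochs.
final_step = 18350
steps_per_epoch = final_step // num_epochs

print(total_train_batch_size, total_eval_batch_size, steps_per_epoch)
```

Both totals come out to 32, matching the listed values, and the step counter advances by 367 per epoch.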
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 227.9474 | 0 | 12.0214 | 0.0077 |
| No log | 1 | 367 | 183.7979 | 0.0078 | 13.2936 | 0.0065 |
| No log | 2 | 734 | 134.7391 | 0.0156 | 15.3314 | 0.0032 |
| No log | 3 | 1101 | 69.4337 | 0.0312 | 18.8234 | 0.0011 |
| No log | 4 | 1468 | 29.1306 | 0.0625 | 23.7774 | 0.0005 |
| 3.8408 | 5 | 1835 | 14.9830 | 0.125 | 33.6860 | 0.0155 |
| 20.8276 | 6 | 2202 | 11.2351 | 0.25 | 51.7824 | 0.1309 |
| 14.5113 | 7 | 2569 | 9.0706 | 0.5 | 89.2183 | 0.0839 |
| 10.3356 | 8.0 | 2936 | 6.5654 | 1.0 | 162.2343 | 0.0607 |
| 8.1005 | 9.0 | 3303 | 5.6250 | 1.0 | 160.6285 | 0.1078 |
| 6.9191 | 10.0 | 3670 | 4.6990 | 1.0 | 160.3902 | 0.1321 |
| 6.1299 | 11.0 | 4037 | 4.3165 | 1.0 | 159.7420 | 0.1422 |
| 5.4527 | 12.0 | 4404 | 3.9367 | 1.0 | 160.9828 | 0.1397 |
| 5.0146 | 13.0 | 4771 | 3.7508 | 1.0 | 160.3557 | 0.1540 |
| 4.7112 | 14.0 | 5138 | 3.5266 | 1.0 | 160.8869 | 0.2144 |
| 4.4185 | 15.0 | 5505 | 3.4124 | 1.0 | 160.9423 | 0.1896 |
| 4.2091 | 16.0 | 5872 | 3.3365 | 1.0 | 161.2574 | 0.2529 |
| 4.0543 | 17.0 | 6239 | 3.2665 | 1.0 | 161.5701 | 0.2257 |
| 3.908 | 18.0 | 6606 | 3.1801 | 1.0 | 160.9361 | 0.2362 |
| 3.7735 | 19.0 | 6973 | 3.2065 | 1.0 | 161.9550 | 0.1961 |
| 3.692 | 20.0 | 7340 | 3.0479 | 1.0 | 161.8171 | 0.3107 |
| 3.581 | 21.0 | 7707 | 3.0544 | 1.0 | 163.2240 | 0.2429 |
| 3.5182 | 22.0 | 8074 | 2.9941 | 1.0 | 163.6884 | 0.3816 |
| 3.435 | 23.0 | 8441 | 2.9701 | 1.0 | 163.1442 | 0.3846 |
| 3.3766 | 24.0 | 8808 | 2.9647 | 1.0 | 162.1650 | 0.2901 |
| 3.3338 | 25.0 | 9175 | 2.9095 | 1.0 | 161.4591 | 0.3830 |
| 3.2912 | 26.0 | 9542 | 2.9067 | 1.0 | 161.0684 | 0.3462 |
| 3.2124 | 27.0 | 9909 | 2.8573 | 1.0 | 162.8073 | 0.4296 |
| 3.1835 | 28.0 | 10276 | 2.8323 | 1.0 | 160.5233 | 0.3739 |
| 3.1464 | 29.0 | 10643 | 2.8341 | 1.0 | 161.7101 | 0.3739 |
| 3.0991 | 30.0 | 11010 | 2.8026 | 1.0 | 161.8965 | 0.5228 |
| 3.059 | 31.0 | 11377 | 2.7899 | 1.0 | 161.3309 | 0.4951 |
| 3.0268 | 32.0 | 11744 | 2.7772 | 1.0 | 162.3818 | 0.5396 |
| 2.9941 | 33.0 | 12111 | 2.7470 | 1.0 | 160.4514 | 0.6003 |
| 2.9486 | 34.0 | 12478 | 2.7180 | 1.0 | 160.8334 | 0.5596 |
| 2.9266 | 35.0 | 12845 | 2.7145 | 1.0 | 161.7326 | 0.6708 |
| 2.8847 | 36.0 | 13212 | 2.7008 | 1.0 | 162.2701 | 0.5416 |
| 2.8645 | 37.0 | 13579 | 2.6849 | 1.0 | 162.0892 | 0.6175 |
| 2.8496 | 38.0 | 13946 | 2.6890 | 1.0 | 162.7706 | 0.5850 |
| 2.81 | 39.0 | 14313 | 2.6759 | 1.0 | 162.3887 | 0.6315 |
| 2.7744 | 40.0 | 14680 | 2.6479 | 1.0 | 161.6776 | 0.6514 |
| 2.754 | 41.0 | 15047 | 2.6493 | 1.0 | 162.5414 | 0.6169 |
| 2.7357 | 42.0 | 15414 | 2.6309 | 1.0 | 163.1594 | 0.6945 |
| 2.686 | 43.0 | 15781 | 2.6149 | 1.0 | 165.9331 | 0.6736 |
| 2.6839 | 44.0 | 16148 | 2.6144 | 1.0 | 163.4045 | 0.6818 |
| 2.662 | 45.0 | 16515 | 2.6034 | 1.0 | 161.7051 | 0.6951 |
| 2.6385 | 46.0 | 16882 | 2.5922 | 1.0 | 162.0199 | 0.7377 |
| 2.6123 | 47.0 | 17249 | 2.5817 | 1.0 | 160.6540 | 0.6980 |
| 2.5896 | 48.0 | 17616 | 2.5717 | 1.0 | 161.6729 | 0.7286 |
| 2.5732 | 49.0 | 17983 | 2.5840 | 1.0 | 162.0004 | 0.7646 |
| 2.5499 | 50.0 | 18350 | 2.5661 | 1.0 | 162.8726 | 0.7953 |
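The Data Size column appears to follow a doubling schedule: 0 at the initial evaluation, then 1/128 of the data at epoch 1, doubling each epoch until the full set is reached at epoch 8. The sketch below reproduces that pattern. The function name `data_fraction` and the interpretation of the column as a fraction of the training set are assumptions inferred from the table, not something the card states.

```python
def data_fraction(epoch: int, full_data_epoch: int = 8) -> float:
    """Apparent fraction of the training data used at a given epoch.

    Epoch 0 is the evaluation before any training (fraction 0). From epoch 1
    on, the fraction doubles each epoch and is capped at 1.0, which is first
    reached at `full_data_epoch`.
    """
    if epoch <= 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - full_data_epoch))


# Print the schedule for the first ten table rows, rounded as in the table.
for epoch in range(10):
    print(epoch, round(data_fraction(epoch), 4))
```

Under this reading, the Epoch Runtime column growing from about 13 to about 162 over epochs 1 to 8 is consistent with each epoch processing roughly twice as much data as the one before.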
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/33562f6b9fce67a8c253d38dc51d877c
Base model
google/long-t5-local-large