bf1462d89aff6f3487022728c0e619d3
This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [es-ru] dataset. It achieves the following results on the evaluation set:
- Loss: 1.2867
- Data Size: 1.0
- Epoch Runtime: 102.8343
- Bleu: 4.8831
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
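The batch-size and step figures in this card can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not part of the training code. It assumes no gradient accumulation (none is listed above) and takes the 419 steps per epoch from the Step column of the training results table.

```python
# Values copied from the hyperparameter list in this card.
train_batch_size = 8   # per-device training batch size
eval_batch_size = 8    # per-device evaluation batch size
num_devices = 4
num_epochs = 50

# Effective batch sizes across all devices
# (assumption: no gradient accumulation, since none is listed).
total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

# Assumption: 419 optimizer steps per epoch, read from the Step column
# of the training results table (step 419 at epoch 1).
steps_per_epoch = 419
total_steps = steps_per_epoch * num_epochs

print(total_train_batch_size, total_eval_batch_size, total_steps)
```

Under these assumptions the results agree with the reported total_train_batch_size and total_eval_batch_size of 32 and with the final step count of 20950.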
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 3.8321 | 0 | 7.7511 | 0.2778 |
| No log | 1 | 419 | 3.1758 | 0.0078 | 10.2457 | 0.2173 |
| No log | 2 | 838 | 2.4281 | 0.0156 | 9.8304 | 0.1990 |
| 0.0781 | 3 | 1257 | 2.2940 | 0.0312 | 11.6407 | 0.4073 |
| 0.0781 | 4 | 1676 | 2.1725 | 0.0625 | 14.9945 | 0.4680 |
| 0.1468 | 5 | 2095 | 2.0765 | 0.125 | 19.9688 | 0.4470 |
| 0.2969 | 6 | 2514 | 1.9815 | 0.25 | 30.7923 | 0.4445 |
| 2.055 | 7 | 2933 | 1.8771 | 0.5 | 52.1986 | 1.2337 |
| 1.908 | 8.0 | 3352 | 1.7673 | 1.0 | 98.1835 | 1.7631 |
| 1.8323 | 9.0 | 3771 | 1.6963 | 1.0 | 104.2326 | 2.0477 |
| 1.7914 | 10.0 | 4190 | 1.6447 | 1.0 | 104.3759 | 2.1788 |
| 1.7083 | 11.0 | 4609 | 1.6005 | 1.0 | 97.1955 | 2.3249 |
| 1.6842 | 12.0 | 5028 | 1.5647 | 1.0 | 98.2808 | 2.5057 |
| 1.6198 | 13.0 | 5447 | 1.5387 | 1.0 | 98.4246 | 2.7196 |
| 1.6071 | 14.0 | 5866 | 1.5071 | 1.0 | 98.7859 | 2.8021 |
| 1.5506 | 15.0 | 6285 | 1.4813 | 1.0 | 103.0710 | 2.9360 |
| 1.5343 | 16.0 | 6704 | 1.4627 | 1.0 | 103.5573 | 2.9611 |
| 1.5054 | 17.0 | 7123 | 1.4434 | 1.0 | 100.6397 | 3.2404 |
| 1.4638 | 18.0 | 7542 | 1.4301 | 1.0 | 102.1796 | 3.3761 |
| 1.4393 | 19.0 | 7961 | 1.4137 | 1.0 | 101.3366 | 3.4453 |
| 1.4032 | 20.0 | 8380 | 1.4006 | 1.0 | 107.1915 | 3.5435 |
| 1.4009 | 21.0 | 8799 | 1.3863 | 1.0 | 102.6508 | 3.6507 |
| 1.3719 | 22.0 | 9218 | 1.3724 | 1.0 | 102.2086 | 3.6668 |
| 1.362 | 23.0 | 9637 | 1.3700 | 1.0 | 99.3133 | 3.8352 |
| 1.3195 | 24.0 | 10056 | 1.3549 | 1.0 | 103.5616 | 3.9424 |
| 1.3178 | 25.0 | 10475 | 1.3464 | 1.0 | 98.7277 | 3.9231 |
| 1.2825 | 26.0 | 10894 | 1.3382 | 1.0 | 108.8320 | 4.1158 |
| 1.2786 | 27.0 | 11313 | 1.3299 | 1.0 | 109.9823 | 4.0274 |
| 1.2738 | 28.0 | 11732 | 1.3250 | 1.0 | 107.1214 | 4.1404 |
| 1.256 | 29.0 | 12151 | 1.3199 | 1.0 | 101.2873 | 4.2051 |
| 1.2284 | 30.0 | 12570 | 1.3160 | 1.0 | 101.6196 | 4.2800 |
| 1.2079 | 31.0 | 12989 | 1.3049 | 1.0 | 99.4647 | 4.3756 |
| 1.2035 | 32.0 | 13408 | 1.3039 | 1.0 | 98.0781 | 4.3166 |
| 1.1767 | 33.0 | 13827 | 1.3071 | 1.0 | 97.6157 | 4.4239 |
| 1.1486 | 34.0 | 14246 | 1.2912 | 1.0 | 101.8700 | 4.4547 |
| 1.1646 | 35.0 | 14665 | 1.2933 | 1.0 | 102.8562 | 4.5101 |
| 1.1339 | 36.0 | 15084 | 1.2922 | 1.0 | 101.9525 | 4.5820 |
| 1.1166 | 37.0 | 15503 | 1.2862 | 1.0 | 105.2984 | 4.5991 |
| 1.1019 | 38.0 | 15922 | 1.2860 | 1.0 | 101.3865 | 4.5862 |
| 1.0878 | 39.0 | 16341 | 1.2851 | 1.0 | 98.2697 | 4.6133 |
| 1.0754 | 40.0 | 16760 | 1.2796 | 1.0 | 99.6733 | 4.6847 |
| 1.0683 | 41.0 | 17179 | 1.2873 | 1.0 | 99.9450 | 4.7070 |
| 1.0581 | 42.0 | 17598 | 1.2823 | 1.0 | 98.5332 | 4.7370 |
| 1.0576 | 43.0 | 18017 | 1.2793 | 1.0 | 98.4368 | 4.7392 |
| 1.032 | 44.0 | 18436 | 1.2821 | 1.0 | 105.1789 | 4.8431 |
| 1.0127 | 45.0 | 18855 | 1.2795 | 1.0 | 100.6899 | 4.8298 |
| 1.0096 | 46.0 | 19274 | 1.2804 | 1.0 | 104.6137 | 4.8207 |
| 0.9995 | 47.0 | 19693 | 1.2731 | 1.0 | 103.7977 | 4.7398 |
| 0.9828 | 48.0 | 20112 | 1.2887 | 1.0 | 108.5207 | 4.9181 |
| 0.9896 | 49.0 | 20531 | 1.2801 | 1.0 | 105.9333 | 4.7891 |
| 0.9738 | 50.0 | 20950 | 1.2867 | 1.0 | 102.8343 | 4.8831 |
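The headline metrics at the top of this card match the final row (epoch 50), which is not the best row on either metric. The sketch below copies the last eleven rows of the table and picks out the epochs with the lowest validation loss and the highest BLEU. Earlier epochs are omitted because none of them beats these values. The card does not say which checkpoint was saved, so this is only a way to read the table.

```python
# (epoch, validation_loss, bleu) copied from epochs 40-50 of the table.
rows = [
    (40, 1.2796, 4.6847),
    (41, 1.2873, 4.7070),
    (42, 1.2823, 4.7370),
    (43, 1.2793, 4.7392),
    (44, 1.2821, 4.8431),
    (45, 1.2795, 4.8298),
    (46, 1.2804, 4.8207),
    (47, 1.2731, 4.7398),
    (48, 1.2887, 4.9181),
    (49, 1.2801, 4.7891),
    (50, 1.2867, 4.8831),
]

# Row with the lowest validation loss and row with the highest BLEU.
best_loss_row = min(rows, key=lambda r: r[1])
best_bleu_row = max(rows, key=lambda r: r[2])
final_row = rows[-1]

print("lowest validation loss:", best_loss_row)
print("highest BLEU:", best_bleu_row)
print("final epoch:", final_row)
```

Validation loss is lowest at epoch 47 (1.2731) and BLEU is highest at epoch 48 (4.9181). Both metrics are nearly flat over the last ten epochs.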
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/bf1462d89aff6f3487022728c0e619d3
- Base model: google-t5/t5-base