ac011b5c3d74515833978506edd5a189
This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 1.7477
- Data Size: 1.0
- Epoch Runtime: 20.9875
- Bleu: 7.2494
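The Data Size value tracks what looks like a doubling curriculum: the training log below starts at roughly 1/128 of the dataset in epoch 1 and doubles the fraction each epoch until the full set is used from epoch 8 onward. A minimal sketch of that inferred schedule (the function name and the cap epoch are assumptions read off the log, not the card's actual training code):

```python
def data_fraction(epoch: int, full_data_epoch: int = 8) -> float:
    """Fraction of the training set used in a given epoch, inferred from
    the Data Size column of the training log: 1/128 at epoch 1, doubling
    each epoch, capped at the full dataset from epoch 8 onward."""
    if epoch <= 0:
        return 0.0  # the epoch-0 row is the pre-training evaluation
    return min(1.0, 2.0 ** (epoch - full_data_epoch))

# Reproduces the Data Size column: 0.0078, 0.0156, ..., 0.5, 1.0
schedule = [round(data_fraction(e), 4) for e in range(1, 9)]
```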
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
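The derived values in the list above follow directly from the per-device settings; a quick arithmetic check (pure Python, nothing model-specific):

```python
# Effective batch sizes under multi-GPU data parallelism:
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

total_train_batch_size = train_batch_size * num_devices  # 32
total_eval_batch_size = eval_batch_size * num_devices    # 32

# The training log advances 77 optimizer steps per epoch, so 50 epochs
# end at step 77 * 50 = 3850, matching the last row of the results table.
steps_per_epoch = 77
num_epochs = 50
final_step = steps_per_epoch * num_epochs  # 3850
```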
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 3.6919 | 0 | 2.3837 | 0.2817 |
| No log | 1.0 | 77 | 3.6550 | 0.0078 | 2.6989 | 0.2857 |
| No log | 2.0 | 154 | 3.5567 | 0.0156 | 2.8912 | 0.2499 |
| No log | 3.0 | 231 | 3.4879 | 0.0312 | 3.3021 | 0.3033 |
| No log | 4.0 | 308 | 3.3999 | 0.0625 | 3.7415 | 0.4140 |
| No log | 5.0 | 385 | 3.2877 | 0.125 | 4.9392 | 0.7081 |
| 0.3282 | 6.0 | 462 | 3.1352 | 0.25 | 7.8107 | 0.8697 |
| 1.4232 | 7.0 | 539 | 2.9601 | 0.5 | 12.8925 | 0.9892 |
| 3.0986 | 8.0 | 616 | 2.7373 | 1.0 | 20.8772 | 1.3238 |
| 2.9416 | 9.0 | 693 | 2.5822 | 1.0 | 19.9890 | 2.1396 |
| 2.7286 | 10.0 | 770 | 2.4723 | 1.0 | 20.8685 | 2.7222 |
| 2.6411 | 11.0 | 847 | 2.3790 | 1.0 | 20.8805 | 2.8557 |
| 2.4996 | 12.0 | 924 | 2.3098 | 1.0 | 19.8070 | 3.0578 |
| 2.3682 | 13.0 | 1001 | 2.2397 | 1.0 | 19.8991 | 3.3912 |
| 2.314 | 14.0 | 1078 | 2.1845 | 1.0 | 21.0522 | 3.6963 |
| 2.2245 | 15.0 | 1155 | 2.1368 | 1.0 | 20.9468 | 4.1427 |
| 2.1665 | 16.0 | 1232 | 2.1020 | 1.0 | 20.3719 | 4.2497 |
| 2.0934 | 17.0 | 1309 | 2.0574 | 1.0 | 20.3863 | 4.6337 |
| 2.0375 | 18.0 | 1386 | 2.0166 | 1.0 | 21.5713 | 4.8467 |
| 1.959 | 19.0 | 1463 | 1.9812 | 1.0 | 21.4446 | 4.8427 |
| 1.9418 | 20.0 | 1540 | 1.9623 | 1.0 | 20.3362 | 5.0117 |
| 1.8663 | 21.0 | 1617 | 1.9344 | 1.0 | 20.9080 | 5.1227 |
| 1.8401 | 22.0 | 1694 | 1.9157 | 1.0 | 20.7733 | 5.2214 |
| 1.7756 | 23.0 | 1771 | 1.8923 | 1.0 | 21.6859 | 5.3765 |
| 1.7401 | 24.0 | 1848 | 1.8785 | 1.0 | 19.6916 | 5.5684 |
| 1.7017 | 25.0 | 1925 | 1.8578 | 1.0 | 21.2807 | 5.6905 |
| 1.6576 | 26.0 | 2002 | 1.8358 | 1.0 | 20.9607 | 5.7777 |
| 1.6186 | 27.0 | 2079 | 1.8285 | 1.0 | 21.4419 | 5.8534 |
| 1.5687 | 28.0 | 2156 | 1.8143 | 1.0 | 21.7423 | 5.8521 |
| 1.551 | 29.0 | 2233 | 1.8060 | 1.0 | 21.1729 | 6.0732 |
| 1.5108 | 30.0 | 2310 | 1.8009 | 1.0 | 21.8225 | 6.0521 |
| 1.496 | 31.0 | 2387 | 1.7827 | 1.0 | 21.1280 | 6.2218 |
| 1.4387 | 32.0 | 2464 | 1.7746 | 1.0 | 20.3537 | 6.2464 |
| 1.4301 | 33.0 | 2541 | 1.7809 | 1.0 | 20.7134 | 6.3561 |
| 1.3959 | 34.0 | 2618 | 1.7746 | 1.0 | 20.7294 | 6.5741 |
| 1.3755 | 35.0 | 2695 | 1.7601 | 1.0 | 21.0506 | 6.5561 |
| 1.3427 | 36.0 | 2772 | 1.7552 | 1.0 | 20.5285 | 6.7137 |
| 1.3187 | 37.0 | 2849 | 1.7441 | 1.0 | 21.9356 | 6.8087 |
| 1.2774 | 38.0 | 2926 | 1.7366 | 1.0 | 22.3031 | 6.8258 |
| 1.2588 | 39.0 | 3003 | 1.7361 | 1.0 | 21.2445 | 6.7728 |
| 1.2347 | 40.0 | 3080 | 1.7379 | 1.0 | 21.5544 | 6.8723 |
| 1.2156 | 41.0 | 3157 | 1.7426 | 1.0 | 22.1101 | 6.8853 |
| 1.1867 | 42.0 | 3234 | 1.7350 | 1.0 | 21.7368 | 6.9266 |
| 1.1686 | 43.0 | 3311 | 1.7336 | 1.0 | 21.1771 | 6.9280 |
| 1.1534 | 44.0 | 3388 | 1.7421 | 1.0 | 21.3334 | 7.0360 |
| 1.1272 | 45.0 | 3465 | 1.7443 | 1.0 | 21.3038 | 7.0424 |
| 1.1005 | 46.0 | 3542 | 1.7434 | 1.0 | 21.3993 | 7.0903 |
| 1.0742 | 47.0 | 3619 | 1.7236 | 1.0 | 20.4971 | 7.1741 |
| 1.0821 | 48.0 | 3696 | 1.7441 | 1.0 | 21.8380 | 7.2903 |
| 1.0375 | 49.0 | 3773 | 1.7433 | 1.0 | 21.4747 | 7.3077 |
| 1.0233 | 50.0 | 3850 | 1.7477 | 1.0 | 20.9875 | 7.2494 |
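The Bleu column is reported on the usual 0–100 scale. As a reference for what the metric measures, here is a minimal, unsmoothed sentence-level BLEU in pure Python. This is a sketch of the idea only: the evaluation above presumably uses a library implementation (e.g. sacreBLEU), and this snippet returns scores in 0–1:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed sentence BLEU: geometric mean of modified n-gram
    precisions (n = 1..max_n) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        ref_ngrams = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        if overlap == 0:
            return 0.0  # without smoothing, any zero precision zeroes BLEU
        log_prec_sum += math.log(overlap / total)
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec_sum / max_n)
```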
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1