cdbd9519200627023626eaf4f5a9c2ec

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [de-en] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8382
  • Data Size: 1.0
  • Epoch Runtime: 296.1652 s
  • Bleu: 10.3689
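A minimal usage sketch for German→English translation with this checkpoint. The repo id is taken from this card's model tree; the task prefix is an assumption (the standard prefix used by Transformers' T5 translation examples), not something this card states explicitly:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repo id from this card's model tree; the prefix is an assumption
# (the conventional T5 translation prefix), not stated in the card.
MODEL_ID = "contemmcm/cdbd9519200627023626eaf4f5a9c2ec"
PREFIX = "translate German to English: "


def translate(text: str, model_id: str = MODEL_ID) -> str:
    """Translate one German sentence to English with the fine-tuned model."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(PREFIX + text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Downloads the checkpoint on first use:
# print(translate("Der Himmel ist blau."))
```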

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
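A quick sanity check on how the per-device and total batch sizes relate. This is a minimal sketch; the example count is inferred from the per-epoch step count in the results table, not stated directly in the card:

```python
# Per-device settings from the hyperparameter list; the effective
# (total) batch sizes follow from the 4-GPU data-parallel setup,
# since no gradient accumulation is listed.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

total_train_batch_size = train_batch_size * num_devices  # 32
total_eval_batch_size = eval_batch_size * num_devices    # 32

# The results table advances 1286 steps per epoch; at batch size 32
# that implies roughly this many training examples per epoch.
steps_per_epoch = 1286
approx_train_examples = steps_per_epoch * total_train_batch_size
print(approx_train_examples)  # 41152
```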

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime (s) | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-----------------:|:-------:|
| No log        | 0     | 0     | 3.2834          | 0         | 32.3240           | 3.1665  |
| No log        | 1     | 1286  | 2.8676          | 0.0078    | 31.2946           | 4.9302  |
| 0.061         | 2     | 2572  | 2.6478          | 0.0156    | 39.1063           | 6.0021  |
| 0.0781        | 3     | 3858  | 2.5222          | 0.0312    | 40.5774           | 6.1777  |
| 0.1162        | 4     | 5144  | 2.4145          | 0.0625    | 44.6000           | 5.9555  |
| 2.5374        | 5     | 6430  | 2.3278          | 0.125     | 71.0763           | 6.6924  |
| 2.4421        | 6     | 7716  | 2.2364          | 0.25      | 114.0874          | 6.9143  |
| 2.3398        | 7     | 9002  | 2.1406          | 0.5       | 203.1443          | 7.6527  |
| 2.2267        | 8     | 10288 | 2.0375          | 1.0       | 320.0082          | 8.2874  |
| 2.1245        | 9     | 11574 | 1.9762          | 1.0       | 327.2068          | 8.7267  |
| 2.0126        | 10    | 12860 | 1.9342          | 1.0       | 318.2429          | 9.0232  |
| 1.9288        | 11    | 14146 | 1.9076          | 1.0       | 313.7659          | 9.4612  |
| 1.9147        | 12    | 15432 | 1.8856          | 1.0       | 306.3946          | 9.5535  |
| 1.8118        | 13    | 16718 | 1.8670          | 1.0       | 301.9939          | 9.6642  |
| 1.7869        | 14    | 18004 | 1.8577          | 1.0       | 303.7155          | 9.7512  |
| 1.7684        | 15    | 19290 | 1.8451          | 1.0       | 311.2393          | 9.8912  |
| 1.7035        | 16    | 20576 | 1.8364          | 1.0       | 301.0892          | 10.0150 |
| 1.6895        | 17    | 21862 | 1.8324          | 1.0       | 312.8599          | 10.0120 |
| 1.6357        | 18    | 23148 | 1.8328          | 1.0       | 298.5094          | 10.0325 |
| 1.615         | 19    | 24434 | 1.8274          | 1.0       | 303.1685          | 10.1694 |
| 1.5417        | 20    | 25720 | 1.8263          | 1.0       | 301.3516          | 10.1965 |
| 1.5421        | 21    | 27006 | 1.8260          | 1.0       | 308.1764          | 10.1196 |
| 1.5266        | 22    | 28292 | 1.8275          | 1.0       | 300.0012          | 10.2606 |
| 1.4999        | 23    | 29578 | 1.8345          | 1.0       | 303.9673          | 10.3856 |
| 1.4494        | 24    | 30864 | 1.8391          | 1.0       | 291.5877          | 10.4315 |
| 1.4225        | 25    | 32150 | 1.8382          | 1.0       | 296.1652          | 10.3689 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model size

  • 0.3B params
  • Tensor type: F32
  • Format: Safetensors
