112fec9ed42e51060f900d8f894eb446

This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [de-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6703
  • Data Size: 1.0
  • Epoch Runtime: 21.2101
  • Bleu: 7.4410
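
The card gives no usage snippet; a minimal inference sketch follows, assuming the checkpoint is published under the repo id shown at the bottom of the card and that the usual T5 task-prefix style was used for the de-pt pair (the prefix wording and the example sentence are assumptions, not taken from the training script):

```python
# Sketch: translating German to Portuguese with the fine-tuned checkpoint.
# Repo id taken from the card; the task prefix below is an assumption.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/112fec9ed42e51060f900d8f894eb446"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate German to Portuguese: Das Buch liegt auf dem Tisch."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Given the modest BLEU (7.44), translations from this checkpoint should be treated as rough drafts rather than production output.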

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
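
The list above maps onto `Seq2SeqTrainingArguments` roughly as follows. This is a reconstructed config fragment, not the author's actual script: the `output_dir` name is a placeholder, `predict_with_generate` is an assumption (it is normally required to compute BLEU during evaluation), and the per-device batch sizes times the 4 GPUs give the listed totals of 32:

```python
# Sketch: the card's hyperparameters as a transformers training config.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="out",               # placeholder, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=8,  # x 4 GPUs -> total_train_batch_size 32
    per_device_eval_batch_size=8,   # x 4 GPUs -> total_eval_batch_size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,     # assumption: needed for BLEU at eval time
)
```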

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---------------|-------|------|-----------------|-----------|---------------|--------|
| No log        | 0     | 0    | 2.9596          | 0         | 1.7629        | 1.0230 |
| No log        | 1     | 27   | 2.8965          | 0.0078    | 2.3567        | 1.1778 |
| No log        | 2     | 54   | 2.7601          | 0.0156    | 3.8914        | 1.4456 |
| No log        | 3     | 81   | 2.6535          | 0.0312    | 5.1172        | 1.7848 |
| No log        | 4     | 108  | 2.5314          | 0.0625    | 6.3784        | 1.8423 |
| No log        | 5     | 135  | 2.3265          | 0.125     | 8.5023        | 2.5052 |
| No log        | 6     | 162  | 2.1458          | 0.25      | 11.4743       | 3.1025 |
| No log        | 7     | 189  | 2.0311          | 0.5       | 15.1548       | 3.1577 |
| 0.5013        | 8.0   | 216  | 1.9043          | 1.0       | 20.2497       | 4.2677 |
| 0.5013        | 9.0   | 243  | 1.8183          | 1.0       | 18.0056       | 5.2331 |
| 1.9772        | 10.0  | 270  | 1.7607          | 1.0       | 18.4100       | 5.7080 |
| 1.9772        | 11.0  | 297  | 1.7316          | 1.0       | 19.6296       | 5.9091 |
| 1.7436        | 12.0  | 324  | 1.7052          | 1.0       | 18.8718       | 6.4142 |
| 1.5745        | 13.0  | 351  | 1.6868          | 1.0       | 19.1276       | 6.2425 |
| 1.5745        | 14.0  | 378  | 1.6810          | 1.0       | 19.4471       | 6.6954 |
| 1.4445        | 15.0  | 405  | 1.6673          | 1.0       | 19.4834       | 6.7889 |
| 1.4445        | 16.0  | 432  | 1.6606          | 1.0       | 20.6860       | 6.6549 |
| 1.3433        | 17.0  | 459  | 1.6665          | 1.0       | 22.0664       | 6.6621 |
| 1.3433        | 18.0  | 486  | 1.6586          | 1.0       | 20.9229       | 6.8744 |
| 1.2301        | 19.0  | 513  | 1.6651          | 1.0       | 21.5009       | 7.0757 |
| 1.2301        | 20.0  | 540  | 1.6635          | 1.0       | 21.4612       | 6.7474 |
| 1.1481        | 21.0  | 567  | 1.6693          | 1.0       | 20.9774       | 7.3536 |
| 1.1481        | 22.0  | 594  | 1.6703          | 1.0       | 21.2101       | 7.4410 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model weights

  • Format: Safetensors
  • Model size: 0.8B params
  • Tensor type: F32

Model tree: contemmcm/112fec9ed42e51060f900d8f894eb446, finetuned from google-t5/t5-large.