93dab9f25502fb3dde0b0f1a467d67cc

This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [en-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9916
  • Data Size: 1.0
  • Epoch Runtime: 22.6504
  • Bleu: 18.4079
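
For a quick end-to-end check, a translation pipeline sketch is shown below. The repo id is taken from this card's model tree entry; the T5-style task prefix follows the usual translation convention and is an assumption, since the card does not document the preprocessing:

```python
from transformers import pipeline

# Repo id taken from this card's model tree; adjust if you use a local path.
translator = pipeline(
    "translation_en_to_pt",
    model="contemmcm/93dab9f25502fb3dde0b0f1a467d67cc",
)

# Assumption: fine-tuning used the standard T5-style task prefix.
text = "translate English to Portuguese: The book is on the table."
print(translator(text)[0]["translation_text"])
```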

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
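
The card does not describe how the data were split, but the en-pt configuration of Helsinki-NLP/opus_books can be loaded as sketched below; the 10% held-out fraction and seed are assumptions, not documented values:

```python
from datasets import load_dataset

# Helsinki-NLP/opus_books ships only a "train" split per language pair,
# so an evaluation set has to be held out manually (fraction is assumed).
raw = load_dataset("Helsinki-NLP/opus_books", "en-pt")
splits = raw["train"].train_test_split(test_size=0.1, seed=42)

print(splits["train"][0]["translation"])  # {'en': '...', 'pt': '...'}
```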

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
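
These settings map roughly onto `Seq2SeqTrainingArguments` as sketched below; only values listed above are filled in, the output directory is illustrative, and everything else is left at its defaults:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-large-opus-books-en-pt",  # hypothetical name
    learning_rate=5e-05,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed so BLEU can be computed at eval time
)
```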

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0    | 2.0942          | 0         | 2.0070        | 4.1931  |
| No log        | 1     | 35   | 2.0200          | 0.0078    | 2.6001        | 4.0017  |
| No log        | 2     | 70   | 1.8950          | 0.0156    | 3.2940        | 4.4829  |
| No log        | 3     | 105  | 1.7346          | 0.0312    | 5.4302        | 5.1878  |
| No log        | 4     | 140  | 1.6203          | 0.0625    | 7.8263        | 6.0648  |
| No log        | 5     | 175  | 1.5167          | 0.125     | 9.0388        | 8.2982  |
| No log        | 6     | 210  | 1.4429          | 0.25      | 11.1934       | 9.8345  |
| No log        | 7     | 245  | 1.3368          | 0.5       | 15.2762       | 11.3725 |
| 0.3179        | 8.0   | 280  | 1.1997          | 1.0       | 21.9345       | 12.9820 |
| 1.4161        | 9.0   | 315  | 1.1268          | 1.0       | 19.8059       | 13.4222 |
| 1.2184        | 10.0  | 350  | 1.0754          | 1.0       | 22.1018       | 14.6163 |
| 1.2184        | 11.0  | 385  | 1.0473          | 1.0       | 21.2029       | 14.3354 |
| 1.0558        | 12.0  | 420  | 1.0314          | 1.0       | 21.5731       | 15.6313 |
| 0.9325        | 13.0  | 455  | 1.0132          | 1.0       | 21.2095       | 16.5828 |
| 0.9325        | 14.0  | 490  | 1.0068          | 1.0       | 20.3950       | 16.2022 |
| 0.8349        | 15.0  | 525  | 0.9867          | 1.0       | 20.8965       | 17.0375 |
| 0.7558        | 16.0  | 560  | 0.9902          | 1.0       | 20.8363       | 17.0010 |
| 0.7558        | 17.0  | 595  | 0.9942          | 1.0       | 20.8230       | 16.6790 |
| 0.6726        | 18.0  | 630  | 0.9893          | 1.0       | 20.6770       | 16.9605 |
| 0.5971        | 19.0  | 665  | 0.9916          | 1.0       | 22.6504       | 18.4079 |
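
The Bleu column can be recomputed from the model's generations with a corpus-level metric; the sketch below assumes sacreBLEU via the `evaluate` library, since the card does not state which BLEU implementation was used:

```python
import evaluate

# Assumption: sacreBLEU-style corpus BLEU (the card does not name the scorer).
bleu = evaluate.load("sacrebleu")
predictions = ["O livro está sobre a mesa."]          # model outputs
references = [["O livro está em cima da mesa."]]      # one reference list per prediction
print(round(bleu.compute(predictions=predictions, references=references)["score"], 4))
```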

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1