db020be9e4e35d83e65d19fab946b815

This model is a fine-tuned version of google/mt5-base on the en-it subset of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8069
  • Data Size: 1.0
  • Epoch Runtime: 180.3731
  • Bleu: 6.5813

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
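The per-device batch size of 8 across 4 GPUs accounts for the reported total batch size of 32. A quick sanity check (assuming no gradient accumulation, which the log does not mention):

```python
# Effective batch size under multi-GPU data parallelism.
train_batch_size = 8             # per device, from the hyperparameters above
num_devices = 4
gradient_accumulation_steps = 1  # assumption: not listed in the log

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 32, matching the reported value
```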

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 17.6779         | 0         | 14.1149       | 0.0142 |
| No log        | 1     | 808   | 15.1169         | 0.0078    | 16.4196       | 0.0167 |
| No log        | 2     | 1616  | 12.7227         | 0.0156    | 17.2645       | 0.0185 |
| No log        | 3     | 2424  | 9.3009          | 0.0312    | 20.6988       | 0.0207 |
| 0.3854        | 4     | 3232  | 6.4219          | 0.0625    | 25.8235       | 0.0170 |
| 5.5513        | 5     | 4040  | 2.9263          | 0.125     | 35.1693       | 0.5854 |
| 3.519         | 6     | 4848  | 2.5624          | 0.25      | 55.7010       | 2.1025 |
| 3.0277        | 7     | 5656  | 2.3566          | 0.5       | 95.1281       | 2.8146 |
| 2.7929        | 8     | 6464  | 2.1938          | 1.0       | 177.9741      | 3.7082 |
| 2.5748        | 9     | 7272  | 2.1051          | 1.0       | 172.6640      | 4.2122 |
| 2.4543        | 10    | 8080  | 2.0541          | 1.0       | 171.3369      | 4.6003 |
| 2.3596        | 11    | 8888  | 2.0055          | 1.0       | 173.2188      | 4.9057 |
| 2.2539        | 12    | 9696  | 1.9744          | 1.0       | 173.8705      | 5.1304 |
| 2.2171        | 13    | 10504 | 1.9486          | 1.0       | 172.1368      | 5.2480 |
| 2.1552        | 14    | 11312 | 1.9204          | 1.0       | 171.5559      | 5.5457 |
| 2.1167        | 15    | 12120 | 1.9034          | 1.0       | 174.3264      | 5.6359 |
| 2.0847        | 16    | 12928 | 1.8819          | 1.0       | 172.6913      | 5.8096 |
| 2.0243        | 17    | 13736 | 1.8685          | 1.0       | 171.9891      | 5.8948 |
| 1.9705        | 18    | 14544 | 1.8581          | 1.0       | 173.3638      | 5.9111 |
| 1.9269        | 19    | 15352 | 1.8497          | 1.0       | 174.2880      | 6.0771 |
| 1.889         | 20    | 16160 | 1.8420          | 1.0       | 172.6429      | 6.1777 |
| 1.8598        | 21    | 16968 | 1.8348          | 1.0       | 176.2967      | 6.2359 |
| 1.8036        | 22    | 17776 | 1.8234          | 1.0       | 179.8974      | 6.2606 |
| 1.7798        | 23    | 18584 | 1.8137          | 1.0       | 180.2397      | 6.3606 |
| 1.7423        | 24    | 19392 | 1.8092          | 1.0       | 181.6130      | 6.4088 |
| 1.7068        | 25    | 20200 | 1.8075          | 1.0       | 180.5901      | 6.4412 |
| 1.6844        | 26    | 21008 | 1.8093          | 1.0       | 181.9296      | 6.5005 |
| 1.6669        | 27    | 21816 | 1.8051          | 1.0       | 181.5158      | 6.4800 |
| 1.6396        | 28    | 22624 | 1.7990          | 1.0       | 180.2235      | 6.5655 |
| 1.6157        | 29    | 23432 | 1.8001          | 1.0       | 181.6333      | 6.5861 |
| 1.549         | 30    | 24240 | 1.8034          | 1.0       | 180.0172      | 6.6056 |
| 1.5684        | 31    | 25048 | 1.7994          | 1.0       | 179.4207      | 6.6328 |
| 1.5378        | 32    | 25856 | 1.8069          | 1.0       | 180.3731      | 6.5813 |
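Once Data Size reaches 1.0, the step column advances by 808 per epoch, which implies a training set of roughly 808 × 32 ≈ 25,856 examples (assumption: each optimizer step consumes one full batch of 32). A quick check:

```python
# Rough training-set size implied by the training log (assumption:
# each optimizer step consumes one full batch of 32 examples).
steps_per_epoch = 808          # the step column advances by 808 per epoch
total_train_batch_size = 32    # from the hyperparameters above

examples_per_epoch = steps_per_epoch * total_train_batch_size
print(examples_per_epoch)  # 25856
```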

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
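A minimal inference sketch, assuming the checkpoint is published under the repo id shown in this card and that no task prefix is required (some mT5 fine-tunes expect one; adjust if translations look off):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id from this card.
MODEL_ID = "contemmcm/db020be9e4e35d83e65d19fab946b815"

def translate(text: str) -> str:
    """Translate English text to Italian with the fine-tuned mT5 checkpoint."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example (downloads the checkpoint on first use):
# print(translate("The book was on the table."))
```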