f9d7db2b7ddca1778c918d3f359208f5

This model is a fine-tuned version of google-t5/t5-large on the Italian-Russian (it-ru) pair of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8352
  • Data Size: 1.0 (fraction of the training data used)
  • Epoch Runtime: 197.4582 seconds
  • Bleu: 13.1874
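
For reference, here is a minimal inference sketch. The repository id matches this card; the "translate Italian to Russian: " task prefix is an assumption, since the training script is not documented and T5 fine-tunes do not always use a prefix.

```python
# Minimal inference sketch; the task prefix below is an assumption.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/f9d7db2b7ddca1778c918d3f359208f5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate Italian to Russian: Il gatto dorme sul divano."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```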

Model description

A google-t5/t5-large sequence-to-sequence model (about 0.8B parameters, stored as F32 safetensors) fine-tuned for Italian-to-Russian translation on the OPUS Books corpus. No further details have been provided.

Intended uses & limitations

The model is intended for Italian-to-Russian machine translation. Its limitations have not been documented; since OPUS Books consists of copyright-free literary texts, performance on other domains may be lower.

Training and evaluation data

The model was trained and evaluated on the it-ru pair of Helsinki-NLP/opus_books, a corpus of sentence-aligned translations of copyright-free books. The Data Size values in the results table below indicate that training ramped up from small fractions of the data to the full set from epoch 8 onward.
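
A minimal sketch for loading the pair, assuming the standard opus_books configuration name:

```python
# Loads the Italian-Russian pair of OPUS Books; the dataset ships a single
# "train" split, so the evaluation split is presumably carved out manually.
from datasets import load_dataset

dataset = load_dataset("Helsinki-NLP/opus_books", "it-ru")
print(dataset["train"][0]["translation"])  # {'it': '...', 'ru': '...'}
```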

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a training-arguments sketch mirroring them follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
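
A sketch (not the original training script) of how these settings map onto transformers' Seq2SeqTrainingArguments; the output directory and the predict_with_generate flag are assumptions.

```python
# Sketch mirroring the listed hyperparameters; output_dir is hypothetical.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-large-opus-books-it-ru",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=8,           # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,              # needed to compute BLEU at eval time
)
```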

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime (s) | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-----------------:|:-------:|
| No log        | 0     | 0     | 3.0883          | 0         | 14.1751           | 0.4827  |
| No log        | 1     | 447   | 2.3676          | 0.0078    | 16.1659           | 1.0953  |
| 0.0413        | 2     | 894   | 1.9266          | 0.0156    | 17.9630           | 1.6517  |
| 0.0469        | 3     | 1341  | 1.7889          | 0.0312    | 21.8603           | 1.7657  |
| 0.0739        | 4     | 1788  | 1.6813          | 0.0625    | 27.1166           | 2.2834  |
| 0.1276        | 5     | 2235  | 1.5740          | 0.125     | 39.6788           | 3.2473  |
| 1.6784        | 6     | 2682  | 1.4593          | 0.25      | 63.4252           | 4.4493  |
| 1.5029        | 7     | 3129  | 1.3210          | 0.5       | 105.9890          | 5.5874  |
| 1.2876        | 8     | 3576  | 1.1613          | 1.0       | 193.8574          | 7.4531  |
| 1.1839        | 9     | 4023  | 1.0697          | 1.0       | 193.1934          | 8.5501  |
| 1.0911        | 10    | 4470  | 1.0156          | 1.0       | 197.1107          | 9.3000  |
| 1.0205        | 11    | 4917  | 0.9645          | 1.0       | 190.9796          | 9.7673  |
| 0.9745        | 12    | 5364  | 0.9334          | 1.0       | 189.2375          | 10.1733 |
| 0.9031        | 13    | 5811  | 0.9056          | 1.0       | 194.9996          | 10.8554 |
| 0.8574        | 14    | 6258  | 0.8889          | 1.0       | 187.1327          | 11.1684 |
| 0.825         | 15    | 6705  | 0.8677          | 1.0       | 197.2455          | 11.4799 |
| 0.7736        | 16    | 7152  | 0.8537          | 1.0       | 200.4439          | 11.8776 |
| 0.7444        | 17    | 7599  | 0.8436          | 1.0       | 196.0900          | 11.9945 |
| 0.7215        | 18    | 8046  | 0.8334          | 1.0       | 197.2098          | 12.1491 |
| 0.6925        | 19    | 8493  | 0.8372          | 1.0       | 206.5470          | 12.3215 |
| 0.6551        | 20    | 8940  | 0.8323          | 1.0       | 197.6972          | 12.6482 |
| 0.6515        | 21    | 9387  | 0.8252          | 1.0       | 195.9774          | 12.6647 |
| 0.5968        | 22    | 9834  | 0.8207          | 1.0       | 203.3854          | 12.7550 |
| 0.5777        | 23    | 10281 | 0.8197          | 1.0       | 201.8217          | 12.9205 |
| 0.5442        | 24    | 10728 | 0.8301          | 1.0       | 199.0957          | 12.8124 |
| 0.5252        | 25    | 11175 | 0.8256          | 1.0       | 198.2719          | 13.0060 |
| 0.5003        | 26    | 11622 | 0.8358          | 1.0       | 200.2850          | 13.0495 |
| 0.5018        | 27    | 12069 | 0.8352          | 1.0       | 197.4582          | 13.1874 |
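
The Bleu scores appear to be on the 0-100 sacreBLEU scale. A minimal sketch of how such scores can be computed with the evaluate library; whether this exact metric configuration was used for the card's numbers is an assumption.

```python
# Minimal BLEU sketch with the `evaluate` library; the exact sacrebleu
# configuration used for this card's scores is an assumption.
import evaluate

metric = evaluate.load("sacrebleu")
predictions = ["Кот спит на диване."]    # decoded model outputs
references = [["Кот спит на диване."]]   # list of references per prediction
print(metric.compute(predictions=predictions, references=references)["score"])
```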

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
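
A hypothetical pinned install matching the versions above (the CUDA build of torch, 2.8.0+cu128 here, will vary by machine):

```bash
pip install "transformers==4.57.0" "datasets==4.2.0" "tokenizers==0.22.1" "torch==2.8.0"
```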