dda0f36ebebba3f91ececf49fde6706a

This model is a fine-tuned version of google-t5/t5-base on the it-ru subset of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9009
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 111.4918 s
  • BLEU: 11.5587
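
As a quick sanity check, the checkpoint can be loaded for inference as below. This is a minimal sketch: the task prefix is an assumption (the exact prompt used during training is not documented here), and the example sentence is hypothetical.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/dda0f36ebebba3f91ececf49fde6706a"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumption: the standard T5-style task prefix was used during fine-tuning;
# check the training script for the exact prompt format.
text = "translate Italian to Russian: Il gatto dorme sul divano."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```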

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a code sketch reproducing them follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
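
A hedged reconstruction of this configuration as Seq2SeqTrainingArguments; the output_dir name and the predict_with_generate flag are assumptions, and the 4-GPU distribution would come from launching with torchrun rather than from an argument:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books-it-ru",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # total 32 = 8 per device x 4 GPUs
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumption: needed to compute BLEU at eval time
)
```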

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-----------------:|:-------:|
| No log        | 0     | 0     | 3.3957          | 0         | 8.6733            | 0.4226  |
| No log        | 1     | 447   | 2.9355          | 0.0078    | 10.7657           | 0.4165  |
| 0.0488        | 2     | 894   | 2.3432          | 0.0156    | 10.3163           | 0.4884  |
| 0.0573        | 3     | 1341  | 2.2212          | 0.0312    | 12.3303           | 0.8807  |
| 0.0905        | 4     | 1788  | 2.0793          | 0.0625    | 15.5077           | 0.8417  |
| 0.1563        | 5     | 2235  | 1.9609          | 0.125     | 21.8305           | 1.0791  |
| 2.0802        | 6     | 2682  | 1.8493          | 0.25      | 34.7633           | 2.0794  |
| 1.9014        | 7     | 3129  | 1.7160          | 0.5       | 59.9261           | 2.9115  |
| 1.7279        | 8.0   | 3576  | 1.5688          | 1.0       | 111.8558          | 3.9321  |
| 1.6258        | 9.0   | 4023  | 1.4644          | 1.0       | 106.2814          | 4.6714  |
| 1.544         | 10.0  | 4470  | 1.3874          | 1.0       | 112.3671          | 5.3961  |
| 1.4653        | 11.0  | 4917  | 1.3200          | 1.0       | 107.5847          | 5.8752  |
| 1.4083        | 12.0  | 5364  | 1.2686          | 1.0       | 109.5193          | 6.4212  |
| 1.342         | 13.0  | 5811  | 1.2232          | 1.0       | 116.3281          | 6.8981  |
| 1.2931        | 14.0  | 6258  | 1.1928          | 1.0       | 121.9684          | 7.1075  |
| 1.263         | 15.0  | 6705  | 1.1596          | 1.0       | 120.9998          | 7.5103  |
| 1.2135        | 16.0  | 7152  | 1.1288          | 1.0       | 105.8727          | 7.8518  |
| 1.1871        | 17.0  | 7599  | 1.1031          | 1.0       | 105.8427          | 7.9929  |
| 1.1637        | 18.0  | 8046  | 1.0857          | 1.0       | 111.1980          | 8.2713  |
| 1.1241        | 19.0  | 8493  | 1.0636          | 1.0       | 106.4765          | 8.5642  |
| 1.0953        | 20.0  | 8940  | 1.0499          | 1.0       | 107.8300          | 8.7800  |
| 1.0958        | 21.0  | 9387  | 1.0354          | 1.0       | 105.1607          | 8.8858  |
| 1.0353        | 22.0  | 9834  | 1.0202          | 1.0       | 106.5439          | 9.0909  |
| 1.0098        | 23.0  | 10281 | 1.0077          | 1.0       | 101.3376          | 9.2164  |
| 0.9925        | 24.0  | 10728 | 0.9981          | 1.0       | 108.5114          | 9.4437  |
| 0.9704        | 25.0  | 11175 | 0.9906          | 1.0       | 104.2813          | 9.6297  |
| 0.9426        | 26.0  | 11622 | 0.9761          | 1.0       | 106.7828          | 9.8129  |
| 0.9492        | 27.0  | 12069 | 0.9694          | 1.0       | 107.0763          | 9.9101  |
| 0.9029        | 28.0  | 12516 | 0.9657          | 1.0       | 105.2038          | 10.1075 |
| 0.8811        | 29.0  | 12963 | 0.9548          | 1.0       | 106.0641          | 10.0841 |
| 0.8678        | 30.0  | 13410 | 0.9499          | 1.0       | 110.9597          | 10.2664 |
| 0.8721        | 31.0  | 13857 | 0.9407          | 1.0       | 105.4958          | 10.3908 |
| 0.8381        | 32.0  | 14304 | 0.9407          | 1.0       | 102.9635          | 10.4487 |
| 0.8413        | 33.0  | 14751 | 0.9335          | 1.0       | 105.2890          | 10.4928 |
| 0.8143        | 34.0  | 15198 | 0.9322          | 1.0       | 104.6580          | 10.6494 |
| 0.7941        | 35.0  | 15645 | 0.9236          | 1.0       | 111.0841          | 10.7106 |
| 0.7828        | 36.0  | 16092 | 0.9175          | 1.0       | 100.8563          | 10.7893 |
| 0.7575        | 37.0  | 16539 | 0.9191          | 1.0       | 104.8849          | 10.9428 |
| 0.7695        | 38.0  | 16986 | 0.9181          | 1.0       | 106.6565          | 11.0279 |
| 0.744         | 39.0  | 17433 | 0.9089          | 1.0       | 109.3794          | 11.1432 |
| 0.7226        | 40.0  | 17880 | 0.9080          | 1.0       | 109.9170          | 11.1099 |
| 0.7161        | 41.0  | 18327 | 0.9049          | 1.0       | 103.7046          | 11.2632 |
| 0.698         | 42.0  | 18774 | 0.9077          | 1.0       | 102.1417          | 11.2277 |
| 0.701         | 43.0  | 19221 | 0.9150          | 1.0       | 106.4206          | 11.2808 |
| 0.6753        | 44.0  | 19668 | 0.9004          | 1.0       | 105.7527          | 11.3886 |
| 0.6702        | 45.0  | 20115 | 0.9061          | 1.0       | 108.0678          | 11.4461 |
| 0.6565        | 46.0  | 20562 | 0.9050          | 1.0       | 105.7159          | 11.5096 |
| 0.6572        | 47.0  | 21009 | 0.9049          | 1.0       | 106.0853          | 11.5223 |
| 0.6343        | 48.0  | 21456 | 0.9009          | 1.0       | 111.4918          | 11.5587 |
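
The BLEU column can be reproduced with the evaluate library's sacrebleu metric. A minimal sketch; the example strings are hypothetical, and the exact metric configuration used for this card is an assumption:

```python
import evaluate

bleu = evaluate.load("sacrebleu")

predictions = ["кошка спит на диване"]   # model outputs (hypothetical)
references = [["кошка спит на диване"]]  # one list of reference strings per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```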

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1