0ee5a78fd2220aa7891c0075b50125fc

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ru on the Helsinki-NLP/opus_books [it-ru] dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6485
  • Data Size: 1.0
  • Epoch Runtime: 28.3177
  • Bleu: 13.4729
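
For quick inspection, a minimal inference sketch using the 🤗 Transformers API is shown below. It assumes the checkpoint is published under the repository id contemmcm/0ee5a78fd2220aa7891c0075b50125fc and uses illustrative generation settings; it is not an official usage snippet from the model authors.

```python
# Minimal inference sketch (repository id and generation settings are assumptions).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/0ee5a78fd2220aa7891c0075b50125fc"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The source-language side should match the training pairing
# (the card lists opus_books [it-ru] on an opus-mt-en-ru base).
inputs = tokenizer("Example source sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```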

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
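
As a rough illustration only, these values map onto 🤗 Transformers Seq2SeqTrainingArguments roughly as sketched below. This is not the actual training script; anything not listed above (the output directory, generation-based evaluation) is an assumption.

```python
# Illustrative mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# output_dir and predict_with_generate are assumptions, not taken from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-en-ru-finetuned",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=8,         # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,          # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",                   # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,            # assumed, so BLEU can be computed at evaluation time
)
```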

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0    | 5.7206          | 0         | 2.6305        | 0.1139  |
| No log        | 1     | 447  | 3.1931          | 0.0078    | 3.0803        | 0.0079  |
| 0.0589        | 2     | 894  | 2.5179          | 0.0156    | 3.1478        | 0.0922  |
| 0.0588        | 3     | 1341 | 2.0219          | 0.0312    | 3.5702        | 0.7033  |
| 0.0823        | 4     | 1788 | 1.7522          | 0.0625    | 4.2978        | 1.5766  |
| 0.1313        | 5     | 2235 | 1.5345          | 0.125     | 5.8991        | 2.4306  |
| 1.5614        | 6     | 2682 | 1.3262          | 0.25      | 9.0927        | 4.0661  |
| 1.2508        | 7     | 3129 | 1.1098          | 0.5       | 15.3096       | 5.5438  |
| 0.9889        | 8     | 3576 | 0.8984          | 1.0       | 28.0396       | 7.8445  |
| 0.8658        | 9     | 4023 | 0.7966          | 1.0       | 27.8408       | 9.1197  |
| 0.7701        | 10    | 4470 | 0.7366          | 1.0       | 27.6390       | 10.1494 |
| 0.6991        | 11    | 4917 | 0.6996          | 1.0       | 28.4780       | 10.8819 |
| 0.6441        | 12    | 5364 | 0.6771          | 1.0       | 27.7038       | 11.3733 |
| 0.5868        | 13    | 5811 | 0.6531          | 1.0       | 29.2186       | 11.8695 |
| 0.5479        | 14    | 6258 | 0.6410          | 1.0       | 28.7030       | 12.3612 |
| 0.519         | 15    | 6705 | 0.6351          | 1.0       | 27.7515       | 12.5152 |
| 0.4724        | 16    | 7152 | 0.6347          | 1.0       | 28.2174       | 12.6411 |
| 0.4515        | 17    | 7599 | 0.6260          | 1.0       | 28.1993       | 12.7953 |
| 0.4262        | 18    | 8046 | 0.6206          | 1.0       | 28.0471       | 13.1307 |
| 0.4082        | 19    | 8493 | 0.6346          | 1.0       | 27.2427       | 13.2125 |
| 0.3806        | 20    | 8940 | 0.6277          | 1.0       | 28.7176       | 13.2173 |
| 0.3685        | 21    | 9387 | 0.6454          | 1.0       | 28.9584       | 13.1375 |
| 0.3311        | 22    | 9834 | 0.6485          | 1.0       | 28.3177       | 13.4729 |
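
The card does not state which BLEU implementation produced these scores. As an assumption, the sketch below shows one common way to compute corpus BLEU for generated translations using the `evaluate` wrapper around sacreBLEU; the example strings are placeholders.

```python
# Hypothetical BLEU scoring sketch; the card does not specify the implementation used.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Пример перевода."]   # model outputs (placeholders)
references = [["Пример перевода."]]  # one or more reference translations per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```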

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model size

  • 0.2B params (F32, Safetensors)