f64ecc4aeb51b053cecae47d815c724a

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ru on the Helsinki-NLP/opus_books [it-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6936
  • Data Size: 1.0
  • Epoch Runtime: 4.7638
  • Bleu: 0.6405
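
To try the checkpoint, a minimal inference sketch with the `transformers` pipeline API is shown below. The repo id is the one from this card; loading it downloads the checkpoint from the Hugging Face Hub, and the example sentence is only an illustration.

```python
from transformers import pipeline


def translate(text, model_id="contemmcm/f64ecc4aeb51b053cecae47d815c724a"):
    """Translate a sentence with the fine-tuned Marian checkpoint.

    Note: instantiating the pipeline fetches the model from the Hub
    on first use, so this requires network access.
    """
    translator = pipeline("translation", model=model_id)
    return translator(text)[0]["translation_text"]


# Example (downloads the checkpoint on first call):
# print(translate("Il gatto dorme sul divano."))
```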

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
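
As a sketch only, the list above maps onto the Hugging Face `Seq2SeqTrainingArguments` API roughly as follows; the `output_dir` is a placeholder, and the per-device batch size of 8 across 4 GPUs yields the total batch size of 32 listed above.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the arguments implied by this card;
# parameter names follow the Hugging Face Trainer API.
args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-finetune",   # placeholder path, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```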

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 8.4134          | 0         | 0.9973        | 0.0133 |
| No log        | 1     | 58   | 7.3532          | 0.0078    | 1.6767        | 0.0499 |
| No log        | 2     | 116  | 6.8972          | 0.0156    | 1.2262        | 0.1500 |
| No log        | 3     | 174  | 6.4281          | 0.0312    | 1.2670        | 0.1453 |
| No log        | 4     | 232  | 5.8248          | 0.0625    | 1.5846        | 0.0648 |
| No log        | 5     | 290  | 5.0911          | 0.125     | 1.7803        | 0.1366 |
| 0.4997        | 6     | 348  | 4.4645          | 0.25      | 2.3705        | 0.0906 |
| 0.6531        | 7     | 406  | 3.9468          | 0.5       | 3.2073        | 0.1052 |
| 2.883         | 8.0   | 464  | 3.5137          | 1.0       | 5.1638        | 0.1142 |
| 3.611         | 9.0   | 522  | 3.2992          | 1.0       | 4.9019        | 0.1352 |
| 3.3946        | 10.0  | 580  | 3.1590          | 1.0       | 4.4863        | 0.2642 |
| 3.2628        | 11.0  | 638  | 3.0634          | 1.0       | 4.5552        | 0.1885 |
| 3.1405        | 12.0  | 696  | 2.9958          | 1.0       | 4.7116        | 0.3218 |
| 2.9724        | 13.0  | 754  | 2.9324          | 1.0       | 4.8064        | 0.3044 |
| 2.8922        | 14.0  | 812  | 2.8916          | 1.0       | 4.6985        | 0.3649 |
| 2.8167        | 15.0  | 870  | 2.8495          | 1.0       | 4.8551        | 0.3981 |
| 2.7602        | 16.0  | 928  | 2.8170          | 1.0       | 5.1181        | 0.4298 |
| 2.712         | 17.0  | 986  | 2.7919          | 1.0       | 5.2093        | 0.4582 |
| 2.6443        | 18.0  | 1044 | 2.7571          | 1.0       | 5.0125        | 0.5180 |
| 2.5507        | 19.0  | 1102 | 2.7485          | 1.0       | 4.9770        | 0.5086 |
| 2.4848        | 20.0  | 1160 | 2.7351          | 1.0       | 5.3925        | 0.5671 |
| 2.4457        | 21.0  | 1218 | 2.7130          | 1.0       | 5.2506        | 0.5826 |
| 2.398         | 22.0  | 1276 | 2.7127          | 1.0       | 4.9315        | 0.6359 |
| 2.3506        | 23.0  | 1334 | 2.7075          | 1.0       | 4.8651        | 0.5891 |
| 2.3122        | 24.0  | 1392 | 2.6896          | 1.0       | 4.6772        | 0.6432 |
| 2.2335        | 25.0  | 1450 | 2.7065          | 1.0       | 4.7600        | 0.6275 |
| 2.1836        | 26.0  | 1508 | 2.6983          | 1.0       | 4.8372        | 0.6173 |
| 2.127         | 27.0  | 1566 | 2.6906          | 1.0       | 4.6869        | 0.6627 |
| 2.0998        | 28.0  | 1624 | 2.6936          | 1.0       | 4.7638        | 0.6405 |
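
The Bleu column tracks translation quality on the evaluation set. As a reminder of what the metric computes, here is a minimal, self-contained sentence-level BLEU sketch; it is deliberately simplified (single reference, no smoothing), whereas the scores in this card would come from a proper corpus-level implementation such as sacrebleu.

```python
import math
from collections import Counter


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped matches
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # no smoothing: any empty precision zeroes the score
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_avg)


print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # → 1.0
```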

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1