f2a254ca87a674f770e3482b904a7336

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ru on the Helsinki-NLP/opus_books [fr-sv] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9077
  • Data Size: 1.0
  • Epoch Runtime: 5.7557
  • Bleu: 0.9303
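The Bleu figures in this card are reported on a 0–1 scale. As a rough illustration of what such a score measures (n-gram precision with a brevity penalty), here is a minimal pure-Python sketch; real evaluations typically use a library such as sacrebleu, and the function below is only an assumption-laden toy version:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(hypothesis, reference, max_n=4):
    """Toy sentence-level BLEU: geometric mean of 1..max_n n-gram
    precisions against a single reference, times a brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped matches
        total = max(sum(hyp_ngrams.values()), 1)
        if overlap == 0:
            return 0.0  # one zero precision zeroes the geometric mean
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A perfect match scores 1.0; a hypothesis sharing no words with the reference scores 0.0, which is why the early epochs in the results table below sit near zero.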

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
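The hyperparameters above can be collected into keyword arguments in the style of transformers' `Seq2SeqTrainingArguments`; this is only a hypothetical reconstruction (the exact training script is not part of this card), but it shows how the reported total batch sizes follow from the per-device sizes and the device count:

```python
# Hypothetical reconstruction of the configuration listed above, using
# transformers-style argument names (sketch only, not the actual script).
training_args = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
}

# With 4 GPUs (distributed_type: multi-GPU) and no gradient accumulation
# reported, the effective batch size is per-device size x device count.
num_devices = 4
total_train_batch_size = training_args["per_device_train_batch_size"] * num_devices
total_eval_batch_size = training_args["per_device_eval_batch_size"] * num_devices
```

This reproduces the `total_train_batch_size: 32` and `total_eval_batch_size: 32` values listed above.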

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 7.9852          | 0         | 1.0313        | 0.0187 |
| No log        | 1     | 75   | 7.0896          | 0.0078    | 1.8169        | 0.0216 |
| No log        | 2     | 150  | 6.4901          | 0.0156    | 1.2675        | 0.0123 |
| No log        | 3     | 225  | 6.0555          | 0.0312    | 1.4153        | 0.0123 |
| No log        | 4     | 300  | 5.5600          | 0.0625    | 1.5753        | 0.0789 |
| No log        | 5     | 375  | 4.9630          | 0.125     | 1.9565        | 0.0373 |
| No log        | 6     | 450  | 4.4028          | 0.25      | 2.7072        | 0.0346 |
| No log        | 7     | 525  | 3.9807          | 0.5       | 3.6940        | 0.0418 |
| 3.7798        | 8.0   | 600  | 3.6331          | 1.0       | 6.3021        | 0.1063 |
| 3.5666        | 9.0   | 675  | 3.4253          | 1.0       | 5.5565        | 0.2843 |
| 3.3259        | 10.0  | 750  | 3.3041          | 1.0       | 5.5804        | 0.3805 |
| 3.1983        | 11.0  | 825  | 3.2043          | 1.0       | 5.9144        | 0.4566 |
| 3.0369        | 12.0  | 900  | 3.1334          | 1.0       | 5.5680        | 0.5230 |
| 2.9113        | 13.0  | 975  | 3.0688          | 1.0       | 6.0732        | 0.5838 |
| 2.814         | 14.0  | 1050 | 3.0244          | 1.0       | 5.6112        | 0.6339 |
| 2.6864        | 15.0  | 1125 | 2.9779          | 1.0       | 5.9027        | 0.6517 |
| 2.5893        | 16.0  | 1200 | 2.9578          | 1.0       | 5.7051        | 0.6521 |
| 2.4946        | 17.0  | 1275 | 2.9415          | 1.0       | 5.9149        | 0.7250 |
| 2.4246        | 18.0  | 1350 | 2.9288          | 1.0       | 5.6919        | 0.7447 |
| 2.3156        | 19.0  | 1425 | 2.9326          | 1.0       | 5.8616        | 0.7507 |
| 2.2499        | 20.0  | 1500 | 2.9000          | 1.0       | 5.9752        | 0.7759 |
| 2.154         | 21.0  | 1575 | 2.9075          | 1.0       | 6.4975        | 0.8442 |
| 2.0921        | 22.0  | 1650 | 2.9200          | 1.0       | 5.8360        | 0.8208 |
| 1.9875        | 23.0  | 1725 | 2.9155          | 1.0       | 5.6037        | 0.9272 |
| 1.93          | 24.0  | 1800 | 2.9077          | 1.0       | 5.7557        | 0.9303 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1