c11299132795dddc922e3afc1c391c93

This model is a fine-tuned version of facebook/mbart-large-50-many-to-one-mmt on the Helsinki-NLP/opus_books [en-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4782
  • Data Size: 1.0
  • Epoch Runtime: 14.1248
  • Bleu: 30.5773

Model description

More information needed

Intended uses & limitations

More information needed
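
The card ships no usage code; below is a minimal en→pt inference sketch, assuming the checkpoint follows standard mBART-50 conventions (the repository id is taken from the card; the en_XX source-language code comes from the base model's tokenizer).

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/c11299132795dddc922e3afc1c391c93"

# Standard mBART-50 loading; assumes the fine-tune kept the base tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

tokenizer.src_lang = "en_XX"  # mBART-50 language code for English input
inputs = tokenizer("The book lay open on the table.", return_tensors="pt")

# Depending on how the fine-tune was configured, forcing the Portuguese
# target code may help: forced_bos_token_id=tokenizer.lang_code_to_id["pt_XX"]
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=128)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
```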

Training and evaluation data

More information needed
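
The dataset named at the top of the card is Helsinki-NLP/opus_books with the en-pt configuration; a minimal sketch of loading it follows, with the caveat that the exact split and preprocessing used for this fine-tune are not documented.

```python
from datasets import load_dataset

# en-pt configuration of OPUS Books; the train/eval split actually used
# for this fine-tune is not documented in the card.
dataset = load_dataset("Helsinki-NLP/opus_books", "en-pt")
print(dataset["train"][0]["translation"])  # {'en': '...', 'pt': '...'}
```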

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch reconstructing them as Seq2SeqTrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
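
The training script itself is not included; the sketch below reconstructs the listed hyperparameters as a Seq2SeqTrainingArguments configuration. The output_dir is a placeholder, and the per-device batch size of 8 across 4 GPUs accounts for the total batch size of 32.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the listed hyperparameters; the actual
# training script is not part of this card.
training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-en-pt-opus-books",  # placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,   # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # needed so evaluation can compute BLEU
)
```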

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0    | 6.7226          | 0         | 1.3572        | 0.8420  |
| No log        | 1     | 35   | 6.0182          | 0.0078    | 1.7719        | 1.1671  |
| No log        | 2     | 70   | 5.6072          | 0.0156    | 3.3959        | 1.3022  |
| No log        | 3     | 105  | 5.0054          | 0.0312    | 4.6290        | 2.2814  |
| No log        | 4     | 140  | 4.5388          | 0.0625    | 5.5968        | 2.9079  |
| No log        | 5     | 175  | 4.0266          | 0.125     | 7.3799        | 3.7518  |
| No log        | 6     | 210  | 3.4877          | 0.25      | 9.9524        | 5.0411  |
| No log        | 7     | 245  | 2.8832          | 0.5       | 10.7647       | 7.5047  |
| 0.6501        | 8.0   | 280  | 2.4228          | 1.0       | 13.8585       | 11.3245 |
| 2.019         | 9.0   | 315  | 2.2550          | 1.0       | 13.4006       | 14.1053 |
| 1.2677        | 10.0  | 350  | 2.2912          | 1.0       | 13.3583       | 27.0282 |
| 1.2677        | 11.0  | 385  | 2.3061          | 1.0       | 14.4466       | 28.5896 |
| 0.7065        | 12.0  | 420  | 2.3487          | 1.0       | 13.4923       | 31.4395 |
| 0.4309        | 13.0  | 455  | 2.4782          | 1.0       | 14.1248       | 30.5773 |
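
The card does not state which BLEU implementation produced the scores above; a plausible scoring setup uses sacrebleu via the evaluate library, sketched below with placeholder sentences.

```python
import evaluate

# sacrebleu is the usual BLEU backend for machine translation; the card
# does not confirm which implementation produced the reported scores.
bleu = evaluate.load("sacrebleu")
predictions = ["O livro estava aberto sobre a mesa."]       # model output
references = [["O livro estava aberto em cima da mesa."]]   # gold reference
print(bleu.compute(predictions=predictions, references=references)["score"])
```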

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1