7ea07d0f69a1e5485200faa2f305f4f2

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fr-sv] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.8795
  • Data Size: 1.0
  • Epoch Runtime: 21.3988
  • Bleu: 4.4021

Model description

More information needed

Intended uses & limitations

More information needed
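Until the card documents usage, a minimal inference sketch may help. It assumes the checkpoint loads like any umT5 seq2seq model under the repo id shown on this page; whether a task prefix (e.g. "translate French to Swedish: ") was used during fine-tuning is not stated, so treat the prompt format as an assumption:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id taken from this model page; adjust if the checkpoint moves.
model_id = "contemmcm/7ea07d0f69a1e5485200faa2f305f4f2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The prompt format below is an assumption -- the card does not document one.
inputs = tokenizer("translate French to Swedish: Bonjour le monde.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Running this downloads the ~1.0B-parameter checkpoint, so a GPU (or patience) is advisable.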

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
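The per-device and total batch sizes above are consistent, and together with the step counts in the results table they give a rough dataset size. A quick arithmetic check (assuming no gradient accumulation; the 75 optimizer steps per full-data epoch come from the table, e.g. step 600 at epoch 8 to step 675 at epoch 9):

```python
train_batch_size = 8          # per device
num_devices = 4
total_train_batch_size = train_batch_size * num_devices
print(total_train_batch_size)  # 32, matching the reported total

steps_per_epoch = 75           # at Data Size 1.0, per the results table
approx_train_examples = steps_per_epoch * total_train_batch_size
print(approx_train_examples)   # roughly 2400 fr-sv pairs at full data size
```

The "Data Size" column in the results table appears to ramp the training-set fraction from ~0.008 up to 1.0 over the first eight epochs, which is why early epochs log fewer steps.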

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log | 0 | 0 | 12.6891 | 0 | 2.1175 | 0.0119 |
| No log | 1 | 75 | 12.6997 | 0.0078 | 2.4211 | 0.0247 |
| No log | 2 | 150 | 12.4566 | 0.0156 | 3.0919 | 0.0122 |
| No log | 3 | 225 | 12.4297 | 0.0312 | 4.4245 | 0.0402 |
| No log | 4 | 300 | 11.7860 | 0.0625 | 5.3923 | 0.0366 |
| No log | 5 | 375 | 11.5196 | 0.125 | 7.3997 | 0.0333 |
| No log | 6 | 450 | 10.7898 | 0.25 | 9.9374 | 0.0289 |
| No log | 7 | 525 | 8.5665 | 0.5 | 13.8185 | 0.0768 |
| 8.4747 | 8.0 | 600 | 5.9402 | 1.0 | 22.1808 | 0.4226 |
| 6.5483 | 9.0 | 675 | 4.3072 | 1.0 | 20.1638 | 2.2010 |
| 5.0775 | 10.0 | 750 | 3.6728 | 1.0 | 20.1892 | 4.4350 |
| 4.6807 | 11.0 | 825 | 3.3535 | 1.0 | 21.4876 | 5.0813 |
| 4.2284 | 12.0 | 900 | 3.2363 | 1.0 | 20.5669 | 2.7329 |
| 4.0264 | 13.0 | 975 | 3.1474 | 1.0 | 20.7894 | 2.9965 |
| 3.8285 | 14.0 | 1050 | 3.0936 | 1.0 | 20.9474 | 3.1947 |
| 3.6863 | 15.0 | 1125 | 3.0358 | 1.0 | 20.6891 | 3.4564 |
| 3.5269 | 16.0 | 1200 | 3.0083 | 1.0 | 21.6643 | 3.4921 |
| 3.4742 | 17.0 | 1275 | 2.9808 | 1.0 | 20.0551 | 3.6284 |
| 3.382 | 18.0 | 1350 | 2.9654 | 1.0 | 20.7244 | 3.6758 |
| 3.2942 | 19.0 | 1425 | 2.9437 | 1.0 | 21.6724 | 3.8500 |
| 3.2333 | 20.0 | 1500 | 2.9292 | 1.0 | 21.4422 | 3.8753 |
| 3.1626 | 21.0 | 1575 | 2.9188 | 1.0 | 20.4945 | 3.8748 |
| 3.1052 | 22.0 | 1650 | 2.9139 | 1.0 | 21.0036 | 3.9785 |
| 3.0291 | 23.0 | 1725 | 2.9109 | 1.0 | 21.3788 | 4.0008 |
| 2.9656 | 24.0 | 1800 | 2.9018 | 1.0 | 21.3974 | 4.0299 |
| 2.9288 | 25.0 | 1875 | 2.8930 | 1.0 | 20.7319 | 4.1182 |
| 2.8649 | 26.0 | 1950 | 2.8929 | 1.0 | 20.2559 | 4.1961 |
| 2.8202 | 27.0 | 2025 | 2.8883 | 1.0 | 20.3763 | 4.2134 |
| 2.7626 | 28.0 | 2100 | 2.8825 | 1.0 | 20.6661 | 4.3136 |
| 2.6921 | 29.0 | 2175 | 2.8747 | 1.0 | 21.0836 | 4.2935 |
| 2.6788 | 30.0 | 2250 | 2.8861 | 1.0 | 22.7795 | 4.2313 |
| 2.6414 | 31.0 | 2325 | 2.8776 | 1.0 | 19.8240 | 4.3317 |
| 2.605 | 32.0 | 2400 | 2.8784 | 1.0 | 20.4296 | 4.3686 |
| 2.5408 | 33.0 | 2475 | 2.8795 | 1.0 | 21.3988 | 4.4021 |
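The card does not say which BLEU implementation produced the scores above (sacreBLEU is the usual choice in Transformers evaluation). For intuition, here is a simplified pure-Python sketch of corpus BLEU — clipped n-gram precisions up to 4-grams, their geometric mean, and a brevity penalty — assuming one reference per hypothesis and pre-tokenized input; real implementations add smoothing and standardized tokenization:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def corpus_bleu(references, hypotheses, max_n=4):
    """Simplified corpus BLEU on a 0-100 scale (no smoothing)."""
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            ref_counts = Counter(ngrams(ref, n))
            hyp_counts = Counter(ngrams(hyp, n))
            # Clip each hypothesis n-gram count by its reference count.
            matches[n - 1] += sum(min(c, ref_counts[g])
                                  for g, c in hyp_counts.items())
            totals[n - 1] += max(len(hyp) - n + 1, 0)
    if min(matches) == 0:   # any empty precision -> score 0 without smoothing
        return 0.0
    log_precision = sum(math.log(m / t)
                        for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_precision)

# One substituted word out of six (illustrative Swedish-like tokens):
print(round(corpus_bleu([["det", "var", "en", "gång", "en", "katt"]],
                        [["det", "var", "en", "gång", "en", "hund"]]), 2))
```

On this scale the final checkpoint's Bleu of 4.4021 is low in absolute terms, which is typical for a base-size model on a small opus_books language pair.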

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size
  • 1.0B params (Safetensors, F32)

Model tree for contemmcm/7ea07d0f69a1e5485200faa2f305f4f2

  • Base model: google/umt5-base