5161859226dc3ff6ffdf0fbcb1427ed5

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [es-no] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6158
  • Data Size: 1.0 (fraction of the training data in use)
  • Epoch Runtime: 25.0782 seconds
  • Bleu: 5.5614
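
The checkpoint can be loaded with the standard Transformers seq2seq classes. The sketch below is a minimal example: the repo id is taken from this card, while the assumption that the model expects a plain Spanish sentence with no task prefix is not documented here.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repo id from this card; input format (plain sentence, no prefix) is assumed.
model_id = "contemmcm/5161859226dc3ff6ffdf0fbcb1427ed5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("¿Dónde está la biblioteca?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```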

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
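
Absent further documentation, the dataset named at the top of this card can be loaded as a standard translation dataset. The sketch below assumes the es-no configuration and the usual opus_books field layout (a "translation" dict keyed by language code), neither of which is confirmed here.

```python
from datasets import load_dataset

# Spanish-Norwegian config of opus_books; the field layout is assumed
# from the opus_books dataset card, not from this model card.
dataset = load_dataset("Helsinki-NLP/opus_books", "es-no")
print(dataset["train"][0]["translation"])  # {'es': '...', 'no': '...'}
```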

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
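
As a rough guide, the hyperparameters above map onto Seq2SeqTrainingArguments as sketched below. The output directory, predict_with_generate, and the 4-GPU launch (e.g. via torchrun, which yields the total batch size of 32) are assumptions, not taken from this card.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-es-no",  # assumed, illustrative name
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumed; needed to score BLEU at eval time
)
```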

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu   |
|---------------|-------|------|-----------------|-----------|-------------------|--------|
| No log        | 0     | 0    | 13.4736         | 0         | 2.4774            | 0.0490 |
| No log        | 1     | 89   | 13.4180         | 0.0078    | 3.0967            | 0.0449 |
| No log        | 2     | 178  | 13.4598         | 0.0156    | 3.4171            | 0.0455 |
| No log        | 3     | 267  | 13.5142         | 0.0312    | 4.7365            | 0.0499 |
| No log        | 4     | 356  | 12.4614         | 0.0625    | 6.2108            | 0.0586 |
| No log        | 5     | 445  | 12.2575         | 0.125     | 7.7312            | 0.0415 |
| 1.2122        | 6     | 534  | 11.8622         | 0.25      | 11.3834           | 0.0697 |
| 5.6492        | 7     | 623  | 10.4287         | 0.5       | 15.5494           | 0.0763 |
| 9.7418        | 8     | 712  | 5.9765          | 1.0       | 25.2768           | 0.3937 |
| 5.6579        | 9     | 801  | 3.8992          | 1.0       | 23.8734           | 3.0899 |
| 5.0175        | 10    | 890  | 3.2819          | 1.0       | 25.0241           | 2.2056 |
| 4.2889        | 11    | 979  | 3.0612          | 1.0       | 26.2374           | 3.1157 |
| 3.8859        | 12    | 1068 | 2.9289          | 1.0       | 24.4179           | 3.5846 |
| 3.6268        | 13    | 1157 | 2.8582          | 1.0       | 24.0828           | 3.9279 |
| 3.5310        | 14    | 1246 | 2.8099          | 1.0       | 24.2889           | 4.1136 |
| 3.3838        | 15    | 1335 | 2.7777          | 1.0       | 23.4939           | 4.3126 |
| 3.2203        | 16    | 1424 | 2.7359          | 1.0       | 24.5260           | 4.5416 |
| 3.1463        | 17    | 1513 | 2.7232          | 1.0       | 24.1557           | 4.4583 |
| 3.0370        | 18    | 1602 | 2.6943          | 1.0       | 23.9128           | 4.6938 |
| 2.9699        | 19    | 1691 | 2.6611          | 1.0       | 23.5548           | 4.7951 |
| 2.8970        | 20    | 1780 | 2.6484          | 1.0       | 23.8592           | 4.9567 |
| 2.7972        | 21    | 1869 | 2.6338          | 1.0       | 24.4856           | 5.1006 |
| 2.7450        | 22    | 1958 | 2.6370          | 1.0       | 24.3166           | 5.2325 |
| 2.7041        | 23    | 2047 | 2.6280          | 1.0       | 23.9898           | 5.1784 |
| 2.6238        | 24    | 2136 | 2.6149          | 1.0       | 24.5929           | 5.3658 |
| 2.5732        | 25    | 2225 | 2.6170          | 1.0       | 24.2029           | 5.4342 |
| 2.5119        | 26    | 2314 | 2.6038          | 1.0       | 24.9785           | 5.3348 |
| 2.4670        | 27    | 2403 | 2.6113          | 1.0       | 25.0814           | 5.4837 |
| 2.4478        | 28    | 2492 | 2.6164          | 1.0       | 23.2993           | 5.4099 |
| 2.3915        | 29    | 2581 | 2.5982          | 1.0       | 23.9281           | 5.5693 |
| 2.3330        | 30    | 2670 | 2.6042          | 1.0       | 24.8486           | 5.5377 |
| 2.2747        | 31    | 2759 | 2.6044          | 1.0       | 26.2227           | 5.5426 |
| 2.2381        | 32    | 2848 | 2.6109          | 1.0       | 23.9335           | 5.5353 |
| 2.2022        | 33    | 2937 | 2.6158          | 1.0       | 25.0782           | 5.5614 |
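
The Bleu column appears to be on the sacrebleu 0-100 scale. Assuming the metric was computed with the evaluate library (the card does not say), the calculation looks like this, with purely hypothetical predictions and references for illustration:

```python
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Hvor er biblioteket?"]   # hypothetical model outputs
references = [["Hvor er biblioteket?"]]  # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```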

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1