0e61d7a8bf16a3497486736747a532fa

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3967
  • Data Size: 1.0
  • Epoch Runtime: 730.0780 s
  • Bleu: 13.8607
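
For a quick check of the checkpoint, the snippet below is a minimal inference sketch. It assumes the repository id contemmcm/0e61d7a8bf16a3497486736747a532fa and that no task prefix was added during preprocessing; neither detail is confirmed by this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/0e61d7a8bf16a3497486736747a532fa"  # repository id (assumed)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Translate an English sentence to French; beam search usually helps BLEU.
# No task prefix is prepended here, which is an assumption about the
# preprocessing used during fine-tuning.
inputs = tokenizer("The cat sleeps on the sofa.", return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```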

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
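
The exact splits are not documented, but the dataset named in the summary can be loaded as sketched below. The en-fr configuration name follows the standard opus_books layout; how the evaluation set was carved out of it is not stated on this card.

```python
from datasets import load_dataset

# opus_books ships a single "train" split; the evaluation set used for the
# results above was presumably held out from it, but the card does not say how.
dataset = load_dataset("Helsinki-NLP/opus_books", "en-fr")
print(dataset["train"][0]["translation"])  # {'en': '...', 'fr': '...'}
```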

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
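
For reproduction, these values map onto `Seq2SeqTrainingArguments` roughly as sketched below. This assumes the standard `Seq2SeqTrainer` setup and a hypothetical `output_dir`; the author's actual training script is not included with this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-en-fr",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed to compute BLEU at evaluation
)
```

The multi-GPU (4-device) setup from the list above would come from the launcher (e.g. torchrun or accelerate), not from these arguments.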

Training results

| Training Loss | Epoch | Step   | Validation Loss | Data Size | Epoch Runtime (s) | Bleu    |
|:-------------:|:-----:|:------:|:---------------:|:---------:|:-----------------:|:-------:|
| No log        | 0     | 0      | 11.0050         | 0         | 57.2280           | 0.0913  |
| No log        | 1     | 3177   | 9.7755          | 0.0078    | 62.6163           | 0.0603  |
| 0.2428        | 2     | 6354   | 6.4851          | 0.0156    | 69.6654           | 0.4655  |
| 6.1305        | 3     | 9531   | 3.7118          | 0.0312    | 80.4218           | 4.9143  |
| 3.8491        | 4     | 12708  | 2.4897          | 0.0625    | 100.3120          | 7.8902  |
| 3.0563        | 5     | 15885  | 2.2146          | 0.125     | 143.0110          | 6.6310  |
| 2.6465        | 6     | 19062  | 2.0177          | 0.25      | 226.6017          | 7.7631  |
| 2.3886        | 7     | 22239  | 1.8728          | 0.5       | 394.2835          | 8.8893  |
| 2.1613        | 8     | 25416  | 1.7347          | 1.0       | 725.5487          | 10.0712 |
| 1.9867        | 9     | 28593  | 1.6550          | 1.0       | 725.4036          | 10.7525 |
| 1.9215        | 10    | 31770  | 1.6002          | 1.0       | 728.3040          | 11.1860 |
| 1.8169        | 11    | 34947  | 1.5578          | 1.0       | 730.4529          | 11.5433 |
| 1.7411        | 12    | 38124  | 1.5310          | 1.0       | 729.5908          | 11.8498 |
| 1.6666        | 13    | 41301  | 1.5063          | 1.0       | 726.8605          | 12.0989 |
| 1.6302        | 14    | 44478  | 1.4838          | 1.0       | 726.3137          | 12.2911 |
| 1.5987        | 15    | 47655  | 1.4699          | 1.0       | 723.5461          | 12.4585 |
| 1.52          | 16    | 50832  | 1.4574          | 1.0       | 723.2799          | 12.6016 |
| 1.5011        | 17    | 54009  | 1.4438          | 1.0       | 723.0908          | 12.7511 |
| 1.4794        | 18    | 57186  | 1.4335          | 1.0       | 723.4092          | 12.8899 |
| 1.4367        | 19    | 60363  | 1.4250          | 1.0       | 722.8614          | 12.9570 |
| 1.3848        | 20    | 63540  | 1.4155          | 1.0       | 724.5252          | 13.0640 |
| 1.3722        | 21    | 66717  | 1.4153          | 1.0       | 728.9433          | 13.1557 |
| 1.3477        | 22    | 69894  | 1.4033          | 1.0       | 729.3175          | 13.2482 |
| 1.3196        | 23    | 73071  | 1.4030          | 1.0       | 723.1824          | 13.3230 |
| 1.3292        | 24    | 76248  | 1.3959          | 1.0       | 729.4897          | 13.3777 |
| 1.2922        | 25    | 79425  | 1.3921          | 1.0       | 726.9376          | 13.4284 |
| 1.2686        | 26    | 82602  | 1.3853          | 1.0       | 729.3467          | 13.5032 |
| 1.2393        | 27    | 85779  | 1.3907          | 1.0       | 726.2296          | 13.5276 |
| 1.2307        | 28    | 88956  | 1.3850          | 1.0       | 727.6741          | 13.6117 |
| 1.2041        | 29    | 92133  | 1.3881          | 1.0       | 726.9224          | 13.6822 |
| 1.1862        | 30    | 95310  | 1.3891          | 1.0       | 726.4434          | 13.6620 |
| 1.1582        | 31    | 98487  | 1.3991          | 1.0       | 726.7570          | 13.7280 |
| 1.1476        | 32    | 101664 | 1.3815          | 1.0       | 727.6579          | 13.7293 |
| 1.1168        | 33    | 104841 | 1.3921          | 1.0       | 725.2406          | 13.7680 |
| 1.1249        | 34    | 108018 | 1.3928          | 1.0       | 725.8319          | 13.8611 |
| 1.0888        | 35    | 111195 | 1.3979          | 1.0       | 728.4666          | 13.8371 |
| 1.0546        | 36    | 114372 | 1.3967          | 1.0       | 730.0780          | 13.8607 |
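
The log stops at epoch 36 even though num_epochs was set to 50; the card does not say why. For the Bleu column, the sketch below shows one common way to compute the score with the evaluate library's sacreBLEU wrapper; whether this exact implementation produced the numbers above is an assumption.

```python
import evaluate

# sacreBLEU via the `evaluate` library (an assumption; the metric script
# used for this card is not included).
bleu = evaluate.load("sacrebleu")

predictions = ["Le chat dort sur le canapé."]
references = [["Le chat dort sur le canapé."]]  # one reference list per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```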

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model size: 1.0B params (Safetensors, F32 tensors)
