ea1838e5ab58a6875f95363307574c71

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fr-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7244
  • Data Size: 1.0
  • Epoch Runtime: 227.0987
  • Bleu: 11.3478
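The card does not include usage instructions, but a fine-tuned umT5 checkpoint can typically be loaded with transformers as in the minimal sketch below. The repo id is taken from this card's hub path, and the absence of a task prefix is an assumption, since the card does not document how inputs were preprocessed during fine-tuning.

```python
# Minimal fr->nl inference sketch (assumptions: repo id from this card's
# hub path; no task prefix, since the input format is not documented).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/ea1838e5ab58a6875f95363307574c71"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Bonjour, comment allez-vous ?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```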

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
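The dataset named at the top of the card, Helsinki-NLP/opus_books with the fr-nl configuration, can be loaded as sketched below. Note that opus_books ships only a train split, so the 90/10 holdout here is an illustrative assumption, not the split actually used to produce the evaluation numbers above.

```python
# Sketch: loading the Helsinki-NLP/opus_books fr-nl pairs named in this card.
# opus_books has only a "train" split; this 90/10 holdout is an assumption,
# as the card does not document the actual train/eval split.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "fr-nl", split="train")
splits = raw.train_test_split(test_size=0.1, seed=42)
print(splits["train"][0]["translation"])  # {'fr': '...', 'nl': '...'}
```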

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
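These settings map onto transformers training arguments roughly as in the sketch below. The output_dir name and the predict_with_generate flag are assumptions (the latter because BLEU is reported); per-device batch sizes of 8 over 4 GPUs account for the total batch size of 32. Although num_epochs is 50, the results table below stops at epoch 35, so training appears to have ended early.

```python
# Sketch: Seq2SeqTrainingArguments mirroring the hyperparameters listed above.
# Anything not listed in the card is left at library defaults or is a
# labeled assumption.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fr-nl",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumed, since BLEU is reported
)
```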

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0     | 11.8501         | 0         | 18.7319       | 0.2531  |
| No log        | 1     | 1000  | 11.4087         | 0.0078    | 20.4448       | 0.2484  |
| No log        | 2     | 2000  | 10.4134         | 0.0156    | 22.6671       | 0.2688  |
| No log        | 3     | 3000  | 9.1732          | 0.0312    | 26.5250       | 0.3036  |
| 0.4887        | 4     | 4000  | 6.6378          | 0.0625    | 34.0548       | 0.3509  |
| 5.6661        | 5     | 5000  | 3.6067          | 0.125     | 46.7000       | 5.2304  |
| 0.2394        | 6     | 6000  | 2.7806          | 0.25      | 72.6359       | 4.5796  |
| 0.2916        | 7     | 7000  | 2.4648          | 0.5       | 124.0546      | 6.0043  |
| 2.8366        | 8.0   | 8000  | 2.2451          | 1.0       | 228.4455      | 7.1941  |
| 2.6234        | 9.0   | 9000  | 2.1357          | 1.0       | 227.8906      | 7.8593  |
| 2.474         | 10.0  | 10000 | 2.0542          | 1.0       | 227.9562      | 8.3096  |
| 2.383         | 11.0  | 11000 | 1.9929          | 1.0       | 231.0268      | 8.7678  |
| 2.2796        | 12.0  | 12000 | 1.9503          | 1.0       | 228.0446      | 9.0394  |
| 2.2007        | 13.0  | 13000 | 1.9234          | 1.0       | 229.5092      | 9.3699  |
| 2.121         | 14.0  | 14000 | 1.8852          | 1.0       | 235.1929      | 9.6404  |
| 2.0367        | 15.0  | 15000 | 1.8570          | 1.0       | 235.1429      | 9.8445  |
| 2.017         | 16.0  | 16000 | 1.8366          | 1.0       | 232.0961      | 9.9998  |
| 1.9158        | 17.0  | 17000 | 1.8234          | 1.0       | 230.5636      | 10.1654 |
| 1.9261        | 18.0  | 18000 | 1.8024          | 1.0       | 232.6199      | 10.2223 |
| 1.8678        | 19.0  | 19000 | 1.7901          | 1.0       | 231.7953      | 10.3140 |
| 1.8087        | 20.0  | 20000 | 1.7833          | 1.0       | 231.0188      | 10.4957 |
| 1.7884        | 21.0  | 21000 | 1.7631          | 1.0       | 235.1509      | 10.6346 |
| 1.7503        | 22.0  | 22000 | 1.7554          | 1.0       | 232.4435      | 10.6599 |
| 1.7083        | 23.0  | 23000 | 1.7517          | 1.0       | 231.4336      | 10.7470 |
| 1.662         | 24.0  | 24000 | 1.7433          | 1.0       | 232.1989      | 10.7994 |
| 1.675         | 25.0  | 25000 | 1.7371          | 1.0       | 232.1738      | 10.9347 |
| 1.6014        | 26.0  | 26000 | 1.7306          | 1.0       | 229.3544      | 11.0011 |
| 1.5773        | 27.0  | 27000 | 1.7357          | 1.0       | 231.6688      | 11.0416 |
| 1.5321        | 28.0  | 28000 | 1.7332          | 1.0       | 230.5063      | 11.0622 |
| 1.5308        | 29.0  | 29000 | 1.7259          | 1.0       | 229.9135      | 11.1018 |
| 1.48          | 30.0  | 30000 | 1.7305          | 1.0       | 231.0126      | 11.0951 |
| 1.4994        | 31.0  | 31000 | 1.7169          | 1.0       | 229.2552      | 11.2487 |
| 1.4482        | 32.0  | 32000 | 1.7227          | 1.0       | 231.7054      | 11.1848 |
| 1.4089        | 33.0  | 33000 | 1.7237          | 1.0       | 228.0267      | 11.2684 |
| 1.3907        | 34.0  | 34000 | 1.7234          | 1.0       | 227.4852      | 11.2950 |
| 1.3779        | 35.0  | 35000 | 1.7244          | 1.0       | 227.0987      | 11.3478 |
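The exact BLEU setup behind the Bleu column is not documented. A common way to compute a comparable corpus-level score with the evaluate library is sketched below; treat it as illustrative, not as the configuration used for this card.

```python
# Sketch: corpus BLEU via the evaluate library's sacrebleu metric.
# The predictions/references here are hypothetical placeholders.
import evaluate

bleu = evaluate.load("sacrebleu")
preds = ["De kat zit op de mat."]    # model outputs, one string per example
refs = [["De kat zit op de mat."]]   # list of reference lists per example
print(bleu.compute(predictions=preds, references=refs)["score"])
```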

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1