520b14835eced208a78ef9e8f2f99d7a

This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [fr-it] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7937
  • Data Size: 1.0 (fraction of the full training set used)
  • Epoch Runtime: 58.5370 s
  • BLEU: 5.6361
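
For reference, a minimal usage sketch with the transformers library. The repo id below is taken from this page's model tree, and whether the model expects a task prefix (e.g. "translate French to Italian: ") is not documented in this card, so plain input is assumed.

```python
# A minimal sketch, not the documented usage: assumes the checkpoint is
# published under the repo id shown on this page and that no task prefix
# was used during fine-tuning.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/520b14835eced208a78ef9e8f2f99d7a"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# French source sentence -> Italian translation
inputs = tokenizer("Bonjour, comment allez-vous ?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Given the final BLEU of 5.64, outputs should be treated as a baseline rather than production-quality translations.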

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
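
The dataset itself is named at the top of the card. Below is a minimal sketch of loading it, assuming "fr-it" is a valid Helsinki-NLP/opus_books configuration; the train/eval split used for this run is not documented.

```python
# A minimal sketch; the actual preprocessing and split are assumptions.
from datasets import load_dataset

books = load_dataset("Helsinki-NLP/opus_books", "fr-it")
# opus_books ships a single "train" split of aligned sentence pairs, e.g.:
# {"id": "0", "translation": {"fr": "...", "it": "..."}}
print(books["train"][0])
```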

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction in code follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
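
A sketch reconstructing these settings as Seq2SeqTrainingArguments; the output_dir is hypothetical and the original training script is not included in the card, so this is an approximation rather than the exact configuration used.

```python
# A sketch reconstructing the listed hyperparameters; output_dir is
# hypothetical, and per-device batch size 8 on 4 GPUs yields the total
# train/eval batch size of 32 reported above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-small-opus-books-fr-it",  # hypothetical
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # required to compute BLEU during evaluation
)
```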

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime (s) | BLEU   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-----------------:|:------:|
| No log        | 0     | 0     | 17.0743         | 0         | 5.5982            | 0.2047 |
| No log        | 1     | 367   | 16.4893         | 0.0078    | 6.0438            | 0.2076 |
| No log        | 2     | 734   | 15.6521         | 0.0156    | 6.8942            | 0.2252 |
| No log        | 3     | 1101  | 14.1151         | 0.0312    | 8.2651            | 0.2289 |
| No log        | 4     | 1468  | 11.6662         | 0.0625    | 9.6650            | 0.2161 |
| 0.7599        | 5     | 1835  | 8.2824          | 0.125     | 12.6107           | 0.2983 |
| 8.6202        | 6     | 2202  | 5.2964          | 0.25      | 19.1086           | 0.7094 |
| 6.2023        | 7     | 2569  | 4.3212          | 0.5       | 31.9172           | 2.9934 |
| 5.0974        | 8     | 2936  | 3.8220          | 1.0       | 57.6145           | 2.1554 |
| 4.6036        | 9     | 3303  | 3.5523          | 1.0       | 57.8272           | 2.6853 |
| 4.3248        | 10    | 3670  | 3.4210          | 1.0       | 59.7893           | 3.0457 |
| 4.1866        | 11    | 4037  | 3.3420          | 1.0       | 58.6285           | 3.2571 |
| 4.0657        | 12    | 4404  | 3.2866          | 1.0       | 57.6160           | 3.4933 |
| 4.0059        | 13    | 4771  | 3.2401          | 1.0       | 58.4404           | 3.6619 |
| 3.8718        | 14    | 5138  | 3.2015          | 1.0       | 57.9625           | 3.8115 |
| 3.8261        | 15    | 5505  | 3.1692          | 1.0       | 58.4455           | 3.9516 |
| 3.7285        | 16    | 5872  | 3.1352          | 1.0       | 59.0941           | 4.0884 |
| 3.6943        | 17    | 6239  | 3.1219          | 1.0       | 58.3599           | 4.1323 |
| 3.6541        | 18    | 6606  | 3.0931          | 1.0       | 59.2689           | 4.2474 |
| 3.6291        | 19    | 6973  | 3.0716          | 1.0       | 59.4495           | 4.3364 |
| 3.5636        | 20    | 7340  | 3.0412          | 1.0       | 58.0654           | 4.4187 |
| 3.5061        | 21    | 7707  | 3.0389          | 1.0       | 57.7992           | 4.4748 |
| 3.4734        | 22    | 8074  | 3.0219          | 1.0       | 57.9885           | 4.5529 |
| 3.4102        | 23    | 8441  | 3.0044          | 1.0       | 58.9918           | 4.6240 |
| 3.3814        | 24    | 8808  | 2.9803          | 1.0       | 59.7393           | 4.7050 |
| 3.3919        | 25    | 9175  | 2.9830          | 1.0       | 58.7152           | 4.7763 |
| 3.2983        | 26    | 9542  | 2.9674          | 1.0       | 58.7490           | 4.8029 |
| 3.2863        | 27    | 9909  | 2.9622          | 1.0       | 59.5547           | 4.8354 |
| 3.2594        | 28    | 10276 | 2.9418          | 1.0       | 59.9988           | 4.8734 |
| 3.2504        | 29    | 10643 | 2.9263          | 1.0       | 59.5368           | 4.9443 |
| 3.2417        | 30    | 11010 | 2.9301          | 1.0       | 58.6146           | 4.9528 |
| 3.1811        | 31    | 11377 | 2.9141          | 1.0       | 58.3594           | 5.0152 |
| 3.1766        | 32    | 11744 | 2.9059          | 1.0       | 58.9723           | 5.0389 |
| 3.1313        | 33    | 12111 | 2.8911          | 1.0       | 58.4011           | 5.0914 |
| 3.1139        | 34    | 12478 | 2.8956          | 1.0       | 59.8377           | 5.1396 |
| 3.0672        | 35    | 12845 | 2.8871          | 1.0       | 59.5639           | 5.1947 |
| 3.0823        | 36    | 13212 | 2.8757          | 1.0       | 58.4524           | 5.2140 |
| 3.0490        | 37    | 13579 | 2.8646          | 1.0       | 58.3569           | 5.2484 |
| 3.0445        | 38    | 13946 | 2.8729          | 1.0       | 58.3784           | 5.3178 |
| 3.0336        | 39    | 14313 | 2.8476          | 1.0       | 59.2942           | 5.3525 |
| 2.9965        | 40    | 14680 | 2.8526          | 1.0       | 58.9237           | 5.3645 |
| 2.9969        | 41    | 15047 | 2.8385          | 1.0       | 58.7512           | 5.3852 |
| 2.9535        | 42    | 15414 | 2.8423          | 1.0       | 58.8361           | 5.4229 |
| 2.9320        | 43    | 15781 | 2.8336          | 1.0       | 58.5391           | 5.4622 |
| 2.9230        | 44    | 16148 | 2.8279          | 1.0       | 59.8299           | 5.4951 |
| 2.9285        | 45    | 16515 | 2.8244          | 1.0       | 59.5493           | 5.5138 |
| 2.9258        | 46    | 16882 | 2.8144          | 1.0       | 58.9045           | 5.5251 |
| 2.8831        | 47    | 17249 | 2.8164          | 1.0       | 59.3035           | 5.5434 |
| 2.8739        | 48    | 17616 | 2.8114          | 1.0       | 58.5114           | 5.6106 |
| 2.8258        | 49    | 17983 | 2.8138          | 1.0       | 60.7856           | 5.6095 |
| 2.8659        | 50    | 18350 | 2.7937          | 1.0       | 58.5370           | 5.6361 |
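
BLEU values like those above are conventionally computed with SacreBLEU. The snippet below is a sketch of that metric call using the evaluate library, not the card's actual evaluation code.

```python
# A sketch of a SacreBLEU-style corpus BLEU computation (an assumption;
# the evaluation code behind the table is not included in this card).
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Ciao, come stai?"]  # model outputs (Italian)
references = [["Ciao, come va?"]]   # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```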

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1