8dd8934fb108f95a911b5e26447ebb59

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [es-it] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4803
  • Data Size: 1.0 (fraction of the training set in use)
  • Epoch Runtime: 166.0883 s
  • Bleu: 5.5138
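
A minimal inference sketch, assuming the checkpoint is published under the repository id from the original card (contemmcm/8dd8934fb108f95a911b5e26447ebb59). Depending on how inputs were preprocessed during fine-tuning, a source-side task prefix may or may not be required:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repository id taken from the card; treat it as an assumption and adjust
# if the checkpoint lives elsewhere.
repo_id = "contemmcm/8dd8934fb108f95a911b5e26447ebb59"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# Spanish -> Italian. If the model was trained with a task prefix,
# prepend it here (e.g. "translate Spanish to Italian: ").
inputs = tokenizer("La vida es sueño.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```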

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
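
The card does name the training corpus: Helsinki-NLP/opus_books with the es-it configuration. A minimal loading sketch (the split used as the evaluation set above is not documented):

```python
from datasets import load_dataset

# opus_books ships a single "train" split; the evaluation set for this
# card would have been carved out separately (exact split unknown).
dataset = load_dataset("Helsinki-NLP/opus_books", "es-it")
print(dataset["train"][0])
# e.g. {'id': '0', 'translation': {'es': '...', 'it': '...'}}
```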

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
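
A sketch of how these settings map onto Seq2SeqTrainingArguments; output_dir is a placeholder, and the multi-GPU launch (torchrun/accelerate) is assumed rather than documented:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-es-it",  # placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed to compute BLEU at eval time
)
```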

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:----:|
| No log | 0 | 0 | 11.8845 | 0 | 13.7090 | 0.2804 |
| No log | 1 | 721 | 11.6500 | 0.0078 | 16.0695 | 0.2535 |
| No log | 2 | 1442 | 11.2228 | 0.0156 | 17.5692 | 0.2833 |
| 0.3237 | 3 | 2163 | 10.3332 | 0.0312 | 21.3931 | 0.2913 |
| 1.0012 | 4 | 2884 | 8.5047 | 0.0625 | 26.5733 | 0.3455 |
| 8.5143 | 5 | 3605 | 4.9032 | 0.125 | 36.3316 | 1.2113 |
| 4.7327 | 6 | 4326 | 3.3371 | 0.25 | 53.9259 | 2.4238 |
| 3.8862 | 7 | 5047 | 3.0048 | 0.5 | 92.5966 | 3.2504 |
| 3.4960 | 8 | 5768 | 2.8315 | 1.0 | 167.7369 | 3.7882 |
| 3.2634 | 9 | 6489 | 2.7450 | 1.0 | 167.3899 | 4.1156 |
| 3.1612 | 10 | 7210 | 2.6941 | 1.0 | 169.2978 | 4.3139 |
| 3.0244 | 11 | 7931 | 2.6593 | 1.0 | 167.3909 | 4.4523 |
| 2.9753 | 12 | 8652 | 2.6200 | 1.0 | 167.1947 | 4.6066 |
| 2.9187 | 13 | 9373 | 2.6001 | 1.0 | 167.9997 | 4.7045 |
| 2.8032 | 14 | 10094 | 2.5692 | 1.0 | 166.7811 | 4.8149 |
| 2.7953 | 15 | 10815 | 2.5495 | 1.0 | 167.2644 | 4.8632 |
| 2.7074 | 16 | 11536 | 2.5413 | 1.0 | 171.8496 | 4.9945 |
| 2.6703 | 17 | 12257 | 2.5240 | 1.0 | 169.8680 | 5.0501 |
| 2.6470 | 18 | 12978 | 2.5165 | 1.0 | 168.5366 | 5.1315 |
| 2.5870 | 19 | 13699 | 2.5175 | 1.0 | 168.9415 | 5.1387 |
| 2.5565 | 20 | 14420 | 2.4971 | 1.0 | 168.2197 | 5.2374 |
| 2.5245 | 21 | 15141 | 2.4939 | 1.0 | 167.6366 | 5.2264 |
| 2.4804 | 22 | 15862 | 2.4823 | 1.0 | 167.6528 | 5.2776 |
| 2.3955 | 23 | 16583 | 2.4925 | 1.0 | 170.3214 | 5.3348 |
| 2.3861 | 24 | 17304 | 2.4809 | 1.0 | 169.3975 | 5.3402 |
| 2.3862 | 25 | 18025 | 2.4828 | 1.0 | 168.1509 | 5.3585 |
| 2.3627 | 26 | 18746 | 2.4789 | 1.0 | 167.9541 | 5.4241 |
| 2.3147 | 27 | 19467 | 2.4747 | 1.0 | 168.6779 | 5.4486 |
| 2.2717 | 28 | 20188 | 2.4815 | 1.0 | 169.7344 | 5.4859 |
| 2.2451 | 29 | 20909 | 2.4765 | 1.0 | 168.6457 | 5.4487 |
| 2.2161 | 30 | 21630 | 2.4774 | 1.0 | 166.5616 | 5.5141 |
| 2.2003 | 31 | 22351 | 2.4803 | 1.0 | 166.0883 | 5.5138 |
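
The Bleu column is a corpus-level BLEU score on the evaluation set. The exact metric implementation is not stated in the card; a typical setup with the evaluate library looks like this:

```python
import evaluate

# sacreBLEU takes plain-string predictions and a list of reference lists.
bleu = evaluate.load("sacrebleu")
predictions = ["Il gatto è sul tavolo."]
references = [["Il gatto è sopra il tavolo."]]
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus BLEU, same scale as the table above
```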

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1