674ba5b183236fdd2cf7966ac4fe5b3f

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fi-pl] dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0699
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 19.6051 s
  • Bleu: 1.9246
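For reference, below is a minimal sketch of running the checkpoint for Finnish-to-Polish translation with the standard transformers API. The repo id is taken from this card's page; the generation settings, and whether the fine-tuning expected a task prefix, are assumptions.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id as listed on this card; settings below are illustrative.
model_id = "contemmcm/674ba5b183236fdd2cf7966ac4fe5b3f"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Hyvää huomenta."  # Finnish input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # Polish output
```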

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
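Absent further detail, a minimal sketch of loading the dataset named in the introduction with the datasets library is shown below. The fi-pl config of opus_books ships only a train split, so the 90/10 validation carve-out and the seed here are assumptions, not the card authors' actual split.

```python
from datasets import load_dataset

# "fi-pl" is the language-pair config named in this card's intro.
dataset = load_dataset("Helsinki-NLP/opus_books", "fi-pl", split="train")

# Assumed 90/10 split; the actual evaluation split is undocumented.
dataset = dataset.train_test_split(test_size=0.1, seed=42)
print(dataset["train"][0]["translation"])  # {'fi': '...', 'pl': '...'}
```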

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
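A Seq2SeqTrainingArguments sketch mirroring the values above is given below. It is a reconstruction of the configuration, not the original training script; the output_dir name and the predict_with_generate flag are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fi-pl",  # hypothetical directory name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    num_train_epochs=50,
    lr_scheduler_type="constant",
    optim="adamw_torch",
    predict_with_generate=True,      # generate during eval so BLEU can be scored
)
```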

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:----:|
| No log  | 0    | 0    | 13.4463 | 0      | 2.0460  | 0.2623 |
| No log  | 1    | 70   | 13.4276 | 0.0078 | 2.3714  | 0.2764 |
| No log  | 2    | 140  | 13.4302 | 0.0156 | 3.0352  | 0.2400 |
| No log  | 3    | 210  | 13.3020 | 0.0312 | 4.9146  | 0.2416 |
| No log  | 4    | 280  | 13.0772 | 0.0625 | 5.9496  | 0.2442 |
| No log  | 5    | 350  | 12.9450 | 0.125  | 8.7094  | 0.3009 |
| No log  | 6    | 420  | 12.7245 | 0.25   | 11.1272 | 0.2844 |
| 2.6786  | 7    | 490  | 11.5403 | 0.5    | 14.2430 | 0.2935 |
| 13.5007 | 8.0  | 560  | 9.3822  | 1.0    | 21.9457 | 0.3628 |
| 11.0656 | 9.0  | 630  | 8.6960  | 1.0    | 19.5970 | 0.4271 |
| 8.5562  | 10.0 | 700  | 7.1361  | 1.0    | 19.0858 | 0.4143 |
| 6.8672  | 11.0 | 770  | 4.3781  | 1.0    | 19.6462 | 0.4305 |
| 5.5247  | 12.0 | 840  | 3.8241  | 1.0    | 19.9845 | 0.5213 |
| 4.7052  | 13.0 | 910  | 3.5930  | 1.0    | 18.4228 | 0.7018 |
| 4.4858  | 14.0 | 980  | 3.4864  | 1.0    | 20.1682 | 0.8099 |
| 4.2647  | 15.0 | 1050 | 3.4047  | 1.0    | 20.1060 | 0.8557 |
| 4.0957  | 16.0 | 1120 | 3.3180  | 1.0    | 20.0729 | 1.0107 |
| 4.0305  | 17.0 | 1190 | 3.2709  | 1.0    | 21.0513 | 1.0446 |
| 3.8461  | 18.0 | 1260 | 3.2278  | 1.0    | 18.5623 | 1.1807 |
| 3.7684  | 19.0 | 1330 | 3.1887  | 1.0    | 19.2361 | 1.3692 |
| 3.6076  | 20.0 | 1400 | 3.1652  | 1.0    | 19.8112 | 1.3995 |
| 3.5077  | 21.0 | 1470 | 3.1518  | 1.0    | 19.9357 | 1.5036 |
| 3.4783  | 22.0 | 1540 | 3.1240  | 1.0    | 18.8514 | 1.5004 |
| 3.3508  | 23.0 | 1610 | 3.1093  | 1.0    | 19.0167 | 1.7224 |
| 3.305   | 24.0 | 1680 | 3.0961  | 1.0    | 19.7953 | 1.6773 |
| 3.225   | 25.0 | 1750 | 3.0863  | 1.0    | 20.9737 | 1.8081 |
| 3.148   | 26.0 | 1820 | 3.0770  | 1.0    | 21.8481 | 1.8188 |
| 3.1231  | 27.0 | 1890 | 3.0741  | 1.0    | 18.5452 | 1.8006 |
| 3.0251  | 28.0 | 1960 | 3.0686  | 1.0    | 19.7075 | 1.7927 |
| 3.0045  | 29.0 | 2030 | 3.0663  | 1.0    | 19.9386 | 1.9342 |
| 2.9267  | 30.0 | 2100 | 3.0696  | 1.0    | 20.1962 | 1.8982 |
| 2.8796  | 31.0 | 2170 | 3.0672  | 1.0    | 20.7287 | 1.8818 |
| 2.8749  | 32.0 | 2240 | 3.0753  | 1.0    | 19.1134 | 1.9539 |
| 2.8033  | 33.0 | 2310 | 3.0699  | 1.0    | 19.6051 | 1.9246 |
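The Bleu column could be reproduced with the evaluate library. The sketch below assumes the sacrebleu backend (score scale 0-100), since the exact metric setup is not documented in this card, and uses hypothetical strings in place of decoded eval-set outputs.

```python
import evaluate

# sacrebleu backend is an assumption about how Bleu was computed.
bleu = evaluate.load("sacrebleu")

predictions = ["Dzień dobry wszystkim."]   # hypothetical model outputs
references = [["Dzień dobry wszystkim."]]  # one reference list per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```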

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1