674ba5b183236fdd2cf7966ac4fe5b3f

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fi-pl] dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0699
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 19.6051 s
  • Bleu: 1.9246
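For reference, below is a minimal sketch of running the checkpoint for Finnish-to-Polish translation with the standard transformers API. The repo id is taken from this card's page; the generation settings, and whether the fine-tuning expected a task prefix, are assumptions.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id as listed on this card; settings below are illustrative.
model_id = "contemmcm/674ba5b183236fdd2cf7966ac4fe5b3f"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Hyvää huomenta."  # Finnish input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # Polish output
```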

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
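Absent further detail, a minimal sketch of loading the dataset named in the introduction with the datasets library is shown below. The fi-pl config of opus_books ships only a train split, so the 90/10 validation carve-out and the seed here are assumptions, not the card authors' actual split.

```python
from datasets import load_dataset

# "fi-pl" is the language-pair config named in this card's intro.
dataset = load_dataset("Helsinki-NLP/opus_books", "fi-pl", split="train")

# Assumed 90/10 split; the actual evaluation split is undocumented.
dataset = dataset.train_test_split(test_size=0.1, seed=42)
print(dataset["train"][0]["translation"])  # {'fi': '...', 'pl': '...'}
```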

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
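A Seq2SeqTrainingArguments sketch mirroring the values above is given below. It is a reconstruction of the configuration, not the original training script; the output_dir name and the predict_with_generate flag are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fi-pl",  # hypothetical directory name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    num_train_epochs=50,
    lr_scheduler_type="constant",
    optim="adamw_torch",
    predict_with_generate=True,      # generate during eval so BLEU can be scored
)
```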

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:----:|
| No log  | 0    | 0    | 13.4463 | 0      | 2.0460  | 0.2623 |
| No log  | 1    | 70   | 13.4276 | 0.0078 | 2.3714  | 0.2764 |
| No log  | 2    | 140  | 13.4302 | 0.0156 | 3.0352  | 0.2400 |
| No log  | 3    | 210  | 13.3020 | 0.0312 | 4.9146  | 0.2416 |
| No log  | 4    | 280  | 13.0772 | 0.0625 | 5.9496  | 0.2442 |
| No log  | 5    | 350  | 12.9450 | 0.125  | 8.7094  | 0.3009 |
| No log  | 6    | 420  | 12.7245 | 0.25   | 11.1272 | 0.2844 |
| 2.6786  | 7    | 490  | 11.5403 | 0.5    | 14.2430 | 0.2935 |
| 13.5007 | 8.0  | 560  | 9.3822  | 1.0    | 21.9457 | 0.3628 |
| 11.0656 | 9.0  | 630  | 8.6960  | 1.0    | 19.5970 | 0.4271 |
| 8.5562  | 10.0 | 700  | 7.1361  | 1.0    | 19.0858 | 0.4143 |
| 6.8672  | 11.0 | 770  | 4.3781  | 1.0    | 19.6462 | 0.4305 |
| 5.5247  | 12.0 | 840  | 3.8241  | 1.0    | 19.9845 | 0.5213 |
| 4.7052  | 13.0 | 910  | 3.5930  | 1.0    | 18.4228 | 0.7018 |
| 4.4858  | 14.0 | 980  | 3.4864  | 1.0    | 20.1682 | 0.8099 |
| 4.2647  | 15.0 | 1050 | 3.4047  | 1.0    | 20.1060 | 0.8557 |
| 4.0957  | 16.0 | 1120 | 3.3180  | 1.0    | 20.0729 | 1.0107 |
| 4.0305  | 17.0 | 1190 | 3.2709  | 1.0    | 21.0513 | 1.0446 |
| 3.8461  | 18.0 | 1260 | 3.2278  | 1.0    | 18.5623 | 1.1807 |
| 3.7684  | 19.0 | 1330 | 3.1887  | 1.0    | 19.2361 | 1.3692 |
| 3.6076  | 20.0 | 1400 | 3.1652  | 1.0    | 19.8112 | 1.3995 |
| 3.5077  | 21.0 | 1470 | 3.1518  | 1.0    | 19.9357 | 1.5036 |
| 3.4783  | 22.0 | 1540 | 3.1240  | 1.0    | 18.8514 | 1.5004 |
| 3.3508  | 23.0 | 1610 | 3.1093  | 1.0    | 19.0167 | 1.7224 |
| 3.305   | 24.0 | 1680 | 3.0961  | 1.0    | 19.7953 | 1.6773 |
| 3.225   | 25.0 | 1750 | 3.0863  | 1.0    | 20.9737 | 1.8081 |
| 3.148   | 26.0 | 1820 | 3.0770  | 1.0    | 21.8481 | 1.8188 |
| 3.1231  | 27.0 | 1890 | 3.0741  | 1.0    | 18.5452 | 1.8006 |
| 3.0251  | 28.0 | 1960 | 3.0686  | 1.0    | 19.7075 | 1.7927 |
| 3.0045  | 29.0 | 2030 | 3.0663  | 1.0    | 19.9386 | 1.9342 |
| 2.9267  | 30.0 | 2100 | 3.0696  | 1.0    | 20.1962 | 1.8982 |
| 2.8796  | 31.0 | 2170 | 3.0672  | 1.0    | 20.7287 | 1.8818 |
| 2.8749  | 32.0 | 2240 | 3.0753  | 1.0    | 19.1134 | 1.9539 |
| 2.8033  | 33.0 | 2310 | 3.0699  | 1.0    | 19.6051 | 1.9246 |
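The Bleu column could be reproduced with the evaluate library. The sketch below assumes the sacrebleu backend (score scale 0-100), since the exact metric setup is not documented in this card, and uses hypothetical strings in place of decoded eval-set outputs.

```python
import evaluate

# sacrebleu backend is an assumption about how Bleu was computed.
bleu = evaluate.load("sacrebleu")

predictions = ["Dzień dobry wszystkim."]   # hypothetical model outputs
references = [["Dzień dobry wszystkim."]]  # one reference list per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```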

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1