ea05fa2eaa513be6b8616681bf1f4b18

This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [it-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0751
  • Data Size: 1.0
  • Epoch Runtime: 7.9126
  • BLEU: 5.0171

Model description

More information needed

Intended uses & limitations

More information needed
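Pending fuller documentation, the checkpoint is a sequence-to-sequence translation model (Italian → Portuguese) and loads with the standard transformers API. Below is a minimal inference sketch; the model id matches this repository, but generation settings are illustrative, and whether the training script used a task prefix is not documented (none is assumed here).

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/ea05fa2eaa513be6b8616681bf1f4b18"  # this checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Italian input; the model was fine-tuned on opus_books it -> pt.
inputs = tokenizer("Il gatto dorme sul divano.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```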

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
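
For reference, these settings map onto transformers' Seq2SeqTrainingArguments roughly as follows. This is a sketch reconstructed from the list above, not the exact training script; `output_dir` and `predict_with_generate` are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-small-opus-books-it-pt",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumption: generate during eval to score BLEU
)
```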

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | BLEU   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 15.6078         | 0         | 1.1988        | 0.6907 |
| No log        | 1     | 29   | 15.6218         | 0.0078    | 1.6269        | 0.6408 |
| No log        | 2     | 58   | 15.5310         | 0.0156    | 1.5854        | 0.6959 |
| No log        | 3     | 87   | 15.4838         | 0.0312    | 2.2227        | 0.8497 |
| No log        | 4     | 116  | 15.2081         | 0.0625    | 3.0132        | 0.8471 |
| No log        | 5     | 145  | 14.8742         | 0.125     | 3.2363        | 0.7166 |
| 1.8452        | 6     | 174  | 14.2933         | 0.25      | 3.6817        | 0.7972 |
| 1.8452        | 7     | 203  | 13.6700         | 0.5       | 4.7085        | 1.0098 |
| 1.8452        | 8     | 232  | 11.5518         | 1.0       | 7.0843        | 0.8746 |
| 9.9364        | 9     | 261  | 9.3013          | 1.0       | 6.5391        | 1.0598 |
| 9.9364        | 10    | 290  | 8.0109          | 1.0       | 7.1036        | 0.6155 |
| 11.2092       | 11    | 319  | 7.1456          | 1.0       | 7.0805        | 0.2939 |
| 11.2092       | 12    | 348  | 6.3537          | 1.0       | 7.0161        | 0.4840 |
| 8.9495        | 13    | 377  | 5.5768          | 1.0       | 6.9226        | 0.8585 |
| 7.4992        | 14    | 406  | 4.8268          | 1.0       | 7.7913        | 1.7003 |
| 7.4992        | 15    | 435  | 4.5998          | 1.0       | 6.3330        | 1.8951 |
| 6.569         | 16    | 464  | 4.4416          | 1.0       | 6.2419        | 3.5129 |
| 6.569         | 17    | 493  | 4.2977          | 1.0       | 6.3002        | 4.0755 |
| 5.9834        | 18    | 522  | 4.1932          | 1.0       | 6.2328        | 4.2095 |
| 5.577         | 19    | 551  | 4.1049          | 1.0       | 6.6787        | 4.8945 |
| 5.577         | 20    | 580  | 3.9961          | 1.0       | 7.1591        | 5.5966 |
| 5.2478        | 21    | 609  | 3.9227          | 1.0       | 6.9716        | 5.5090 |
| 5.2478        | 22    | 638  | 3.8466          | 1.0       | 6.9762        | 2.8500 |
| 5.0506        | 23    | 667  | 3.7808          | 1.0       | 7.3405        | 1.9223 |
| 5.0506        | 24    | 696  | 3.7155          | 1.0       | 7.3591        | 1.9233 |
| 4.8288        | 25    | 725  | 3.6592          | 1.0       | 7.5057        | 1.9852 |
| 4.6994        | 26    | 754  | 3.6129          | 1.0       | 7.5295        | 2.0804 |
| 4.6994        | 27    | 783  | 3.5576          | 1.0       | 8.2205        | 2.1495 |
| 4.5215        | 28    | 812  | 3.5128          | 1.0       | 8.2718        | 1.8446 |
| 4.5215        | 29    | 841  | 3.4721          | 1.0       | 5.9138        | 1.4464 |
| 4.3748        | 30    | 870  | 3.4289          | 1.0       | 6.4400        | 1.4866 |
| 4.3748        | 31    | 899  | 3.3928          | 1.0       | 6.4747        | 1.5312 |
| 4.2797        | 32    | 928  | 3.3596          | 1.0       | 6.4966        | 1.6077 |
| 4.1603        | 33    | 957  | 3.3198          | 1.0       | 6.9912        | 3.7955 |
| 4.1603        | 34    | 986  | 3.2889          | 1.0       | 6.8828        | 8.8411 |
| 4.0625        | 35    | 1015 | 3.2699          | 1.0       | 6.8684        | 9.3503 |
| 4.0625        | 36    | 1044 | 3.2435          | 1.0       | 6.8756        | 9.8702 |
| 3.9918        | 37    | 1073 | 3.2256          | 1.0       | 8.1790        | 9.7596 |
| 3.8825        | 38    | 1102 | 3.2000          | 1.0       | 7.6479        | 6.8208 |
| 3.8825        | 39    | 1131 | 3.1885          | 1.0       | 7.6541        | 5.3339 |
| 3.8346        | 40    | 1160 | 3.1632          | 1.0       | 7.7772        | 4.7116 |
| 3.8346        | 41    | 1189 | 3.1545          | 1.0       | 8.2865        | 4.8104 |
| 3.7586        | 42    | 1218 | 3.1442          | 1.0       | 8.4696        | 4.8509 |
| 3.7586        | 43    | 1247 | 3.1320          | 1.0       | 8.7876        | 4.8334 |
| 3.7034        | 44    | 1276 | 3.1273          | 1.0       | 6.1118        | 4.7618 |
| 3.644         | 45    | 1305 | 3.1205          | 1.0       | 6.5133        | 4.8357 |
| 3.644         | 46    | 1334 | 3.1047          | 1.0       | 6.8835        | 4.8843 |
| 3.5889        | 47    | 1363 | 3.0944          | 1.0       | 6.8275        | 4.9592 |
| 3.5889        | 48    | 1392 | 3.0843          | 1.0       | 6.7748        | 5.0365 |
| 3.5501        | 49    | 1421 | 3.0765          | 1.0       | 7.1142        | 5.0577 |
| 3.4686        | 50    | 1450 | 3.0751          | 1.0       | 7.9126        | 5.0171 |
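
The BLEU column tracks translation quality on the evaluation set at the end of each epoch. A minimal sketch of computing the same metric with the evaluate library's sacrebleu wrapper follows; the sentences are placeholders, not data from this run, and whether the training script used exactly this wrapper is an assumption.

```python
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["O gato dorme no sofá."]           # decoded model outputs (placeholder)
references = [["O gato está dormindo no sofá."]]  # one reference list per prediction (placeholder)
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus-level BLEU, same scale as the table above
```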

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1