eb96ab762c7ec109e9a637d60f7f8908

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [it-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5571
  • Data Size: 1.0
  • Epoch Runtime: 12.1514 s
  • BLEU: 7.4817
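
The card includes no usage example, so the following is a minimal inference sketch, not an official snippet. It assumes the checkpoint is published under the repo id contemmcm/eb96ab762c7ec109e9a637d60f7f8908 (from the model tree) and that fine-tuning fed plain Italian source sentences to the encoder with no task prefix; if the training script used a source prefix, prepend it to the input.

```python
# Minimal inference sketch (assumptions: repo id and no source prefix).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/eb96ab762c7ec109e9a637d60f7f8908"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Italian -> Portuguese, matching the opus_books [it-pt] fine-tuning data.
inputs = tokenizer("Il gatto dorme sul divano.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```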

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
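
For reference, the list above maps roughly onto transformers' Seq2SeqTrainingArguments as sketched below. The output_dir is a placeholder and predict_with_generate is an assumption (it is typically required to compute BLEU during evaluation); the totals of 32 follow from the per-device batch size of 8 across 4 GPUs.

```python
# Hypothetical reconstruction of the hyperparameters listed above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-it-pt",  # placeholder, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,   # x 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",            # AdamW with betas=(0.9, 0.999), epsilon=1e-08
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,     # assumption: needed for BLEU during eval
)
```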

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:------:|
| No log        | 0     | 0    | 11.6380         | 0         | 1.3179            | 0.6688 |
| No log        | 1     | 29   | 11.6224         | 0.0078    | 1.6882            | 0.7427 |
| No log        | 2     | 58   | 11.5177         | 0.0156    | 2.2360            | 0.6516 |
| No log        | 3     | 87   | 11.7234         | 0.0312    | 2.8411            | 0.7009 |
| No log        | 4     | 116  | 11.3968         | 0.0625    | 3.4966            | 0.7370 |
| No log        | 5     | 145  | 11.3497         | 0.125     | 3.9657            | 0.4774 |
| 1.7156        | 6     | 174  | 10.9542         | 0.25      | 5.5267            | 0.6996 |
| 1.7156        | 7     | 203  | 11.0380         | 0.5       | 7.5773            | 0.6731 |
| 1.7156        | 8     | 232  | 9.1387          | 1.0       | 11.6909           | 0.9107 |
| 9.7807        | 9     | 261  | 8.0497          | 1.0       | 11.8910           | 0.8087 |
| 9.7807        | 10    | 290  | 7.0794          | 1.0       | 13.0532           | 0.4845 |
| 10.6168       | 11    | 319  | 6.2391          | 1.0       | 10.1422           | 0.4629 |
| 10.6168       | 12    | 348  | 4.6299          | 1.0       | 10.3170           | 1.6175 |
| 7.3694        | 13    | 377  | 3.9073          | 1.0       | 10.8678           | 3.5145 |
| 5.0901        | 14    | 406  | 3.5140          | 1.0       | 12.0403           | 6.2229 |
| 5.0901        | 15    | 435  | 3.2423          | 1.0       | 12.1298           | 6.7949 |
| 4.3639        | 16    | 464  | 3.0647          | 1.0       | 12.8197           | 4.6863 |
| 4.3639        | 17    | 493  | 2.9499          | 1.0       | 13.1804           | 5.1550 |
| 3.9547        | 18    | 522  | 2.8692          | 1.0       | 9.7197            | 5.5632 |
| 3.6322        | 19    | 551  | 2.8151          | 1.0       | 9.4106            | 5.8590 |
| 3.6322        | 20    | 580  | 2.7593          | 1.0       | 10.5028           | 6.1529 |
| 3.3638        | 21    | 609  | 2.7053          | 1.0       | 10.6758           | 6.3736 |
| 3.3638        | 22    | 638  | 2.6837          | 1.0       | 11.2459           | 6.4486 |
| 3.1785        | 23    | 667  | 2.6556          | 1.0       | 11.2236           | 6.4821 |
| 3.1785        | 24    | 696  | 2.6270          | 1.0       | 11.6886           | 6.6401 |
| 3.0074        | 25    | 725  | 2.6063          | 1.0       | 11.7180           | 6.7628 |
| 2.8686        | 26    | 754  | 2.5905          | 1.0       | 9.8356            | 6.8415 |
| 2.8686        | 27    | 783  | 2.5792          | 1.0       | 10.1183           | 6.9304 |
| 2.7418        | 28    | 812  | 2.5736          | 1.0       | 10.2673           | 7.0236 |
| 2.7418        | 29    | 841  | 2.5536          | 1.0       | 10.7104           | 7.0489 |
| 2.6142        | 30    | 870  | 2.5487          | 1.0       | 11.2792           | 7.1639 |
| 2.6142        | 31    | 899  | 2.5396          | 1.0       | 11.2350           | 7.1134 |
| 2.5155        | 32    | 928  | 2.5422          | 1.0       | 11.5853           | 7.2230 |
| 2.4294        | 33    | 957  | 2.5336          | 1.0       | 12.6890           | 7.2554 |
| 2.4294        | 34    | 986  | 2.5357          | 1.0       | 9.4938            | 7.3031 |
| 2.332         | 35    | 1015 | 2.5287          | 1.0       | 9.4530            | 7.3458 |
| 2.332         | 36    | 1044 | 2.5396          | 1.0       | 9.9192            | 7.4489 |
| 2.2852        | 37    | 1073 | 2.5445          | 1.0       | 10.7865           | 7.6072 |
| 2.1983        | 38    | 1102 | 2.5372          | 1.0       | 11.6134           | 7.4945 |
| 2.1983        | 39    | 1131 | 2.5571          | 1.0       | 12.1514           | 7.4817 |
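
The BLEU column is presumably corpus-level BLEU on sacrebleu's 0-100 scale, given the magnitudes above. Below is a minimal sketch of how such scores are typically computed with the evaluate library; the sentences are illustrative only.

```python
# Sketch: corpus BLEU via sacrebleu, as commonly used in translation fine-tuning scripts.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["O gato dorme no sofá."]           # decoded model outputs (illustrative)
references = [["O gato está dormindo no sofá."]]  # one list of references per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # reported on a 0-100 scale
```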

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
