5161859226dc3ff6ffdf0fbcb1427ed5

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [es-no] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6158
  • Data Size: 1.0 (fraction of the training data in use)
  • Epoch Runtime: 25.0782 seconds
  • Bleu: 5.5614
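
The checkpoint can be loaded with the standard Transformers seq2seq classes. The sketch below is a minimal example: the repo id is taken from this card, while the assumption that the model expects a plain Spanish sentence with no task prefix is not documented here.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repo id from this card; input format (plain sentence, no prefix) is assumed.
model_id = "contemmcm/5161859226dc3ff6ffdf0fbcb1427ed5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("¿Dónde está la biblioteca?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```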

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
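
Absent further documentation, the dataset named at the top of this card can be loaded as a standard translation dataset. The sketch below assumes the es-no configuration and the usual opus_books field layout (a "translation" dict keyed by language code), neither of which is confirmed here.

```python
from datasets import load_dataset

# Spanish-Norwegian config of opus_books; the field layout is assumed
# from the opus_books dataset card, not from this model card.
dataset = load_dataset("Helsinki-NLP/opus_books", "es-no")
print(dataset["train"][0]["translation"])  # {'es': '...', 'no': '...'}
```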

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
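
As a rough guide, the hyperparameters above map onto Seq2SeqTrainingArguments as sketched below. The output directory, predict_with_generate, and the 4-GPU launch (e.g. via torchrun, which yields the total batch size of 32) are assumptions, not taken from this card.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-es-no",  # assumed, illustrative name
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumed; needed to score BLEU at eval time
)
```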

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu   |
|---------------|-------|------|-----------------|-----------|-------------------|--------|
| No log        | 0     | 0    | 13.4736         | 0         | 2.4774            | 0.0490 |
| No log        | 1     | 89   | 13.4180         | 0.0078    | 3.0967            | 0.0449 |
| No log        | 2     | 178  | 13.4598         | 0.0156    | 3.4171            | 0.0455 |
| No log        | 3     | 267  | 13.5142         | 0.0312    | 4.7365            | 0.0499 |
| No log        | 4     | 356  | 12.4614         | 0.0625    | 6.2108            | 0.0586 |
| No log        | 5     | 445  | 12.2575         | 0.125     | 7.7312            | 0.0415 |
| 1.2122        | 6     | 534  | 11.8622         | 0.25      | 11.3834           | 0.0697 |
| 5.6492        | 7     | 623  | 10.4287         | 0.5       | 15.5494           | 0.0763 |
| 9.7418        | 8     | 712  | 5.9765          | 1.0       | 25.2768           | 0.3937 |
| 5.6579        | 9     | 801  | 3.8992          | 1.0       | 23.8734           | 3.0899 |
| 5.0175        | 10    | 890  | 3.2819          | 1.0       | 25.0241           | 2.2056 |
| 4.2889        | 11    | 979  | 3.0612          | 1.0       | 26.2374           | 3.1157 |
| 3.8859        | 12    | 1068 | 2.9289          | 1.0       | 24.4179           | 3.5846 |
| 3.6268        | 13    | 1157 | 2.8582          | 1.0       | 24.0828           | 3.9279 |
| 3.5310        | 14    | 1246 | 2.8099          | 1.0       | 24.2889           | 4.1136 |
| 3.3838        | 15    | 1335 | 2.7777          | 1.0       | 23.4939           | 4.3126 |
| 3.2203        | 16    | 1424 | 2.7359          | 1.0       | 24.5260           | 4.5416 |
| 3.1463        | 17    | 1513 | 2.7232          | 1.0       | 24.1557           | 4.4583 |
| 3.0370        | 18    | 1602 | 2.6943          | 1.0       | 23.9128           | 4.6938 |
| 2.9699        | 19    | 1691 | 2.6611          | 1.0       | 23.5548           | 4.7951 |
| 2.8970        | 20    | 1780 | 2.6484          | 1.0       | 23.8592           | 4.9567 |
| 2.7972        | 21    | 1869 | 2.6338          | 1.0       | 24.4856           | 5.1006 |
| 2.7450        | 22    | 1958 | 2.6370          | 1.0       | 24.3166           | 5.2325 |
| 2.7041        | 23    | 2047 | 2.6280          | 1.0       | 23.9898           | 5.1784 |
| 2.6238        | 24    | 2136 | 2.6149          | 1.0       | 24.5929           | 5.3658 |
| 2.5732        | 25    | 2225 | 2.6170          | 1.0       | 24.2029           | 5.4342 |
| 2.5119        | 26    | 2314 | 2.6038          | 1.0       | 24.9785           | 5.3348 |
| 2.4670        | 27    | 2403 | 2.6113          | 1.0       | 25.0814           | 5.4837 |
| 2.4478        | 28    | 2492 | 2.6164          | 1.0       | 23.2993           | 5.4099 |
| 2.3915        | 29    | 2581 | 2.5982          | 1.0       | 23.9281           | 5.5693 |
| 2.3330        | 30    | 2670 | 2.6042          | 1.0       | 24.8486           | 5.5377 |
| 2.2747        | 31    | 2759 | 2.6044          | 1.0       | 26.2227           | 5.5426 |
| 2.2381        | 32    | 2848 | 2.6109          | 1.0       | 23.9335           | 5.5353 |
| 2.2022        | 33    | 2937 | 2.6158          | 1.0       | 25.0782           | 5.5614 |
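
The Bleu column appears to be on the sacrebleu 0-100 scale. Assuming the metric was computed with the evaluate library (the card does not say), the calculation looks like this, with purely hypothetical predictions and references for illustration:

```python
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Hvor er biblioteket?"]   # hypothetical model outputs
references = [["Hvor er biblioteket?"]]  # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```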

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1