2cbae2386b74a080f53eb5109ca5971c

This model is a fine-tuned version of google/mt5-base on the es-nl (Spanish-Dutch) configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

Loss: 1.8544
Data Size: 1.0
Epoch Runtime: 169.6302
Bleu: 7.9441

Model description

More information needed

Intended uses & limitations

More information needed
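
Until this section is filled in, the following is a minimal inference sketch, not a documented usage recipe. It assumes the checkpoint is published under the repo id shown on the model page, that Spanish is the source language and Dutch the target, and that no task prefix was used during fine-tuning; none of these is confirmed by this card. The helper name `translate` and its default generation settings are illustrative choices.

```python
# Repo id as shown on the model page (assumed to be publicly downloadable).
MODEL_ID = "contemmcm/2cbae2386b74a080f53eb5109ca5971c"


def translate(texts, model_id=MODEL_ID, max_new_tokens=128, num_beams=4):
    """Translate a list of Spanish strings to Dutch (assumed direction)."""
    # Imported lazily so that defining the helper does not require
    # torch/transformers to be installed.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()

    # Assumption: the model was trained on raw source text with no prefix.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        output_ids = model.generate(
            **batch, max_new_tokens=max_new_tokens, num_beams=num_beams
        )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `translate(["El gato duerme en el sofá."])` downloads the weights on first use and returns a list with one decoded string. Given the reported BLEU of about 7.9, output quality should be checked before relying on it.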

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
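
The list above can be written down as keyword arguments, which also makes the batch-size arithmetic explicit. This is a sketch under assumptions: the key names follow `transformers.Seq2SeqTrainingArguments`, which may or may not be the class used for this run, and a gradient accumulation factor of 1 is assumed because the card lists none (it is consistent with 8 × 4 = 32). `NUM_DEVICES` and `effective_batch_size` are illustrative names.

```python
# Values copied from the hyperparameter list; key names are those of
# transformers.Seq2SeqTrainingArguments (an assumed trainer configuration).
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}
NUM_DEVICES = 4  # num_devices from the list above


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


total_train = effective_batch_size(
    training_kwargs["per_device_train_batch_size"], NUM_DEVICES
)
total_eval = effective_batch_size(
    training_kwargs["per_device_eval_batch_size"], NUM_DEVICES
)
print(total_train, total_eval)  # 32 32
```

With `transformers` installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder path. The multi-GPU setup itself comes from the launcher rather than from these arguments.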

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	17.1342	0	14.2164	0.0123
No log	1	806	14.4190	0.0078	15.4260	0.0141
No log	2	1612	12.9927	0.0156	17.8722	0.0145
No log	3	2418	12.3336	0.0312	21.2802	0.0119
0.4603	4	3224	8.7273	0.0625	25.7241	0.0168
7.8757	5	4030	5.7522	0.125	36.1101	0.0238
3.8606	6	4836	2.8423	0.25	56.7123	2.2133
3.2748	7	5642	2.5475	0.5	95.4039	3.4680
2.9493	8.0	6448	2.3580	1.0	170.7454	4.3037
2.7582	9.0	7254	2.2663	1.0	169.8955	4.7458
2.6109	10.0	8060	2.1829	1.0	169.9373	5.2368
2.5575	11.0	8866	2.1357	1.0	169.9412	5.5100
2.445	12.0	9672	2.0929	1.0	169.8436	5.6948
2.3351	13.0	10478	2.0607	1.0	170.8743	6.0219
2.3008	14.0	11284	2.0226	1.0	170.3089	6.2488
2.2024	15.0	12090	1.9973	1.0	170.4874	6.4205
2.1786	16.0	12896	1.9712	1.0	169.2548	6.6095
2.1158	17.0	13702	1.9576	1.0	169.8372	6.7398
2.0339	18.0	14508	1.9481	1.0	170.9875	6.8626
2.0602	19.0	15314	1.9307	1.0	169.0190	7.0636
1.9852	20.0	16120	1.9111	1.0	168.9778	7.1152
1.9701	21.0	16926	1.9002	1.0	170.2868	7.1676
1.9284	22.0	17732	1.8983	1.0	170.0649	7.2960
1.8985	23.0	18538	1.8850	1.0	169.0035	7.2971
1.8702	24.0	19344	1.8737	1.0	169.1320	7.4994
1.8144	25.0	20150	1.8745	1.0	170.3661	7.5100
1.788	26.0	20956	1.8647	1.0	169.8877	7.5469
1.7573	27.0	21762	1.8660	1.0	178.3417	7.5739
1.7072	28.0	22568	1.8523	1.0	170.5982	7.6679
1.6876	29.0	23374	1.8576	1.0	170.2814	7.7006
1.6681	30.0	24180	1.8508	1.0	168.5706	7.7472
1.6361	31.0	24986	1.8524	1.0	169.2578	7.7534
1.6185	32.0	25792	1.8519	1.0	168.9454	7.7990
1.6048	33.0	26598	1.8484	1.0	171.0696	7.9047
1.5681	34.0	27404	1.8498	1.0	169.9789	7.9203
1.5462	35.0	28210	1.8516	1.0	168.9406	7.9122
1.5238	36.0	29016	1.8516	1.0	169.5784	7.9168
1.4893	37.0	29822	1.8544	1.0	169.6302	7.9441
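
Two patterns in the table are easy to miss. First, the headline results at the top of this card match the last logged row (epoch 37), while the lowest validation loss occurs at epoch 33. Second, the Data Size column doubles each epoch from epoch 1 to epoch 8. The sketch below checks both using values copied from the table. The doubling formula is inferred from the logged numbers, not taken from the training code, and `data_fraction` is a hypothetical helper name.

```python
# (epoch, validation_loss, bleu) for the last ten logged epochs, copied from
# the table. All earlier epochs have higher loss and lower BLEU than these.
LAST_EPOCHS = [
    (28, 1.8523, 7.6679),
    (29, 1.8576, 7.7006),
    (30, 1.8508, 7.7472),
    (31, 1.8524, 7.7534),
    (32, 1.8519, 7.7990),
    (33, 1.8484, 7.9047),
    (34, 1.8498, 7.9203),
    (35, 1.8516, 7.9122),
    (36, 1.8516, 7.9168),
    (37, 1.8544, 7.9441),
]

# Data Size column for epochs 1-8, copied from the table.
LOGGED_DATA_SIZE = {
    1: 0.0078, 2: 0.0156, 3: 0.0312, 4: 0.0625,
    5: 0.125, 6: 0.25, 7: 0.5, 8: 1.0,
}


def data_fraction(epoch):
    """Inferred schedule: 2**(epoch - 8), capped at 1.0; epoch 0 logs 0."""
    if epoch <= 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))


best_loss_row = min(LAST_EPOCHS, key=lambda row: row[1])
best_bleu_row = max(LAST_EPOCHS, key=lambda row: row[2])
schedule_matches = all(
    abs(data_fraction(epoch) - logged) < 1e-4
    for epoch, logged in LOGGED_DATA_SIZE.items()
)

print(best_loss_row)     # (33, 1.8484, 7.9047)
print(best_bleu_row)     # (37, 1.8544, 7.9441)
print(schedule_matches)  # True
```

Whether epoch 33 or epoch 37 is the better checkpoint depends on whether validation loss or BLEU is the selection criterion. The table also ends at epoch 37, although num_epochs is set to 50.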

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 1.0B params
Tensor type: F32

Model tree for contemmcm/2cbae2386b74a080f53eb5109ca5971c

Base model: google/mt5-base
Finetuned: this model