0e61d7a8bf16a3497486736747a532fa

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3967
  • Data Size: 1.0
  • Epoch Runtime: 730.0780 s
  • Bleu: 13.8607
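
For a quick check of the checkpoint, the snippet below is a minimal inference sketch. It assumes the repository id contemmcm/0e61d7a8bf16a3497486736747a532fa and that no task prefix was added during preprocessing; neither detail is confirmed by this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/0e61d7a8bf16a3497486736747a532fa"  # repository id (assumed)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Translate an English sentence to French; beam search usually helps BLEU.
# No task prefix is prepended here, which is an assumption about the
# preprocessing used during fine-tuning.
inputs = tokenizer("The cat sleeps on the sofa.", return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```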

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
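
The exact splits are not documented, but the dataset named in the summary can be loaded as sketched below. The en-fr configuration name follows the standard opus_books layout; how the evaluation set was carved out of it is not stated on this card.

```python
from datasets import load_dataset

# opus_books ships a single "train" split; the evaluation set used for the
# results above was presumably held out from it, but the card does not say how.
dataset = load_dataset("Helsinki-NLP/opus_books", "en-fr")
print(dataset["train"][0]["translation"])  # {'en': '...', 'fr': '...'}
```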

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
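
For reproduction, these values map onto `Seq2SeqTrainingArguments` roughly as sketched below. This assumes the standard `Seq2SeqTrainer` setup and a hypothetical `output_dir`; the author's actual training script is not included with this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-en-fr",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed to compute BLEU at evaluation
)
```

The multi-GPU (4-device) setup from the list above would come from the launcher (e.g. torchrun or accelerate), not from these arguments.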

Training results

| Training Loss | Epoch | Step   | Validation Loss | Data Size | Epoch Runtime (s) | Bleu    |
|:-------------:|:-----:|:------:|:---------------:|:---------:|:-----------------:|:-------:|
| No log        | 0     | 0      | 11.0050         | 0         | 57.2280           | 0.0913  |
| No log        | 1     | 3177   | 9.7755          | 0.0078    | 62.6163           | 0.0603  |
| 0.2428        | 2     | 6354   | 6.4851          | 0.0156    | 69.6654           | 0.4655  |
| 6.1305        | 3     | 9531   | 3.7118          | 0.0312    | 80.4218           | 4.9143  |
| 3.8491        | 4     | 12708  | 2.4897          | 0.0625    | 100.3120          | 7.8902  |
| 3.0563        | 5     | 15885  | 2.2146          | 0.125     | 143.0110          | 6.6310  |
| 2.6465        | 6     | 19062  | 2.0177          | 0.25      | 226.6017          | 7.7631  |
| 2.3886        | 7     | 22239  | 1.8728          | 0.5       | 394.2835          | 8.8893  |
| 2.1613        | 8     | 25416  | 1.7347          | 1.0       | 725.5487          | 10.0712 |
| 1.9867        | 9     | 28593  | 1.6550          | 1.0       | 725.4036          | 10.7525 |
| 1.9215        | 10    | 31770  | 1.6002          | 1.0       | 728.3040          | 11.1860 |
| 1.8169        | 11    | 34947  | 1.5578          | 1.0       | 730.4529          | 11.5433 |
| 1.7411        | 12    | 38124  | 1.5310          | 1.0       | 729.5908          | 11.8498 |
| 1.6666        | 13    | 41301  | 1.5063          | 1.0       | 726.8605          | 12.0989 |
| 1.6302        | 14    | 44478  | 1.4838          | 1.0       | 726.3137          | 12.2911 |
| 1.5987        | 15    | 47655  | 1.4699          | 1.0       | 723.5461          | 12.4585 |
| 1.52          | 16    | 50832  | 1.4574          | 1.0       | 723.2799          | 12.6016 |
| 1.5011        | 17    | 54009  | 1.4438          | 1.0       | 723.0908          | 12.7511 |
| 1.4794        | 18    | 57186  | 1.4335          | 1.0       | 723.4092          | 12.8899 |
| 1.4367        | 19    | 60363  | 1.4250          | 1.0       | 722.8614          | 12.9570 |
| 1.3848        | 20    | 63540  | 1.4155          | 1.0       | 724.5252          | 13.0640 |
| 1.3722        | 21    | 66717  | 1.4153          | 1.0       | 728.9433          | 13.1557 |
| 1.3477        | 22    | 69894  | 1.4033          | 1.0       | 729.3175          | 13.2482 |
| 1.3196        | 23    | 73071  | 1.4030          | 1.0       | 723.1824          | 13.3230 |
| 1.3292        | 24    | 76248  | 1.3959          | 1.0       | 729.4897          | 13.3777 |
| 1.2922        | 25    | 79425  | 1.3921          | 1.0       | 726.9376          | 13.4284 |
| 1.2686        | 26    | 82602  | 1.3853          | 1.0       | 729.3467          | 13.5032 |
| 1.2393        | 27    | 85779  | 1.3907          | 1.0       | 726.2296          | 13.5276 |
| 1.2307        | 28    | 88956  | 1.3850          | 1.0       | 727.6741          | 13.6117 |
| 1.2041        | 29    | 92133  | 1.3881          | 1.0       | 726.9224          | 13.6822 |
| 1.1862        | 30    | 95310  | 1.3891          | 1.0       | 726.4434          | 13.6620 |
| 1.1582        | 31    | 98487  | 1.3991          | 1.0       | 726.7570          | 13.7280 |
| 1.1476        | 32    | 101664 | 1.3815          | 1.0       | 727.6579          | 13.7293 |
| 1.1168        | 33    | 104841 | 1.3921          | 1.0       | 725.2406          | 13.7680 |
| 1.1249        | 34    | 108018 | 1.3928          | 1.0       | 725.8319          | 13.8611 |
| 1.0888        | 35    | 111195 | 1.3979          | 1.0       | 728.4666          | 13.8371 |
| 1.0546        | 36    | 114372 | 1.3967          | 1.0       | 730.0780          | 13.8607 |
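
The log stops at epoch 36 even though num_epochs was set to 50; the card does not say why. For the Bleu column, the sketch below shows one common way to compute the score with the evaluate library's sacreBLEU wrapper; whether this exact implementation produced the numbers above is an assumption.

```python
import evaluate

# sacreBLEU via the `evaluate` library (an assumption; the metric script
# used for this card is not included).
bleu = evaluate.load("sacrebleu")

predictions = ["Le chat dort sur le canapé."]
references = [["Le chat dort sur le canapé."]]  # one reference list per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```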

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model size: 1.0B params (Safetensors, F32 tensors)
