ed16a38772106315469467226072f32f

This model is a fine-tuned version of google/umt5-base on the French–Norwegian (fr-no) pair of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7416
  • Data Size: 1.0 (fraction of the training data used)
  • Epoch Runtime: 22.5158 s
  • BLEU: 5.3790

Model description

More information needed

Intended uses & limitations

More information needed
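
Since no usage guidance is documented, here is a minimal inference sketch. It assumes the repo id contemmcm/ed16a38772106315469467226072f32f (taken from this card's page) and the standard Transformers seq2seq API inherited from umt5-base; whether the fine-tune expects a task prefix on the source text is not documented here.

```python
# Minimal sketch: French -> Norwegian translation with this checkpoint.
# Assumptions: repo id as on this card's page; no task prefix (unverified).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/ed16a38772106315469467226072f32f"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# French source sentence; the model was tuned on opus_books [fr-no].
inputs = tokenizer("Bonjour, comment allez-vous ?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```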

Training and evaluation data

More information needed
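
The card names the dataset but not the split or preprocessing used. As a starting point, a hedged sketch of loading the named dataset with the datasets library follows (the fr-no config ships only a train split; how it was divided for evaluation here is not stated):

```python
from datasets import load_dataset

# Load the pair named in this card; opus_books exposes a single "train" split.
books = load_dataset("Helsinki-NLP/opus_books", "fr-no")
print(books["train"][0]["translation"])  # {'fr': '...', 'no': '...'}
```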

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction sketch in code follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
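
For reference, a hypothetical Seq2SeqTrainingArguments reconstruction of the list above (the actual training script is not published with this card; output_dir and predict_with_generate are assumptions):

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fr-no",  # hypothetical name
    learning_rate=5e-05,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",             # betas/epsilon below match the card
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumption: needed to log BLEU at eval
)
```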

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:------:|
| No log        | 0     | 0    | 13.5261         | 0         | 2.3767            | 0.0301 |
| No log        | 1     | 86   | 13.2521         | 0.0078    | 2.6838            | 0.0336 |
| No log        | 2     | 172  | 13.0109         | 0.0156    | 3.0705            | 0.0265 |
| No log        | 3     | 258  | 12.7798         | 0.0312    | 4.1446            | 0.0388 |
| No log        | 4     | 344  | 12.9848         | 0.0625    | 4.8077            | 0.0319 |
| 0.9739        | 5     | 430  | 12.8778         | 0.125     | 6.3842            | 0.0353 |
| 4.3608        | 6     | 516  | 11.6073         | 0.25      | 8.8062            | 0.0339 |
| 5.2262        | 7     | 602  | 9.4455          | 0.5       | 13.0506           | 0.1196 |
| 6.888         | 8     | 688  | 7.1112          | 1.0       | 23.3564           | 0.2888 |
| 7.0411        | 9     | 774  | 4.4224          | 1.0       | 21.4657           | 1.7103 |
| 4.9779        | 10    | 860  | 3.4176          | 1.0       | 21.5710           | 2.1436 |
| 4.5006        | 11    | 946  | 3.1422          | 1.0       | 22.5097           | 3.2023 |
| 3.9263        | 12    | 1032 | 3.0162          | 1.0       | 21.5020           | 3.7141 |
| 3.6779        | 13    | 1118 | 2.9435          | 1.0       | 23.1354           | 3.9488 |
| 3.497         | 14    | 1204 | 2.9015          | 1.0       | 21.8859           | 4.1796 |
| 3.378         | 15    | 1290 | 2.8740          | 1.0       | 22.8834           | 4.3399 |
| 3.2738        | 16    | 1376 | 2.8326          | 1.0       | 21.2591           | 4.5026 |
| 3.1725        | 17    | 1462 | 2.7996          | 1.0       | 21.8940           | 4.6117 |
| 3.0748        | 18    | 1548 | 2.7943          | 1.0       | 23.2602           | 4.6741 |
| 2.9964        | 19    | 1634 | 2.7696          | 1.0       | 24.2556           | 4.8760 |
| 2.9278        | 20    | 1720 | 2.7633          | 1.0       | 21.4158           | 4.9620 |
| 2.901         | 21    | 1806 | 2.7494          | 1.0       | 21.8547           | 5.0313 |
| 2.834         | 22    | 1892 | 2.7425          | 1.0       | 22.3119           | 5.1317 |
| 2.7446        | 23    | 1978 | 2.7432          | 1.0       | 23.6986           | 5.0833 |
| 2.7188        | 24    | 2064 | 2.7327          | 1.0       | 21.7825           | 5.0819 |
| 2.6431        | 25    | 2150 | 2.7295          | 1.0       | 22.2695           | 5.1180 |
| 2.6015        | 26    | 2236 | 2.7287          | 1.0       | 22.7781           | 5.1391 |
| 2.5468        | 27    | 2322 | 2.7378          | 1.0       | 24.2939           | 5.2243 |
| 2.5174        | 28    | 2408 | 2.7266          | 1.0       | 22.0823           | 5.2862 |
| 2.4738        | 29    | 2494 | 2.7292          | 1.0       | 21.9360           | 5.2958 |
| 2.4305        | 30    | 2580 | 2.7386          | 1.0       | 21.9431           | 5.3278 |
| 2.3935        | 31    | 2666 | 2.7362          | 1.0       | 22.0658           | 5.3596 |
| 2.3213        | 32    | 2752 | 2.7416          | 1.0       | 22.5158           | 5.3790 |
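
Training stopped after epoch 32 of the 50 configured epochs, and the Data Size column ramps from 0 to 1.0 over the first eight epochs, which suggests a data-scheduling plus early-stopping setup that the card does not document. The BLEU column is presumably a sacrebleu-style corpus score; a hedged sketch of reproducing such a score with the evaluate library (the predictions and references below are hypothetical):

```python
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["God morgen, verden."]   # hypothetical model outputs
references = [["God morgen, verden."]]  # hypothetical gold Norwegian
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus BLEU on the same 0-100 scale as the table
```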

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model tree for contemmcm/ed16a38772106315469467226072f32f

  • Base model: google/umt5-base (this model is one of 49 fine-tunes of the base)