ea1838e5ab58a6875f95363307574c71

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fr-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7244
  • Data Size: 1.0
  • Epoch Runtime: 227.0987
  • Bleu: 11.3478
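The card does not include usage instructions, but a fine-tuned umT5 checkpoint can typically be loaded with transformers as in the minimal sketch below. The repo id is taken from this card's hub path, and the absence of a task prefix is an assumption, since the card does not document how inputs were preprocessed during fine-tuning.

```python
# Minimal fr->nl inference sketch (assumptions: repo id from this card's
# hub path; no task prefix, since the input format is not documented).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/ea1838e5ab58a6875f95363307574c71"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Bonjour, comment allez-vous ?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```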

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
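The dataset named at the top of the card, Helsinki-NLP/opus_books with the fr-nl configuration, can be loaded as sketched below. Note that opus_books ships only a train split, so the 90/10 holdout here is an illustrative assumption, not the split actually used to produce the evaluation numbers above.

```python
# Sketch: loading the Helsinki-NLP/opus_books fr-nl pairs named in this card.
# opus_books has only a "train" split; this 90/10 holdout is an assumption,
# as the card does not document the actual train/eval split.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "fr-nl", split="train")
splits = raw.train_test_split(test_size=0.1, seed=42)
print(splits["train"][0]["translation"])  # {'fr': '...', 'nl': '...'}
```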

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
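These settings map onto transformers training arguments roughly as in the sketch below. The output_dir name and the predict_with_generate flag are assumptions (the latter because BLEU is reported); per-device batch sizes of 8 over 4 GPUs account for the total batch size of 32. Although num_epochs is 50, the results table below stops at epoch 35, so training appears to have ended early.

```python
# Sketch: Seq2SeqTrainingArguments mirroring the hyperparameters listed above.
# Anything not listed in the card is left at library defaults or is a
# labeled assumption.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fr-nl",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumed, since BLEU is reported
)
```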

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0     | 11.8501         | 0         | 18.7319       | 0.2531  |
| No log        | 1     | 1000  | 11.4087         | 0.0078    | 20.4448       | 0.2484  |
| No log        | 2     | 2000  | 10.4134         | 0.0156    | 22.6671       | 0.2688  |
| No log        | 3     | 3000  | 9.1732          | 0.0312    | 26.5250       | 0.3036  |
| 0.4887        | 4     | 4000  | 6.6378          | 0.0625    | 34.0548       | 0.3509  |
| 5.6661        | 5     | 5000  | 3.6067          | 0.125     | 46.7000       | 5.2304  |
| 0.2394        | 6     | 6000  | 2.7806          | 0.25      | 72.6359       | 4.5796  |
| 0.2916        | 7     | 7000  | 2.4648          | 0.5       | 124.0546      | 6.0043  |
| 2.8366        | 8.0   | 8000  | 2.2451          | 1.0       | 228.4455      | 7.1941  |
| 2.6234        | 9.0   | 9000  | 2.1357          | 1.0       | 227.8906      | 7.8593  |
| 2.474         | 10.0  | 10000 | 2.0542          | 1.0       | 227.9562      | 8.3096  |
| 2.383         | 11.0  | 11000 | 1.9929          | 1.0       | 231.0268      | 8.7678  |
| 2.2796        | 12.0  | 12000 | 1.9503          | 1.0       | 228.0446      | 9.0394  |
| 2.2007        | 13.0  | 13000 | 1.9234          | 1.0       | 229.5092      | 9.3699  |
| 2.121         | 14.0  | 14000 | 1.8852          | 1.0       | 235.1929      | 9.6404  |
| 2.0367        | 15.0  | 15000 | 1.8570          | 1.0       | 235.1429      | 9.8445  |
| 2.017         | 16.0  | 16000 | 1.8366          | 1.0       | 232.0961      | 9.9998  |
| 1.9158        | 17.0  | 17000 | 1.8234          | 1.0       | 230.5636      | 10.1654 |
| 1.9261        | 18.0  | 18000 | 1.8024          | 1.0       | 232.6199      | 10.2223 |
| 1.8678        | 19.0  | 19000 | 1.7901          | 1.0       | 231.7953      | 10.3140 |
| 1.8087        | 20.0  | 20000 | 1.7833          | 1.0       | 231.0188      | 10.4957 |
| 1.7884        | 21.0  | 21000 | 1.7631          | 1.0       | 235.1509      | 10.6346 |
| 1.7503        | 22.0  | 22000 | 1.7554          | 1.0       | 232.4435      | 10.6599 |
| 1.7083        | 23.0  | 23000 | 1.7517          | 1.0       | 231.4336      | 10.7470 |
| 1.662         | 24.0  | 24000 | 1.7433          | 1.0       | 232.1989      | 10.7994 |
| 1.675         | 25.0  | 25000 | 1.7371          | 1.0       | 232.1738      | 10.9347 |
| 1.6014        | 26.0  | 26000 | 1.7306          | 1.0       | 229.3544      | 11.0011 |
| 1.5773        | 27.0  | 27000 | 1.7357          | 1.0       | 231.6688      | 11.0416 |
| 1.5321        | 28.0  | 28000 | 1.7332          | 1.0       | 230.5063      | 11.0622 |
| 1.5308        | 29.0  | 29000 | 1.7259          | 1.0       | 229.9135      | 11.1018 |
| 1.48          | 30.0  | 30000 | 1.7305          | 1.0       | 231.0126      | 11.0951 |
| 1.4994        | 31.0  | 31000 | 1.7169          | 1.0       | 229.2552      | 11.2487 |
| 1.4482        | 32.0  | 32000 | 1.7227          | 1.0       | 231.7054      | 11.1848 |
| 1.4089        | 33.0  | 33000 | 1.7237          | 1.0       | 228.0267      | 11.2684 |
| 1.3907        | 34.0  | 34000 | 1.7234          | 1.0       | 227.4852      | 11.2950 |
| 1.3779        | 35.0  | 35000 | 1.7244          | 1.0       | 227.0987      | 11.3478 |
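The exact BLEU setup behind the Bleu column is not documented. A common way to compute a comparable corpus-level score with the evaluate library is sketched below; treat it as illustrative, not as the configuration used for this card.

```python
# Sketch: corpus BLEU via the evaluate library's sacrebleu metric.
# The predictions/references here are hypothetical placeholders.
import evaluate

bleu = evaluate.load("sacrebleu")
preds = ["De kat zit op de mat."]    # model outputs, one string per example
refs = [["De kat zit op de mat."]]   # list of reference lists per example
print(bleu.compute(predictions=preds, references=refs)["score"])
```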

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1