454fe2aecdfbcb94533e58b551509430

This model is a fine-tuned version of google/umt5-xl on the Helsinki-NLP/opus_books [de-fr] dataset. It achieves the following results on the evaluation set at the final logged epoch (epoch 14):

Loss: 1.4542
Data Size: 1.0
Epoch Runtime: 449.1568
Bleu: 11.6309
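
The card gives no usage snippet, so the following is only a minimal sketch of how a checkpoint like this might be loaded for inference with the Transformers seq2seq API. Several things here are assumptions rather than facts from the card: the Hub repository id (owner `contemmcm` plus the hash in the title), the translation direction (German in, French out), and the absence of a task prefix on the input. The helper name `translate` is hypothetical.

```python
# Assumption: Hub repository id for this checkpoint.
MODEL_ID = "contemmcm/454fe2aecdfbcb94533e58b551509430"


def translate(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Translate one sentence with the fine-tuned checkpoint (sketch)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: the raw source sentence is passed without a task prefix.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights; assumes German input):
# print(translate("Das Buch liegt auf dem Tisch."))
```

If the model was trained in the opposite direction or with a prompt prefix, the input would need to be adjusted accordingly.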

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
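
As a rough illustration, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as sketched below. This is not the author's training script: the output directory name, `predict_with_generate=True` (assumed because BLEU is reported), and a gradient accumulation factor of 1 (assumed because 8 per device × 4 devices already gives the listed total of 32) are all assumptions. The multi-GPU setup itself would come from the launcher rather than from these arguments.

```python
# Hyperparameters as listed on the card, keyed by TrainingArguments field names.
HPARAMS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}
NUM_DEVICES = 4  # num_devices on the card


def total_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Effective batch size per optimizer step (grad_accum=1 is an assumption)."""
    return per_device * num_devices * grad_accum


def build_training_args(output_dir: str = "umt5-xl-opus-books-de-fr"):
    """Build training arguments; output_dir is a hypothetical name."""
    # Imported lazily so the arithmetic above works without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: needed to compute BLEU
        **HPARAMS,
    )


print(total_batch_size(HPARAMS["per_device_train_batch_size"], NUM_DEVICES))
```

The printed value matches total_train_batch_size on the card.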

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	4.9537	0	30.7562	2.3380
No log	1	872	3.3293	0.0078	33.8851	9.2379
No log	2	1744	2.5379	0.0156	41.7445	15.8179
0.0633	3	2616	2.1313	0.0312	52.0075	18.8487
0.1697	4	3488	1.8513	0.0625	69.7246	7.7503
2.2072	5	4360	1.6628	0.125	94.4012	8.2275
1.9298	6	5232	1.5691	0.25	147.4140	9.0057
1.7078	7	6104	1.4832	0.5	246.7156	9.7909
1.5714	8	6976	1.4034	1.0	451.6983	10.6251
1.393	9	7848	1.3718	1.0	448.7952	11.0573
1.279	10	8720	1.3670	1.0	447.1663	11.3449
1.1633	11	9592	1.3742	1.0	447.1207	11.5115
1.0298	12	10464	1.3857	1.0	445.6116	11.5748
0.9584	13	11336	1.4192	1.0	447.9599	11.6329
0.8319	14	12208	1.4542	1.0	449.1568	11.6309

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 0.9B params
Tensor type: F32