419f3601ec85b04164b6910340ffaa49

This model is a fine-tuned version of google/umt5-xl on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:

Loss: 1.2225
Data Size: 1.0
Epoch Runtime: 708.3864
Bleu: 16.7532

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
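
The total batch sizes listed above follow from the per-device batch size multiplied by the number of devices. The sketch below reproduces that arithmetic and collects the listed values under the keyword names used by `transformers.TrainingArguments`. It assumes no gradient accumulation (the card lists none), and the dictionary name `training_kwargs` is illustrative only; the original training script is not shown in this card.

```python
# Values copied from the hyperparameter list in this card.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4

# Assumption: gradient_accumulation_steps == 1, since the card lists none.
gradient_accumulation_steps = 1

total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Keyword names follow transformers.TrainingArguments; the dict itself is a
# hypothetical helper, not taken from the original training script.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": per_device_train_batch_size,
    "per_device_eval_batch_size": per_device_eval_batch_size,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)
```

With `transformers` installed, these could be passed as `Seq2SeqTrainingArguments(output_dir="...", **training_kwargs)`, where the output directory is a placeholder to be chosen by the user.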

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	4.3384	0	49.2721	5.2697
No log	1	1407	2.6215	0.0078	54.7574	19.5011
No log	2	2814	2.1143	0.0156	63.3221	25.9837
0.0715	3	4221	1.7672	0.0312	80.3591	17.9245
2.0077	4	5628	1.5123	0.0625	100.8485	12.3584
1.7482	5	7035	1.4152	0.125	136.8567	13.1926
1.5443	6	8442	1.3240	0.25	222.9028	14.2348
1.4485	7	9849	1.2461	0.5	382.5504	15.1229
1.2776	8.0	11256	1.1976	1.0	708.8908	16.2015
1.1374	9.0	12663	1.1629	1.0	706.9159	16.3826
1.0682	10.0	14070	1.1637	1.0	702.2565	16.6746
0.9317	11.0	15477	1.1766	1.0	705.4175	16.7103
0.8234	12.0	16884	1.1905	1.0	704.6685	16.7239
0.7706	13.0	18291	1.2225	1.0	708.3864	16.7532
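
The table can be read more easily with a few lines of code. The sketch below copies the rows from the table and reports which epoch has the lowest validation loss and which full-data epoch (Data Size 1.0) has the highest BLEU. The variable and function names are illustrative only. Note that the summary metrics at the top of this card match the final row (epoch 13), whereas the lowest validation loss in the table occurs earlier.

```python
# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss, data_size, epoch_runtime, bleu)
# "No log" entries are represented as None.
ROWS = [
    (None, 0, 0, 4.3384, 0.0, 49.2721, 5.2697),
    (None, 1, 1407, 2.6215, 0.0078, 54.7574, 19.5011),
    (None, 2, 2814, 2.1143, 0.0156, 63.3221, 25.9837),
    (0.0715, 3, 4221, 1.7672, 0.0312, 80.3591, 17.9245),
    (2.0077, 4, 5628, 1.5123, 0.0625, 100.8485, 12.3584),
    (1.7482, 5, 7035, 1.4152, 0.125, 136.8567, 13.1926),
    (1.5443, 6, 8442, 1.3240, 0.25, 222.9028, 14.2348),
    (1.4485, 7, 9849, 1.2461, 0.5, 382.5504, 15.1229),
    (1.2776, 8, 11256, 1.1976, 1.0, 708.8908, 16.2015),
    (1.1374, 9, 12663, 1.1629, 1.0, 706.9159, 16.3826),
    (1.0682, 10, 14070, 1.1637, 1.0, 702.2565, 16.6746),
    (0.9317, 11, 15477, 1.1766, 1.0, 705.4175, 16.7103),
    (0.8234, 12, 16884, 1.1905, 1.0, 704.6685, 16.7239),
    (0.7706, 13, 18291, 1.2225, 1.0, 708.3864, 16.7532),
]


def lowest_validation_loss(rows):
    """Return the row with the smallest validation loss."""
    return min(rows, key=lambda r: r[3])


def highest_bleu_on_full_data(rows):
    """Return the row with the highest BLEU among rows with data size 1.0."""
    full = [r for r in rows if r[4] == 1.0]
    return max(full, key=lambda r: r[6])


best_loss_row = lowest_validation_loss(ROWS)
best_bleu_row = highest_bleu_on_full_data(ROWS)

print("lowest validation loss:", best_loss_row[3], "at epoch", best_loss_row[1])
print("highest full-data BLEU:", best_bleu_row[6], "at epoch", best_bleu_row[1])

# Every epoch advances the step counter by the same amount.
steps_per_epoch = ROWS[1][2] - ROWS[0][2]
assert all(r[2] == r[1] * steps_per_epoch for r in ROWS)
```

From epoch 9 onward the training loss keeps falling while the validation loss rises, so a checkpoint chosen by validation loss would differ from one chosen by BLEU.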

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 0.9B params
Tensor type: F32

Model tree for contemmcm/419f3601ec85b04164b6910340ffaa49

Base model: google/umt5-xl