d7ad1bfd26c4e2608b3d6b0cc6ba1930

This model is a fine-tuned version of google/umt5-xl on the Helsinki-NLP/opus_books [en-sv] dataset. It achieves the following results on the evaluation set:

Loss: 1.5493
Data Size: 1.0
Epoch Runtime: 51.1777
Bleu: 13.2911
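
A minimal inference sketch, assuming the checkpoint can be loaded from the repository id listed in the model tree on this page and that it accepts raw English text as input. The card does not document a task prefix or generation settings, so `max_new_tokens` and `num_beams` below are illustrative choices, not the settings behind the reported BLEU. The helper names `generation_kwargs` and `translate` are hypothetical.

```python
# Repository id as listed in the model tree on this page.
MODEL_ID = "contemmcm/d7ad1bfd26c4e2608b3d6b0cc6ba1930"


def generation_kwargs(max_new_tokens=128, num_beams=4):
    """Illustrative generation settings (assumed; not documented in the card)."""
    return {"max_new_tokens": max_new_tokens, "num_beams": num_beams}


def translate(texts, model_id=MODEL_ID, **overrides):
    """Translate a list of English sentences to Swedish with the fine-tuned model."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()

    # Assumption: no task prefix is prepended to the source text.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    settings = {**generation_kwargs(), **overrides}
    with torch.no_grad():
        output_ids = model.generate(**batch, **settings)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)


# Example call (downloads the weights on first use):
# translate(["The old house stood at the edge of the forest."])
```

Keyword overrides such as `translate(texts, num_beams=1)` replace the illustrative defaults, which makes it easy to compare decoding settings.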

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: constant
num_epochs: 50
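
The hyperparameters above can be sketched as keyword arguments for `Seq2SeqTrainingArguments` from Transformers. This is a reconstruction under assumptions, not the author's training script: the output directory name is hypothetical, `predict_with_generate=True` is assumed because BLEU is reported, and gradient accumulation is assumed to be 1, which is consistent with 8 per device × 4 devices = 32.

```python
# Values copied from the hyperparameter list; key names follow TrainingArguments.
HYPERPARAMETERS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

NUM_DEVICES = 4  # from the card: distributed_type multi-GPU, num_devices 4


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Total examples per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def build_training_arguments(output_dir="umt5-xl-opus-books-en-sv"):
    """Build the arguments object; output_dir is a hypothetical name."""
    # Imported lazily so the arithmetic above runs without Transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: needed to compute BLEU at eval time
        **HYPERPARAMETERS,
    )


print(effective_batch_size(HYPERPARAMETERS["per_device_train_batch_size"], NUM_DEVICES))
```

The number of devices is not a training argument; it is set by the launcher (for example `torchrun` or `accelerate launch`), which is why it is kept separate from the dictionary.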

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	5.4494	0	3.4266	1.9093
No log	1	77	4.5167	0.0078	3.8706	5.1044
No log	2	154	3.7604	0.0156	9.2201	9.0472
No log	3	231	3.3703	0.0312	14.6175	12.6550
No log	4	308	2.9100	0.0625	20.6412	17.2784
No log	5	385	2.5419	0.125	23.7592	20.3400
0.3469	6	462	2.2000	0.25	26.1015	23.1353
1.106	7	539	1.8657	0.5	40.6762	16.6875
2.0001	8.0	616	1.5413	1.0	61.8158	11.8273
1.6857	9.0	693	1.4785	1.0	52.0622	12.4630
1.3643	10.0	770	1.4711	1.0	51.4190	12.6882
1.2811	11.0	847	1.4771	1.0	56.7262	12.9381
1.0763	12.0	924	1.4806	1.0	50.7818	13.1336
0.9344	13.0	1001	1.5165	1.0	55.4850	13.3714
0.8311	14.0	1078	1.5493	1.0	51.1777	13.2911
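
The table can be read programmatically to compare checkpoints. The sketch below copies the rows as data and makes two checks: which epoch has the lowest validation loss, and whether the Data Size column matches a doubling schedule that reaches the full dataset at epoch 8. The doubling schedule is an interpretation of the numbers in the table, not something the card documents, and the function names are hypothetical.

```python
# (epoch, step, validation_loss, data_size, bleu), copied from the table above.
ROWS = [
    (0, 0, 5.4494, 0.0, 1.9093),
    (1, 77, 4.5167, 0.0078, 5.1044),
    (2, 154, 3.7604, 0.0156, 9.0472),
    (3, 231, 3.3703, 0.0312, 12.6550),
    (4, 308, 2.9100, 0.0625, 17.2784),
    (5, 385, 2.5419, 0.125, 20.3400),
    (6, 462, 2.2000, 0.25, 23.1353),
    (7, 539, 1.8657, 0.5, 16.6875),
    (8, 616, 1.5413, 1.0, 11.8273),
    (9, 693, 1.4785, 1.0, 12.4630),
    (10, 770, 1.4711, 1.0, 12.6882),
    (11, 847, 1.4771, 1.0, 12.9381),
    (12, 924, 1.4806, 1.0, 13.1336),
    (13, 1001, 1.5165, 1.0, 13.3714),
    (14, 1078, 1.5493, 1.0, 13.2911),
]


def lowest_validation_loss(rows):
    """Return the row with the smallest validation loss."""
    return min(rows, key=lambda row: row[2])


def best_full_data_bleu(rows):
    """Return the highest-BLEU row among epochs that used the full dataset."""
    full = [row for row in rows if row[3] == 1.0]
    return max(full, key=lambda row: row[4])


def expected_data_size(epoch, full_data_epoch=8):
    """Assumed schedule: the fraction doubles each epoch until it reaches 1.0."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - full_data_epoch))


def schedule_matches(rows, tolerance=1e-4):
    """Check the assumed schedule against the rounded values in the table."""
    return all(abs(expected_data_size(r[0]) - r[3]) < tolerance for r in rows)


print(lowest_validation_loss(ROWS)[0], best_full_data_bleu(ROWS)[0], schedule_matches(ROWS))
```

On these rows the lowest validation loss falls at epoch 10 and the best full-data BLEU at epoch 13, while the headline results at the top of the card correspond to the last row, epoch 14.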

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1
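
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch using only the standard library; the distribution names (`torch` for Pytorch, and so on) are the usual PyPI names and are an assumption, as is the choice to ignore the local build tag (such as `+cu128`) when comparing.

```python
from importlib import metadata

# Versions from the card, keyed by the assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.57.0",
    "torch": "2.8.0+cu128",
    "datasets": "4.2.0",
    "tokenizers": "0.22.1",
}


def public_version(version):
    """Drop a local build tag such as '+cu128'."""
    return version.split("+", 1)[0]


def find_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (installed, wanted)} for missing or differing packages."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or public_version(installed) != public_version(wanted):
            mismatches[package] = (installed, wanted)
    return mismatches


for name, (installed, wanted) in find_mismatches().items():
    print(f"{name}: installed {installed}, card lists {wanted}")
```

The `get_version` parameter exists so the comparison can be exercised without the packages installed.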

Safetensors

Model size: 0.9B params
Tensor type: F32

Model tree for contemmcm/d7ad1bfd26c4e2608b3d6b0cc6ba1930

Base model: google/umt5-xl
Finetuned (33): this model