5c42053dfa296a9c5ecfd99d78a7b4cc

This model is a fine-tuned version of google/mt5-xl on the Helsinki-NLP/opus_books [en-ru] dataset. It achieves the following results on the evaluation set at the last logged epoch (epoch 13):

Loss: 1.2776
Data Size: 1.0
Epoch Runtime: 207.5525
Bleu: 10.8996
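
The card does not document how the checkpoint is meant to be called, so the following is only a minimal sketch of how a seq2seq checkpoint like this one might be loaded for English-to-Russian translation with the transformers library. The `translate` helper, its default generation settings, and the absence of a task prefix are all assumptions made for illustration; the preprocessing used during fine-tuning is not stated here and may differ.

```python
# Repository name as it appears on this card.
MODEL_ID = "contemmcm/5c42053dfa296a9c5ecfd99d78a7b4cc"


def translate(text, model_id=MODEL_ID, max_new_tokens=128, num_beams=4):
    """Hypothetical helper: translate English text to Russian.

    max_new_tokens and num_beams are illustrative defaults, not values
    taken from the card.
    """
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: the input is passed as-is, with no task prefix.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(translate("The weather is nice today."))
```

If the outputs look poor, the first thing to check is whether the fine-tuning script prepended a prefix to the source text, since that detail is not recorded on the card.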

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
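
The totals in the list above follow from the per-device values: 8 examples per device across 4 devices gives 32. The sketch below writes the listed hyperparameters as a dictionary keyed by the corresponding transformers `TrainingArguments` keyword names and recomputes the totals. The gradient accumulation factor of 1 is an assumption (the card does not list one, but 8 × 4 = 32 is consistent with it), and `total_batch_size` is a helper introduced only for this illustration.

```python
# Hyperparameters from the card, keyed by TrainingArguments keyword names.
hparams = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}
NUM_DEVICES = 4  # num_devices from the card (multi-GPU)


def total_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Effective batch size per optimizer step (illustrative helper).

    grad_accum_steps=1 is an assumption; the card does not list it.
    """
    return per_device * num_devices * grad_accum_steps


total_train = total_batch_size(hparams["per_device_train_batch_size"], NUM_DEVICES)
total_eval = total_batch_size(hparams["per_device_eval_batch_size"], NUM_DEVICES)
print(total_train, total_eval)  # 32 32
```

A dictionary like this could be unpacked into `Seq2SeqTrainingArguments(output_dir=..., **hparams)`, where the output directory is a placeholder to be chosen by the user.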

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	8.0564	0	13.7463	0.0179
No log	1	437	3.6424	0.0078	15.4926	0.2600
No log	2	874	2.8113	0.0156	21.7286	0.7354
No log	3	1311	2.2161	0.0312	30.2459	1.0256
No log	4	1748	1.6569	0.0625	41.1657	6.3297
1.9696	5	2185	1.4800	0.125	51.3358	7.2365
1.7689	6	2622	1.3808	0.25	72.2639	8.3262
1.5617	7	3059	1.2818	0.5	118.2227	10.1999
1.3373	8	3496	1.2141	1.0	212.9637	10.8167
1.172	9	3933	1.1918	1.0	208.4175	11.0241
1.0123	10	4370	1.1997	1.0	208.0948	11.0342
0.8919	11	4807	1.2129	1.0	211.5544	11.2079
0.7836	12	5244	1.2379	1.0	211.5956	11.0471
0.6725	13	5681	1.2776	1.0	207.5525	10.8996
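
The headline metrics on this card come from the last logged row (epoch 13), which is not the best row by either metric. The log also ends at epoch 13 even though num_epochs is 50. A short sketch that reads the validation loss and BLEU columns from the table above and picks the best epoch for each; the variable names are illustrative only.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
rows = [
    (0, 8.0564, 0.0179),
    (1, 3.6424, 0.2600),
    (2, 2.8113, 0.7354),
    (3, 2.2161, 1.0256),
    (4, 1.6569, 6.3297),
    (5, 1.4800, 7.2365),
    (6, 1.3808, 8.3262),
    (7, 1.2818, 10.1999),
    (8, 1.2141, 10.8167),
    (9, 1.1918, 11.0241),
    (10, 1.1997, 11.0342),
    (11, 1.2129, 11.2079),
    (12, 1.2379, 11.0471),
    (13, 1.2776, 10.8996),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_bleu = max(rows, key=lambda r: r[2])  # highest BLEU
final = rows[-1]

print("best validation loss:", best_by_loss)  # (9, 1.1918, 11.0241)
print("best BLEU:", best_by_bleu)             # (11, 1.2129, 11.2079)
print("final epoch:", final)                  # (13, 1.2776, 10.8996)
```

Validation loss rises after epoch 9 while training loss keeps falling (1.172 at epoch 9, 0.6725 at epoch 13), a pattern usually associated with overfitting.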

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1
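
For anyone trying to reproduce the environment, a small sketch that compares installed package versions against the ones listed above. It uses only the standard library; `version_mismatches` is a helper invented for this illustration, and the mapping from the names on the card to pip distribution names (for example Pytorch to `torch`) is an assumption.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on the card, keyed by assumed pip distribution names.
CARD_VERSIONS = {
    "transformers": "4.57.0",
    "torch": "2.8.0+cu128",  # a PyPI build may report plain "2.8.0"
    "datasets": "4.2.0",
    "tokenizers": "0.22.1",
}


def version_mismatches(expected=CARD_VERSIONS):
    """Return {package: (expected, installed)} for every package that differs.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


print(version_mismatches())
```

An empty dictionary means the local environment matches the card exactly; a mismatch is not necessarily a problem, only a difference to keep in mind when comparing results.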

Safetensors

Model size: 0.9B params
Tensor type: F32

Model tree for contemmcm/5c42053dfa296a9c5ecfd99d78a7b4cc

Base model: google/mt5-xl
Finetuned: this model