6fee263a31b18416fd4151e74b993362

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-sv on the Helsinki-NLP/opus_books [de-es] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2600
  • Data Size: 1.0
  • Epoch Runtime: 42.1547
  • Bleu: 2.2023
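
No usage snippet is included in the card, so here is a minimal inference sketch using the standard transformers pipeline API, loading the checkpoint under the repo id this card is published at. The German-language input is an assumption based on the de-es fine-tuning data.

```python
# Minimal sketch: load the checkpoint with the generic translation pipeline.
# The de->es direction is assumed from the fine-tuning dataset config.
from transformers import pipeline

translator = pipeline(
    "translation",
    model="contemmcm/6fee263a31b18416fd4151e74b993362",
)
print(translator("Das ist ein Beispielsatz.")[0]["translation_text"])
```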

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
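
Pending those details, a hedged sketch of loading the dataset named in the summary with the datasets library (the exact train/validation partition used for this run is not documented):

```python
# Sketch: load the de-es configuration of opus_books referenced above.
# opus_books ships a single "train" split; how it was partitioned for
# evaluation in this run is not stated in the card.
from datasets import load_dataset

books = load_dataset("Helsinki-NLP/opus_books", "de-es")
print(books["train"][0]["translation"])  # {'de': '...', 'es': '...'}
```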

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
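
For readers reproducing the setup, these values map onto Seq2SeqTrainingArguments roughly as follows; this is a sketch, not the original training script, and output_dir is a placeholder.

```python
# Approximate mapping of the hyperparameters listed above. With 4 GPUs at
# a per-device batch size of 8, the effective batch size is the 32 reported.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-finetune",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```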

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 7.4812          | 0         | 3.8187        | 0.0328 |
| No log        | 1     | 688   | 6.1989          | 0.0078    | 4.4303        | 0.0743 |
| No log        | 2     | 1376  | 5.5211          | 0.0156    | 4.5877        | 0.0608 |
| No log        | 3     | 2064  | 4.9089          | 0.0312    | 5.2115        | 0.1089 |
| 0.1882        | 4     | 2752  | 4.3859          | 0.0625    | 6.3842        | 0.1624 |
| 0.3451        | 5     | 3440  | 3.9150          | 0.125     | 9.2596        | 0.2002 |
| 3.7252        | 6     | 4128  | 3.5108          | 0.25      | 13.5654       | 0.3747 |
| 3.3139        | 7     | 4816  | 3.1460          | 0.5       | 24.8852       | 0.5871 |
| 2.9137        | 8     | 5504  | 2.7991          | 1.0       | 42.6882       | 0.9214 |
| 2.6904        | 9     | 6192  | 2.6143          | 1.0       | 42.2792       | 1.1742 |
| 2.5177        | 10    | 6880  | 2.4953          | 1.0       | 42.3318       | 1.3955 |
| 2.3768        | 11    | 7568  | 2.4204          | 1.0       | 43.0626       | 1.4885 |
| 2.2647        | 12    | 8256  | 2.3584          | 1.0       | 42.9736       | 1.6630 |
| 2.1793        | 13    | 8944  | 2.3150          | 1.0       | 41.7735       | 1.7310 |
| 2.1043        | 14    | 9632  | 2.2773          | 1.0       | 42.4684       | 1.8424 |
| 2.0185        | 15    | 10320 | 2.2646          | 1.0       | 42.4202       | 1.9353 |
| 1.9288        | 16    | 11008 | 2.2381          | 1.0       | 43.9268       | 1.9716 |
| 1.8819        | 17    | 11696 | 2.2350          | 1.0       | 41.9756       | 2.0692 |
| 1.7779        | 18    | 12384 | 2.2240          | 1.0       | 41.1989       | 2.0879 |
| 1.7565        | 19    | 13072 | 2.2294          | 1.0       | 41.9376       | 2.1359 |
| 1.7176        | 20    | 13760 | 2.2233          | 1.0       | 43.5326       | 2.1760 |
| 1.6413        | 21    | 14448 | 2.2330          | 1.0       | 43.2583       | 2.1362 |
| 1.5941        | 22    | 15136 | 2.2390          | 1.0       | 42.5969       | 2.1541 |
| 1.5357        | 23    | 15824 | 2.2475          | 1.0       | 42.0472       | 2.1901 |
| 1.5017        | 24    | 16512 | 2.2600          | 1.0       | 42.1547       | 2.2023 |
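
The Bleu column is computed by the evaluation loop; the card does not say which implementation was used, but a typical way to score generated translations is the evaluate library's sacrebleu metric, sketched below under that assumption.

```python
# Assumption: BLEU scored with sacrebleu via the evaluate library; the
# card does not confirm the exact metric implementation.
import evaluate

metric = evaluate.load("sacrebleu")
predictions = ["Esto es una frase de ejemplo."]
references = [["Esto es una frase de ejemplo."]]
print(metric.compute(predictions=predictions, references=references)["score"])
```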

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1