1972e1d5a4947dba808d9ac596b5f3ef

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-sv on the fr-it subset of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

Loss: 2.2994
Data Size: 1.0
Epoch Runtime: 23.6897
Bleu: 3.6435

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
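
The total batch sizes listed above follow from the per-device batch size and the device count. The sketch below reproduces that arithmetic; the helper name `effective_batch_size` is hypothetical, and the gradient accumulation factor of 1 is an assumption, since the card does not list one.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8   # per-device training batch size
eval_batch_size = 8    # per-device evaluation batch size
num_devices = 4        # devices in the multi-GPU run


def effective_batch_size(per_device: int, devices: int, grad_accum_steps: int = 1) -> int:
    """Examples contributing to one step across all devices.

    grad_accum_steps defaults to 1, which is an assumption: the card
    does not report a gradient accumulation setting.
    """
    return per_device * devices * grad_accum_steps


total_train_batch_size = effective_batch_size(train_batch_size, num_devices)
total_eval_batch_size = effective_batch_size(eval_batch_size, num_devices)
print(total_train_batch_size, total_eval_batch_size)  # 32 32
```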

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	7.4606	0	2.4201	0.0703
No log	1	367	6.5031	0.0078	3.2722	0.0797
No log	2	734	5.9608	0.0156	2.9851	0.1151
No log	3	1101	5.3978	0.0312	3.3716	0.1513
No log	4	1468	4.8046	0.0625	4.0370	0.2627
0.2675	5	1835	4.2769	0.125	5.5947	0.4672
4.1854	6	2202	3.7770	0.25	8.5533	0.7643
3.574	7	2569	3.3002	0.5	13.2086	1.2913
3.0679	8.0	2936	2.8905	1.0	24.0043	2.0655
2.7491	9.0	3303	2.7013	1.0	22.7001	2.4616
2.5439	10.0	3670	2.5665	1.0	23.3464	2.7497
2.3811	11.0	4037	2.4891	1.0	23.1825	2.9385
2.2709	12.0	4404	2.4098	1.0	23.2766	3.1351
2.2014	13.0	4771	2.3772	1.0	23.1399	3.2306
2.0504	14.0	5138	2.3138	1.0	23.7488	3.3399
2.0132	15.0	5505	2.2986	1.0	23.8574	3.4083
1.9043	16.0	5872	2.2959	1.0	22.8362	3.5295
1.8562	17.0	6239	2.2807	1.0	23.8024	3.4761
1.784	18.0	6606	2.2624	1.0	22.8829	3.5573
1.7418	19.0	6973	2.2592	1.0	22.9862	3.5383
1.6799	20.0	7340	2.2541	1.0	24.7330	3.5706
1.5977	21.0	7707	2.2615	1.0	24.1643	3.5601
1.5477	22.0	8074	2.2701	1.0	24.4189	3.6217
1.4741	23.0	8441	2.2841	1.0	23.1823	3.6329
1.4327	24.0	8808	2.2994	1.0	23.6897	3.6435
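
The results reported at the top of this card match the last logged epoch (24), which is not the epoch with the lowest validation loss. As a rough illustration, the sketch below picks the best epoch by each metric from the full-data rows (Data Size 1.0) of the table above. The tuple layout and variable names are illustrative choices, not part of the training code.

```python
# (epoch, validation_loss, bleu) for epochs 8 to 24, copied from the table above.
rows = [
    (8, 2.8905, 2.0655),
    (9, 2.7013, 2.4616),
    (10, 2.5665, 2.7497),
    (11, 2.4891, 2.9385),
    (12, 2.4098, 3.1351),
    (13, 2.3772, 3.2306),
    (14, 2.3138, 3.3399),
    (15, 2.2986, 3.4083),
    (16, 2.2959, 3.5295),
    (17, 2.2807, 3.4761),
    (18, 2.2624, 3.5573),
    (19, 2.2592, 3.5383),
    (20, 2.2541, 3.5706),
    (21, 2.2615, 3.5601),
    (22, 2.2701, 3.6217),
    (23, 2.2841, 3.6329),
    (24, 2.2994, 3.6435),
]

# Lowest validation loss and highest BLEU can point at different epochs.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_bleu = max(rows, key=lambda r: r[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest BLEU: epoch {best_by_bleu[0]} ({best_by_bleu[2]})")
```

Validation loss bottoms out at epoch 20 and rises afterwards while training loss keeps falling, whereas BLEU is still creeping up at epoch 24. Which checkpoint to prefer depends on which metric matters for the intended use.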

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 0.2B params
Tensor type: F32
