f9555671916d6a1c49b20ee565042596

This model is a fine-tuned version of facebook/mbart-large-50 on the Helsinki-NLP/opus_books [en-es] dataset. At the last logged epoch (epoch 13, step 30368), it achieves the following results on the evaluation set:

Loss: 2.2366
Data Size: 1.0
Epoch Runtime: 593.6930
Bleu: 12.1406
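
The card gives no usage snippet, so the following is a minimal loading sketch rather than a documented recipe. It assumes that the checkpoint is published under the repository id contemmcm/f9555671916d6a1c49b20ee565042596, that the repository ships its tokenizer, that the fine-tuning direction is English to Spanish, and that the usual mBART-50 language codes (en_XX, es_XX) apply. The function name translate_en_to_es and its parameters are illustrative.

```python
# Assumed repository id; adjust if the checkpoint lives elsewhere.
MODEL_ID = "contemmcm/f9555671916d6a1c49b20ee565042596"


def translate_en_to_es(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Translate one English string to Spanish (direction is an assumption)."""
    # Imported lazily so that defining the function does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    tokenizer.src_lang = "en_XX"  # mBART-50 code for English
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # mBART-50 selects the target language through the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("es_XX"),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling translate_en_to_es("The book is on the table.") downloads the weights on first use. Given the BLEU score above, output quality should be checked before relying on it.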

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
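
As a rough sketch, the settings listed above can be written as keyword arguments for the Transformers Seq2SeqTrainingArguments class. This mapping is reconstructed from the list and is not the author's training script. NUM_DEVICES, training_kwargs, and the derived totals are illustrative names, and the absence of gradient accumulation is an assumption.

```python
# Hypothetical reconstruction of the listed hyperparameters; the original
# training script is not part of this card.
NUM_DEVICES = 4  # num_devices (distributed_type: multi-GPU)

training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,  # train_batch_size
    "per_device_eval_batch_size": 8,   # eval_batch_size
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

# Effective batch size = per-device batch size * number of devices
# (assuming no gradient accumulation, which the card does not list).
total_train_batch_size = training_kwargs["per_device_train_batch_size"] * NUM_DEVICES
total_eval_batch_size = training_kwargs["per_device_eval_batch_size"] * NUM_DEVICES

print(total_train_batch_size, total_eval_batch_size)  # prints: 32 32
```

With Transformers installed, the dictionary could be passed as Seq2SeqTrainingArguments(output_dir="out", **training_kwargs), where "out" is a placeholder path.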

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	7.0719	0	47.9245	0.6300
No log	1	2336	4.2021	0.0078	53.1628	4.3514
0.0625	2	4672	2.9122	0.0156	58.4693	7.3057
0.0654	3	7008	2.3167	0.0312	68.1028	9.2430
2.2312	4	9344	2.1922	0.0625	84.7480	11.9665
2.1281	5	11680	2.0670	0.125	118.1217	11.3258
1.967	6	14016	1.9889	0.25	184.4479	12.5782
1.8406	7	16352	1.8683	0.5	321.2975	11.4127
1.8107	8.0	18688	1.8419	1.0	592.0372	12.5327
1.4679	9.0	21024	1.8041	1.0	590.9694	15.5184
1.2715	10.0	23360	1.8486	1.0	589.8631	18.2871
1.0471	11.0	25696	1.9585	1.0	593.7643	15.5111
0.8496	12.0	28032	2.0721	1.0	593.0644	14.0928
0.6951	13.0	30368	2.2366	1.0	593.6930	12.1406
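
The table shows validation loss and BLEU getting worse after epochs 9 and 10 while training loss keeps falling, which is consistent with overfitting. The log also ends at epoch 13 although num_epochs is 50, and the card does not say why. The sketch below copies the epoch, validation loss, and BLEU columns from the table and picks out the best rows. The variable names are illustrative, and the card does not state which checkpoint was actually saved.

```python
# Rows copied from the training results table: (epoch, validation loss, BLEU).
rows = [
    (0, 7.0719, 0.6300),
    (1, 4.2021, 4.3514),
    (2, 2.9122, 7.3057),
    (3, 2.3167, 9.2430),
    (4, 2.1922, 11.9665),
    (5, 2.0670, 11.3258),
    (6, 1.9889, 12.5782),
    (7, 1.8683, 11.4127),
    (8, 1.8419, 12.5327),
    (9, 1.8041, 15.5184),
    (10, 1.8486, 18.2871),
    (11, 1.9585, 15.5111),
    (12, 2.0721, 14.0928),
    (13, 2.2366, 12.1406),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_bleu = max(rows, key=lambda r: r[2])  # highest BLEU
final = rows[-1]                              # the row reported at the top of the card

print("lowest validation loss:", best_by_loss)
print("highest BLEU:", best_by_bleu)
print("final epoch:", final)
```

On these numbers the lowest validation loss is at epoch 9 (1.8041) and the highest BLEU at epoch 10 (18.2871), whereas the headline results come from epoch 13.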

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1
