fd09585f19f5c220b2234324749e114e

This model is a fine-tuned version of facebook/mbart-large-50-many-to-one-mmt on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:

Loss: 1.8174
Data Size: 1.0
Epoch Runtime: 352.1614
Bleu: 29.5074
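
The card gives no usage snippet, so the following is a minimal sketch of loading the checkpoint with the Transformers Auto classes. It assumes the repository id `contemmcm/fd09585f19f5c220b2234324749e114e` and that the model translates Spanish into French. The card names only the es-fr language pair, so the direction and the `es_XX` / `fr_XX` language codes are assumptions, and the `translate` helper is a hypothetical name used for illustration.

```python
MODEL_ID = "contemmcm/fd09585f19f5c220b2234324749e114e"
SRC_LANG = "es_XX"  # assumption: Spanish is the source side
TGT_LANG = "fr_XX"  # assumption: French is the target side


def translate(sentences, max_new_tokens=128):
    """Translate a list of sentences with the fine-tuned checkpoint (sketch)."""
    # Imported lazily so the heavy dependencies load only when translating.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    model.eval()

    # mBART-50 tokenizers prepend the source language code to the input.
    tokenizer.src_lang = SRC_LANG
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)

    with torch.no_grad():
        generated = model.generate(
            **batch,
            # Assumption: decoding starts from the French language code.
            forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["El libro está sobre la mesa."])` downloads the weights on first use. If the output comes back in the wrong language, the direction assumption above is the first thing to revisit.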

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
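
The total batch sizes listed above follow from the per-device sizes and the device count. The short sketch below reproduces that arithmetic, assuming no gradient accumulation (none is listed). The `steps_per_epoch` value is taken from the results table, where the step counter advances by 1407 per epoch, and the variable names are illustrative.

```python
train_batch_size = 8    # per device, from the hyperparameter list
eval_batch_size = 8     # per device
num_devices = 4
steps_per_epoch = 1407  # step counter increment per epoch in the results table

# Assumption: one optimizer step consumes one batch on every device.
total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

# Upper bound on examples consumed in 1407 steps if every batch is full.
max_examples_per_epoch = steps_per_epoch * total_train_batch_size

print(total_train_batch_size)   # 32
print(total_eval_batch_size)    # 32
print(max_examples_per_epoch)   # 45024
```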

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	5.0951	0	29.1309	1.7148
No log	1	1407	3.3811	0.0078	32.7518	3.3974
No log	2	2814	2.8146	0.0156	34.5016	5.1237
0.0745	3	4221	2.4524	0.0312	41.2401	6.9669
2.3124	4	5628	2.1677	0.0625	51.9868	9.4383
2.0109	5	7035	1.9225	0.125	72.9908	9.8335
1.7127	6	8442	1.7219	0.25	112.5741	12.4547
1.5263	7	9849	1.5440	0.5	190.7074	12.6929
1.2852	8.0	11256	1.4322	1.0	350.8105	19.9970
1.0598	9.0	12663	1.4018	1.0	350.2182	16.3399
0.9016	10.0	14070	1.4703	1.0	350.7908	16.9793
0.6981	11.0	15477	1.5481	1.0	352.8656	17.6315
0.5415	12.0	16884	1.6658	1.0	352.6389	15.9129
0.4499	13.0	18291	1.8174	1.0	352.1614	29.5074
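
In the table, validation loss bottoms out at epoch 9 and then rises while training loss keeps falling, yet the highest BLEU is logged at the final epoch. The sketch below picks the best row under each criterion. The values are copied from the table, and the tuple layout is only an illustrative choice.

```python
# (epoch, validation_loss, bleu), copied from the training results table
ROWS = [
    (0, 5.0951, 1.7148),
    (1, 3.3811, 3.3974),
    (2, 2.8146, 5.1237),
    (3, 2.4524, 6.9669),
    (4, 2.1677, 9.4383),
    (5, 1.9225, 9.8335),
    (6, 1.7219, 12.4547),
    (7, 1.5440, 12.6929),
    (8, 1.4322, 19.9970),
    (9, 1.4018, 16.3399),
    (10, 1.4703, 16.9793),
    (11, 1.5481, 17.6315),
    (12, 1.6658, 15.9129),
    (13, 1.8174, 29.5074),
]

best_by_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss
best_by_bleu = max(ROWS, key=lambda row: row[2])  # highest BLEU

print(best_by_loss)  # (9, 1.4018, 16.3399)
print(best_by_bleu)  # (13, 1.8174, 29.5074)
```

The evaluation results reported at the top of this card match the epoch 13 row, which is the last logged epoch and not the row with the lowest validation loss.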

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 0.2B params
Tensor type: F32
