7329e3dbef659f25cbfec233dfb6b918

This model is a fine-tuned version of facebook/mbart-large-cc25 on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8008
  • Data Size: 1.0 (fraction of the training set used in the final epoch)
  • Epoch Runtime: 360.0285 s
  • BLEU: 13.4356

Model description

More information needed

Intended uses & limitations

More information needed
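
Although the card leaves this section blank, the model is by construction a Spanish-to-French translation fine-tune, so inference follows the standard mBART pattern. The snippet below is a minimal sketch, not an official usage example from the authors: it assumes the checkpoint id shown in this card and the stock mBART-cc25 language codes es_XX and fr_XX.

```python
# Minimal inference sketch (assumptions: hub id as in this card, mBART-cc25
# language codes es_XX / fr_XX; not an official example from the authors).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/7329e3dbef659f25cbfec233dfb6b918"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="es_XX", tgt_lang="fr_XX")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("La vida es sueño.", return_tensors="pt")
generated = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.convert_tokens_to_ids("fr_XX"),  # force French decoding
    max_length=64,
    num_beams=4,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```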

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a code sketch reconstructing them follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
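
For reference, the list above maps onto Transformers' Seq2SeqTrainingArguments roughly as follows. This is a hedged reconstruction, not the authors' training script: the output_dir name is invented, and dataset loading and preprocessing are omitted.

```python
# Hedged reconstruction of the hyperparameters above (output_dir is hypothetical;
# this is not the authors' actual training script).
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-cc25-opus-books-es-fr",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # required for BLEU at evaluation time
)
```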

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime (s) | BLEU    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-----------------:|:-------:|
| No log        | 0     | 0     | 11.3739         | 0         | 30.6129           | 0.1500  |
| No log        | 1     | 1407  | 2.8983          | 0.0078    | 33.2640           | 9.3701  |
| No log        | 2     | 2814  | 2.4695          | 0.0156    | 36.1908           | 8.4509  |
| 0.0716        | 3     | 4221  | 2.1366          | 0.0312    | 43.0911           | 7.8403  |
| 2.0567        | 4     | 5628  | 1.8334          | 0.0625    | 54.5033           | 9.7853  |
| 1.8467        | 5     | 7035  | 1.7156          | 0.125     | 76.0168           | 10.4065 |
| 1.6344        | 6     | 8442  | 1.5890          | 0.25      | 115.8952          | 13.1721 |
| 9.206         | 7     | 9849  | 6.1961          | 0.5       | 198.0810          | 0.0665  |
| 1.3719        | 8     | 11256 | 1.4536          | 1.0       | 360.8187          | 14.3740 |
| 1.1472        | 9     | 12663 | 1.4082          | 1.0       | 360.9548          | 13.6077 |
| 1.0182        | 10    | 14070 | 1.4904          | 1.0       | 359.3424          | 13.8715 |
| 1.0956        | 11    | 15477 | 1.6043          | 1.0       | 358.9803          | 13.1410 |
| 0.6615        | 12    | 16884 | 1.6406          | 1.0       | 359.4590          | 13.5476 |
| 0.5592        | 13    | 18291 | 1.8008          | 1.0       | 360.0285          | 13.4356 |
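
Two things stand out in the table: the run destabilized at epoch 7 (validation loss 6.1961, BLEU 0.0665) before recovering, and the final epoch (13) is not the best checkpoint by either metric, since validation loss bottomed out at epoch 9 (1.4082) and BLEU peaked at epoch 8 (14.3740). The card does not state which BLEU implementation produced these scores; cards generated with the Trainer typically compute SacreBLEU via the evaluate library, sketched below under that assumption.

```python
# Sketch of the typical BLEU computation for such cards: SacreBLEU via the
# `evaluate` library (an assumption; the card does not name the implementation).
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Bonjour tout le monde."]       # illustrative model outputs
references = [["Bonjour à tout le monde."]]    # one reference list per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))               # BLEU on a 0-100 scale, as in the table
```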

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1