mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of google/mt5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0328
  • ROUGE-1: 16.4389
  • ROUGE-2: 8.0419
  • ROUGE-L: 16.2131
  • ROUGE-Lsum: 16.0484
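
The ROUGE scores above measure n-gram overlap between generated and reference summaries (the card's numbers were presumably produced with the standard `rouge_score` library, which also applies stemming). As a rough illustration of what ROUGE-1 F1 computes, a minimal sketch over whitespace tokens:

```python
# Minimal ROUGE-1 F1 sketch: unigram overlap between a generated
# summary and a reference, using plain whitespace tokenization.
# (The official rouge_score package additionally stems tokens.)
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Clipped unigram overlap: each token counts at most as often
    # as it appears in the other side.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Scores are reported here scaled to 0–100, so e.g. a ROUGE-1 of 16.44 corresponds to an F1 of about 0.164.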

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 8
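
With a linear scheduler and no warmup (assumed here; the card does not list warmup steps), the learning rate decays from 5.6e-05 at step 0 to zero at the final step (9672, per the results table below). A minimal sketch of that schedule:

```python
# Sketch of a linear learning-rate decay schedule, assuming zero
# warmup steps. base_lr and total_steps are taken from this card
# (learning_rate=5.6e-05; 8 epochs x 1209 steps/epoch = 9672 steps).
def linear_lr(step: int, base_lr: float = 5.6e-5, total_steps: int = 9672) -> float:
    """Learning rate at a given optimizer step under linear decay."""
    return base_lr * max(0.0, 1.0 - step / total_steps)
```

For example, halfway through training (step 4836) the learning rate is half the initial value, 2.8e-05.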

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|
| 7.1842        | 1.0   | 1209 | 3.3246          | 15.0089 | 6.5237  | 14.4095 | 14.2924    |
| 3.9146        | 2.0   | 2418 | 3.1735          | 16.7976 | 8.3402  | 16.3224 | 16.0651    |
| 3.6022        | 3.0   | 3627 | 3.1338          | 17.795  | 9.0478  | 17.2897 | 17.1568    |
| 3.4148        | 4.0   | 4836 | 3.0935          | 18.0448 | 9.1484  | 17.525  | 17.3333    |
| 3.3189        | 5.0   | 6045 | 3.0602          | 16.6951 | 7.8279  | 16.1867 | 15.9983    |
| 3.2451        | 6.0   | 7254 | 3.0384          | 16.1576 | 7.7209  | 15.9061 | 15.6342    |
| 3.2012        | 7.0   | 8463 | 3.0337          | 16.3761 | 8.0505  | 16.1882 | 16.0124    |
| 3.1719        | 8.0   | 9672 | 3.0328          | 16.4389 | 8.0419  | 16.2131 | 16.0484    |
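
The Step column advances by 1209 optimizer steps per epoch; with train_batch_size=8 and no gradient accumulation (assumed), that implies roughly 9.7k training examples per epoch. A quick consistency check on the table:

```python
# Each epoch adds 1209 optimizer steps, so the cumulative Step column
# in the table above should equal epoch * 1209 for epochs 1..8.
steps_per_epoch = 1209
cumulative_steps = [epoch * steps_per_epoch for epoch in range(1, 9)]
# cumulative_steps matches the table's Step column, ending at 9672.
```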

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu128
  • Datasets 3.5.0
  • Tokenizers 0.22.1
