mt5-small-finetuned-amazon-en-es

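This model is a fine-tuned version of google/mt5-small. The training dataset is not recorded in this card (the dataset field is None). It achieves the following results on the evaluation set, taken from the final epoch of the training results below:

Loss: 3.1507
Rouge1: 16.9344
Rouge2: 7.9189
Rougel: 16.1455
Rougelsum: 16.2455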

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5.6e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 8
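
To make the linear scheduler concrete, the following is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists none) and takes the total of 9680 optimizer steps from the last row of the results table; the function name `linear_lr` and the constants are illustrative, not part of the training script.

```python
BASE_LR = 5.6e-05     # learning_rate from the card
TOTAL_STEPS = 9680    # last step in the results table (8 epochs x 1210 steps)
WARMUP_STEPS = 0      # assumption: the card does not list any warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at the start of every second epoch.
for epoch in range(0, 9, 2):
    step = epoch * 1210
    print(f"epoch {epoch}: step {step:5d}, lr {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 5.6e-05, reaches half that value at step 4840 (end of epoch 4), and hits zero at step 9680.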

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum
2.952	1.0	1210	3.5160	16.149	7.6589	15.7063	15.8048
3.5927	2.0	2420	3.2454	15.7542	6.9805	15.0122	15.1377
3.387	3.0	3630	3.1996	15.5955	7.0586	14.9107	14.9706
3.2716	4.0	4840	3.2031	15.9245	7.0341	15.1523	15.2033
3.1942	5.0	6050	3.1697	15.7103	7.3608	15.1575	15.1283
3.1355	6.0	7260	3.1601	16.1965	7.2843	15.452	15.4632
3.109	7.0	8470	3.1511	16.3021	7.3938	15.5369	15.5867
3.0806	8.0	9680	3.1507	16.9344	7.9189	16.1455	16.2455
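
As a quick way to read the table, the sketch below picks the epoch with the lowest validation loss and the epoch with the highest Rouge1. The values are copied from the rows above; the variable names are illustrative and the snippet is not part of the original training code.

```python
# (epoch, validation_loss, rouge1) copied from the results table
rows = [
    (1, 3.5160, 16.1490),
    (2, 3.2454, 15.7542),
    (3, 3.1996, 15.5955),
    (4, 3.2031, 15.9245),
    (5, 3.1697, 15.7103),
    (6, 3.1601, 16.1965),
    (7, 3.1511, 16.3021),
    (8, 3.1507, 16.9344),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest Rouge1: epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
```

Both criteria select the final epoch here, although Rouge1 dips between epochs 2 and 5 before recovering, so the two metrics do not move in lockstep during training.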

Model size: 0.3B params (Safetensors)
Tensor type: F32
