mt5-small-finetuned-wikipedia-en-it

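This model is a fine-tuned version of google/mt5-small on an unspecified dataset (the dataset field of the card was left empty). According to the training log, the final epoch reached a validation loss of 2.6781 on the evaluation set, with Rouge1 18.9584, Rouge2 5.5332, Rougel 16.7437, and Rougelsum 18.2495.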

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5.6e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 8
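
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at each epoch boundary. It assumes zero warmup steps (the card lists none) and 535 optimizer steps per epoch, as reported in the results table. The function name `linear_lr` and its parameters are illustrative and not part of any library.

```python
def linear_lr(step: int,
              base_lr: float = 5.6e-05,
              total_steps: int = 4280,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr (unused when warmup_steps == 0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = 535   # from the results table
num_epochs = 8          # from the hyperparameter list
total_steps = steps_per_epoch * num_epochs

for epoch in range(num_epochs + 1):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch}: step {step:4d}, lr = {linear_lr(step, total_steps=total_steps):.3e}")
```

Under these assumptions the rate starts at 5.6e-05, is halved by the end of epoch 4, and reaches zero at the final step. If the run used warmup, the early values would differ.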

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum
3.7491	1.0	535	2.8274	18.6213	5.2054	16.2303	17.9296
3.4629	2.0	1070	2.7583	18.0138	4.9942	16.0086	17.2376
3.3094	3.0	1605	2.7396	18.3755	5.2117	16.3702	17.6459
3.2115	4.0	2140	2.7131	18.2747	5.3646	16.2934	17.6114
3.1512	5.0	2675	2.7000	18.9946	5.5465	16.827	18.2593
3.0929	6.0	3210	2.6893	18.631	5.3464	16.5725	17.9697
3.0554	7.0	3745	2.6829	19.0745	5.7031	16.8741	18.3337
3.04	8.0	4280	2.6781	18.9584	5.5332	16.7437	18.2495
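
The table can be read in two ways: validation loss falls at every epoch, while the ROUGE scores fluctuate. A small sketch of how one might pick a checkpoint from these numbers follows. The `EpochResult` type and the variable names are illustrative, and the values are copied from the table above.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    val_loss: float
    rouge1: float
    rouge2: float
    rougel: float
    rougelsum: float


# Values copied from the training results table.
RESULTS = [
    EpochResult(1, 2.8274, 18.6213, 5.2054, 16.2303, 17.9296),
    EpochResult(2, 2.7583, 18.0138, 4.9942, 16.0086, 17.2376),
    EpochResult(3, 2.7396, 18.3755, 5.2117, 16.3702, 17.6459),
    EpochResult(4, 2.7131, 18.2747, 5.3646, 16.2934, 17.6114),
    EpochResult(5, 2.7000, 18.9946, 5.5465, 16.8270, 18.2593),
    EpochResult(6, 2.6893, 18.6310, 5.3464, 16.5725, 17.9697),
    EpochResult(7, 2.6829, 19.0745, 5.7031, 16.8741, 18.3337),
    EpochResult(8, 2.6781, 18.9584, 5.5332, 16.7437, 18.2495),
]

# Checkpoint with the lowest validation loss.
best_by_loss = min(RESULTS, key=lambda r: r.val_loss)
# Checkpoint with the highest ROUGE-1 score.
best_by_rouge1 = max(RESULTS, key=lambda r: r.rouge1)

print(f"lowest validation loss: epoch {best_by_loss.epoch} ({best_by_loss.val_loss})")
print(f"highest Rouge1:         epoch {best_by_rouge1.epoch} ({best_by_rouge1.rouge1})")
```

The two criteria disagree here: epoch 8 has the lowest validation loss, while epoch 7 scores highest on all four ROUGE metrics. The gaps are small, so either choice is defensible.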

Model size: 0.3B params (Safetensors)
Tensor type: F32
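
As a back-of-the-envelope check, the parameter count and tensor type give an approximate weight size. The sketch below assumes the rounded figure of 0.3B parameters and 4 bytes per F32 value. It ignores file metadata and any runtime overhead, so treat the result as an estimate only.

```python
params = 0.3e9        # rounded parameter count from the card
bytes_per_param = 4   # F32 stores each value in 4 bytes

weights_gb = params * bytes_per_param / 1e9
print(f"approximate weight size: {weights_gb:.1f} GB")
```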