mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of google/mt5-small; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 2.7874
  • Rouge1: 18.5523
  • Rouge2: 10.0675
  • RougeL: 17.5229
  • RougeLsum: 17.5494
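The ROUGE scores above were presumably computed with the standard rouge_score/evaluate tooling. As a rough illustration of what the Rouge1 number measures (unigram overlap between a generated summary and a reference), here is a minimal pure-Python F1 sketch; the function name and whitespace tokenization are simplifications, not the exact scoring pipeline used for this card:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap ROUGE-1 F1 between two whitespace-tokenized strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each unigram counts at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Note that real ROUGE implementations also apply stemming and report RougeL/RougeLsum via longest-common-subsequence matching, which this sketch omits.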

Model description

More information needed

Intended uses & limitations

More information needed
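Since the card does not state an intended use, a minimal sketch of how one might try the checkpoint is shown below; the summarization task and the example review text are assumptions inferred from the model name, not documented behavior:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub (downloads weights on first run).
summarizer = pipeline(
    "summarization",
    model="Ara113/mt5-small-finetuned-amazon-en-es",
)

# Hypothetical input: a short product review to be condensed.
review = "I loved this book; the characters were engaging and the plot kept me hooked."
print(summarizer(review, max_length=30)[0]["summary_text"])
```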

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 8
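For reference, these hyperparameters map onto Hugging Face Seq2SeqTrainingArguments roughly as follows. This is a hedged sketch rather than the actual training script; output_dir and predict_with_generate are illustrative assumptions not stated in the card:

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters copied from the list above; output_dir is a placeholder.
args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-amazon-en-es",
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch_fused",      # fused AdamW, betas/epsilon at their defaults
    lr_scheduler_type="linear",
    num_train_epochs=8,
    predict_with_generate=True,     # assumed, so ROUGE can score generated summaries
)
```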

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum |
|--------------:|------:|-----:|----------------:|--------:|--------:|--------:|----------:|
| 4.1365        | 1.0   | 611  | 3.0261          | 17.3417 | 8.0201  | 17.0052 | 17.1328   |
| 3.8953        | 2.0   | 1222 | 2.8934          | 17.6971 | 8.8023  | 16.726  | 16.9165   |
| 3.6211        | 3.0   | 1833 | 2.8633          | 19.8278 | 10.349  | 18.782  | 18.9807   |
| 3.4721        | 4.0   | 2444 | 2.8363          | 17.9441 | 9.9183  | 16.9501 | 16.9501   |
| 3.3561        | 5.0   | 3055 | 2.8246          | 17.5136 | 9.1469  | 16.4998 | 16.5171   |
| 3.2596        | 6.0   | 3666 | 2.8041          | 18.4253 | 10.0062 | 17.4766 | 17.5163   |
| 3.2331        | 7.0   | 4277 | 2.7885          | 18.2069 | 10.3961 | 17.2279 | 17.2918   |
| 3.1979        | 8.0   | 4888 | 2.7874          | 18.5523 | 10.0675 | 17.5229 | 17.5494   |

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.22.1