# mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.7874
- Rouge1: 18.5523
- Rouge2: 10.0675
- Rougel: 17.5229
- Rougelsum: 17.5494
## Model description
More information needed
## Intended uses & limitations
More information needed
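No usage notes were provided by the author. Going by the model name, the checkpoint is presumably intended for abstractive summarization of product reviews. A minimal, hypothetical sketch of loading it through the `transformers` pipeline API (the repo id is taken from this card; the sample review text is invented, and the first run downloads the weights):

```python
from transformers import pipeline

# Hypothetical usage sketch: load the checkpoint named in this card
# and summarize a short product review.
summarizer = pipeline(
    "summarization",
    model="Ara113/mt5-small-finetuned-amazon-en-es",
)

review = (
    "I bought this kettle a month ago. It boils water quickly and the "
    "handle stays cool, but the lid is a little stiff to open."
)
print(summarizer(review, max_length=30, min_length=5)[0]["summary_text"])
```

As with any small mT5 fine-tune, outputs on domains far from the training data should be treated with caution.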
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 8
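For reference, the hyperparameters above map onto `transformers`' `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the author's actual training script; `output_dir`, the per-epoch evaluation schedule, and `predict_with_generate` are assumptions inferred from the results table.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters listed above; `output_dir` is an assumption.
args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-amazon-en-es",
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch_fused",
    lr_scheduler_type="linear",
    num_train_epochs=8,
    eval_strategy="epoch",        # assumed: one evaluation per epoch
    predict_with_generate=True,   # assumed: generate summaries to score ROUGE
)
```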
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 4.1365 | 1.0 | 611 | 3.0261 | 17.3417 | 8.0201 | 17.0052 | 17.1328 |
| 3.8953 | 2.0 | 1222 | 2.8934 | 17.6971 | 8.8023 | 16.726 | 16.9165 |
| 3.6211 | 3.0 | 1833 | 2.8633 | 19.8278 | 10.349 | 18.782 | 18.9807 |
| 3.4721 | 4.0 | 2444 | 2.8363 | 17.9441 | 9.9183 | 16.9501 | 16.9501 |
| 3.3561 | 5.0 | 3055 | 2.8246 | 17.5136 | 9.1469 | 16.4998 | 16.5171 |
| 3.2596 | 6.0 | 3666 | 2.8041 | 18.4253 | 10.0062 | 17.4766 | 17.5163 |
| 3.2331 | 7.0 | 4277 | 2.7885 | 18.2069 | 10.3961 | 17.2279 | 17.2918 |
| 3.1979 | 8.0 | 4888 | 2.7874 | 18.5523 | 10.0675 | 17.5229 | 17.5494 |
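The ROUGE columns measure n-gram overlap between generated and reference summaries: Rouge1 counts unigram overlap, Rouge2 counts bigram overlap, and Rougel/Rougelsum are based on the longest common subsequence. As a rough illustration of the idea (not the scoring code used here, which applies stemming and other normalization), a minimal ROUGE-1 F1 can be computed like this:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap ROUGE-1 F1 between two whitespace-tokenized strings."""
    pred = Counter(prediction.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((pred & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 3))  # → 0.833
```

The scores in the table are scaled by 100, so a Rouge1 of 18.55 corresponds to an F1 of roughly 0.186.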
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu126
- Datasets 3.6.0
- Tokenizers 0.22.1