mt5-small-finetuned-easy

This model is a fine-tuned version of google/mt5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 134.6687

Model description

More information needed

Intended uses & limitations

More information needed
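
The card does not document the task or prompt format, but since this is a PEFT adapter on top of google/mt5-small (see the model tree below), a minimal loading sketch would look like the following. This is an assumption based on the listed PEFT framework version and repository name, not an official usage snippet; the input text and generation settings are placeholders.

```python
# Hedged sketch: load the PEFT adapter on top of the google/mt5-small base model.
# Versions follow the "Framework versions" list below; the CUDA build suffix is dropped.
# pip install "transformers==5.2.0" "peft==0.18.1" "torch==2.9.0"
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
model = PeftModel.from_pretrained(base, "kallacharanteja/mt5-small-finetuned-easy")
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")

# Placeholder input: the actual task and prompt format are not documented in this card.
inputs = tokenizer("example input text", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```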

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a code sketch of this configuration follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adafactor (OptimizerNames.ADAFACTOR, no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
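
The list above maps onto a transformers training configuration roughly as sketched below. This is a reconstruction under stated assumptions: the use of Seq2SeqTrainingArguments, the 500-step evaluation cadence (inferred from the results table), and the output_dir are not documented in the card.

```python
# Hedged reconstruction of the training configuration from the hyperparameter list above.
# Seq2SeqTrainingArguments, the 500-step eval cadence, and output_dir are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-easy",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,          # effective train batch size: 16 * 4 = 64
    optim="adafactor",                      # OptimizerNames.ADAFACTOR, no extra optimizer args
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    eval_strategy="steps",                  # inferred from the 500-step evaluations below
    eval_steps=500,
)
```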

Training results

Training Loss   Epoch    Step   Validation Loss
7713.7494       0.4080    500        1422.8065
3212.7847       0.8160   1000         534.9083
1562.9995       1.2236   1500         245.6855
1141.0379       1.6316   2000         171.7863
 961.2817       2.0392   2500         148.8024
 877.5551       2.4472   3000         137.8753
 851.0401       2.8552   3500         134.6687

Framework versions

  • PEFT 0.18.1
  • Transformers 5.2.0
  • Pytorch 2.9.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model tree for kallacharanteja/mt5-small-finetuned-easy

  • Base model: google/mt5-small (this model is a PEFT adapter on top of it)