mt5-small-finetuned-medium

This model is a PEFT adapter fine-tuned from google/mt5-small on an unspecified dataset (a hedged loading sketch follows the results). It achieves the following results on the evaluation set:

  • Loss: 3.9152
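
Since the framework versions below list PEFT, the repository most likely hosts an adapter rather than full model weights. The following is a minimal loading sketch under that assumption; the task and prompt format are undocumented, so the "summarize:" input is purely illustrative.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_id = "google/mt5-small"
adapter_id = "kallacharanteja/mt5-small-finetuned-medium"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the fine-tuned adapter

# Illustrative input only: the training task is not documented in this card.
inputs = tokenizer("summarize: <your text here>", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```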

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adafactor (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.03 (fractional, so likely a warmup ratio rather than a step count)
  • num_epochs: 3
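
A hedged sketch of Seq2SeqTrainingArguments matching the values above. The argument names are assumptions, not the author's code; in particular, the fractional warmup value (0.03) maps more naturally to warmup_ratio than to a step count, and the 500-step evaluation cadence is inferred from the results table below.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-medium",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 16 * 4 = 64
    optim="adafactor",
    lr_scheduler_type="linear",
    warmup_ratio=0.03,              # assumption: 0.03 read as a ratio, not a step count
    num_train_epochs=3,
    eval_strategy="steps",          # inferred from the 500-step eval cadence below
    eval_steps=500,
)
```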

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 23.6513       | 0.4578 | 500  | 4.2624          |
| 22.3063       | 0.9155 | 1000 | 4.1012          |
| 21.6209       | 1.3726 | 1500 | 4.0121          |
| 21.1642       | 1.8304 | 2000 | 3.9515          |
| 21.0332       | 2.2875 | 2500 | 3.9277          |
| 20.8154       | 2.7453 | 3000 | 3.9152          |

Framework versions

  • PEFT 0.18.1
  • Transformers 5.2.0
  • Pytorch 2.9.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.2