mt5_base-silver-standard-lora

This model is a LoRA adapter for google/mt5-base, fine-tuned with PEFT on a dataset that this card does not identify. After the final epoch it achieves the following results on the evaluation set:

  • Loss: 2.0014
  • Rouge1: 37.7941
  • Rouge2: 22.0953
  • Rougel: 36.8509
  • Rougelsum: 36.9714
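
The ROUGE values above appear to be on a 0-100 scale. As a rough illustration of what ROUGE-1 and ROUGE-2 measure, the following is a minimal sketch of n-gram overlap F1 in plain Python. It assumes lowercased whitespace tokenization and no stemming, so it is not expected to reproduce the numbers above; the card does not say which ROUGE implementation was used, and the names `ngrams` and `rouge_n_f1` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction, reference, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale (whitespace tokens, no stemming)."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    prediction = "the cat lay on the mat"
    print("ROUGE-1 F1:", round(rouge_n_f1(prediction, reference, n=1), 2))
    print("ROUGE-2 F1:", round(rouge_n_f1(prediction, reference, n=2), 2))
```

ROUGE-L and ROUGE-Lsum are based on the longest common subsequence rather than fixed-size n-grams and are not covered by this sketch.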

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
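
To make the relationship between these values concrete, here is a minimal sketch that restates them as keyword arguments and derives the effective batch size and the linear learning-rate decay. The key names follow Hugging Face `TrainingArguments` fields. Several things are assumptions rather than facts from this card: `fp16=True` ("Native AMP" does not say whether fp16 or bf16 was used), a single training device, and zero warmup steps. The total of 447 optimizer steps is taken from the training results. The helper names are hypothetical.

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments names.
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
    "fp16": True,  # assumption: the card only says "Native AMP"
}

NUM_DEVICES = 1  # assumption: 8 * 2 * 1 matches total_train_batch_size = 16
TOTAL_STEPS = 447  # final step reported in the training results


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(training_kwargs))
    for step in (0, 149, 298, 447):
        lr = linear_lr(step, TOTAL_STEPS, training_kwargs["learning_rate"])
        print(f"step {step}: lr = {lr:.6f}")
```

Under these assumptions the learning rate starts at 0.0003 and reaches zero at the last step, so each epoch trains at a lower average rate than the one before.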

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum
35.0066        1.0    149   3.0369           15.5109  5.2653   15.0068  15.042
12.4424        2.0    298   2.0791           33.6692  19.2782  32.7224  32.7674
11.1139        3.0    447   2.0014           37.7941  22.0953  36.8509  36.9714
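
The per-epoch trend in these results can be checked with a short sketch. It copies the rows above into tuples, confirms that the step counts are consistent with 149 optimizer steps per epoch, and computes epoch-over-epoch changes. The helper name `deltas` and the column indices are illustrative only.

```python
# Rows copied from the training results:
# (epoch, step, train_loss, val_loss, rouge1, rouge2, rougeL, rougeLsum)
rows = [
    (1.0, 149, 35.0066, 3.0369, 15.5109, 5.2653, 15.0068, 15.0420),
    (2.0, 298, 12.4424, 2.0791, 33.6692, 19.2782, 32.7224, 32.7674),
    (3.0, 447, 11.1139, 2.0014, 37.7941, 22.0953, 36.8509, 36.9714),
]

VAL_LOSS, ROUGE1 = 3, 4  # column indices into each row

# Each epoch should end at a multiple of the first epoch's step count.
steps_per_epoch = rows[0][1]
steps_consistent = all(
    step == steps_per_epoch * int(epoch) for epoch, step, *_ in rows
)


def deltas(table, col):
    """Change in one column between consecutive epochs."""
    return [round(b[col] - a[col], 4) for a, b in zip(table, table[1:])]


rouge1_gains = deltas(rows, ROUGE1)
val_loss_changes = deltas(rows, VAL_LOSS)

if __name__ == "__main__":
    print("steps consistent:", steps_consistent)
    print("ROUGE-1 gain per epoch:", rouge1_gains)
    print("validation loss change per epoch:", val_loss_changes)
```

ROUGE-1 rises by about 18.2 points from epoch 1 to epoch 2 and by about 4.1 points from epoch 2 to epoch 3, while validation loss is still decreasing at the final epoch.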

Framework versions

  • PEFT 0.18.1
  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.8.3
  • Tokenizers 0.22.2
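
Usage

The card gives no usage instructions, so the following is only a sketch of how a PEFT adapter like this one is typically loaded on top of its base model. The repository ID is the one shown on the model page. The input text, `max_length=512`, and `max_new_tokens=64` are assumptions, as are the helper names `load_model` and `generate`. The card does not state the task or any required input prefix.

```python
BASE_MODEL_ID = "google/mt5-base"
ADAPTER_ID = "Firmansyah-Ibrahim/mt5_base-silver-standard-lora"


def load_model(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the adapter weights."""
    # Imported lazily so the heavy dependencies are only needed when called.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate(text, tokenizer, model, max_new_tokens=64):
    """Run the adapted model on one input string and decode the output."""
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,  # assumption: the card does not give an input length
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `tokenizer, model = load_model()` and then `generate("your input text", tokenizer, model)` downloads both the base model and the adapter on first use.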