flan-t5-small-gec

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3409
  • Sacrebleu: 80.4351

Model description

flan-t5-small-gec is a ~77M-parameter sequence-to-sequence model (F32 safetensors) fine-tuned from google/flan-t5-small. The model name and the SacreBLEU evaluation metric suggest it targets grammatical error correction (GEC), though the card does not document this explicitly.

Intended uses & limitations

More information needed
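
Although this section is blank, the model name points to grammatical error correction, so a minimal inference sketch is given below. The repo id comes from this card; the instruction prefix is an assumption, since the training prompt format is undocumented.

```python
# Minimal inference sketch. The repo id comes from this card; the instruction
# prefix is an assumption, since the training prompt format is undocumented.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "lmaccarini/flan-t5-small-gec"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "She no went to the market yesterday."
inputs = tokenizer("Fix the grammar: " + text, return_tensors="pt")  # assumed prefix
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```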

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
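
For reference, here is a hedged sketch of how these settings map onto Seq2SeqTrainingArguments. Dataset loading, tokenization, and metric wiring are omitted because the card does not document them, and the output directory name is illustrative.

```python
# Hedged sketch mapping the listed hyperparameters onto Seq2SeqTrainingArguments.
# Only the values stated on the card are set explicitly; everything else is a default.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-gec",   # illustrative output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch_fused",        # AdamW; betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=3,
    predict_with_generate=True,       # needed to score generations with SacreBLEU
)
```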

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Sacrebleu |
|:-------------:|:------:|:-----:|:---------------:|:---------:|
| 0.4499        | 0.2591 | 1000  | 0.3853          | 79.3662   |
| 0.4321        | 0.5181 | 2000  | 0.3655          | 79.6704   |
| 0.4061        | 0.7772 | 3000  | 0.3589          | 80.0453   |
| 0.4015        | 1.0363 | 4000  | 0.3518          | 80.1396   |
| 0.4114        | 1.2953 | 5000  | 0.3480          | 80.1401   |
| 0.4088        | 1.5544 | 6000  | 0.3445          | 80.2909   |
| 0.3998        | 1.8135 | 7000  | 0.3441          | 80.2896   |
| 0.3851        | 2.0725 | 8000  | 0.3429          | 80.3305   |
| 0.379         | 2.3316 | 9000  | 0.3402          | 80.3939   |
| 0.3756        | 2.5907 | 10000 | 0.3411          | 80.4061   |
| 0.3807        | 2.8497 | 11000 | 0.3409          | 80.4351   |
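
The Sacrebleu column is presumably a corpus-level BLEU score over generated corrections. A minimal sketch with the evaluate library is shown below; whether this exact metric wiring produced the numbers above is an assumption, since the card omits it.

```python
# Hedged sketch: computing SacreBLEU with the `evaluate` library. The example
# sentences are illustrative, not drawn from the (undocumented) evaluation set.
import evaluate

sacrebleu = evaluate.load("sacrebleu")
predictions = ["She did not go to the market yesterday."]   # model outputs
references = [["She did not go to the market yesterday."]]  # gold corrections
result = sacrebleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # corpus BLEU on a 0-100 scale
```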

Framework versions

  • Transformers 4.56.1
  • Pytorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.0