whisper-medium-Split-Sentences-cleanpunc

This model is a fine-tuned version of openai/whisper-medium on an unknown dataset. It achieves the following results on the evaluation set, which match the epoch 23 checkpoint (step 31119) in the training results:

  • Loss: 0.6916
  • Cer: 15.8237
  • Wer: 27.8527
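The card does not state how Cer and Wer were computed. A common convention, assumed here, is the edit distance between reference and hypothesis divided by the reference length, scaled to percent, at the character level (CER) and the word level (WER). The sketch below illustrates that definition in plain Python; the function names are illustrative and not taken from the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference item missing from hypothesis)
                curr[j - 1] + 1,     # insertion (extra item in hypothesis)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent; assumes a non-empty reference."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; assumes a non-empty reference."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

For an evaluation set, these rates are usually computed by summing edit distances over all utterances and dividing by the total reference length, rather than by averaging per-utterance scores. Whether text was normalized before scoring is not documented here.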

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 25
  • mixed_precision_training: Native AMP
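To make the learning-rate settings concrete, here is a minimal sketch of a linear schedule with warmup in plain Python: the rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the end of training. The total of 33825 steps is an assumption taken from the last row of the training results (25 epochs at 1353 steps each); the function name `linear_lr` is illustrative.

```python
PEAK_LR = 5e-06        # learning_rate from the hyperparameter list
WARMUP_STEPS = 1000    # lr_scheduler_warmup_steps
TOTAL_STEPS = 33825    # assumption: 25 epochs x 1353 steps per epoch


def linear_lr(step):
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup phase: scale up proportionally to the step count.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay phase: scale down proportionally to the steps remaining.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 500, 1000, 17412, 33825):
        print(step, linear_lr(step))
```

With these values, warmup covers less than one epoch, so almost all of training runs in the decay phase.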

Training results

| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|---|---|---|---|---|---|
| 2.5398 | 1.0 | 1353 | 0.6961 | 33.6482 | 54.1036 |
| 0.9355 | 2.0 | 2706 | 0.6405 | 34.2321 | 51.6942 |
| 0.7293 | 3.0 | 4059 | 0.5997 | 28.1857 | 44.6026 |
| 0.6003 | 4.0 | 5412 | 0.5919 | 26.6651 | 42.5012 |
| 0.5051 | 5.0 | 6765 | 0.5781 | 26.4160 | 40.6325 |
| 0.4287 | 6.0 | 8118 | 0.5792 | 22.4282 | 36.1832 |
| 0.3646 | 7.0 | 9471 | 0.5839 | 19.7198 | 32.6853 |
| 0.3112 | 8.0 | 10824 | 0.5972 | 18.8569 | 31.7544 |
| 0.2648 | 9.0 | 12177 | 0.6061 | 19.4667 | 32.7127 |
| 0.2258 | 10.0 | 13530 | 0.6073 | 19.1717 | 33.1166 |
| 0.1928 | 11.0 | 14883 | 0.6155 | 16.8899 | 29.5092 |
| 0.1649 | 12.0 | 16236 | 0.6273 | 17.8584 | 30.6044 |
| 0.1409 | 13.0 | 17589 | 0.6338 | 17.9501 | 30.4401 |
| 0.1219 | 14.0 | 18942 | 0.6458 | 17.9700 | 30.4128 |
| 0.1047 | 15.0 | 20295 | 0.6494 | 17.1589 | 29.6530 |
| 0.0906 | 16.0 | 21648 | 0.6551 | 17.1569 | 29.4065 |
| 0.0803 | 17.0 | 23001 | 0.6580 | 15.9931 | 27.9211 |
| 0.0701 | 18.0 | 24354 | 0.6706 | 16.2163 | 28.3045 |
| 0.0621 | 19.0 | 25707 | 0.6736 | 16.3358 | 28.3798 |
| 0.0561 | 20.0 | 27060 | 0.6802 | 16.5411 | 28.6125 |
| 0.0508 | 21.0 | 28413 | 0.6810 | 16.0489 | 28.0649 |
| 0.0463 | 22.0 | 29766 | 0.6884 | 16.1525 | 28.2223 |
| 0.0434 | 23.0 | 31119 | 0.6916 | 15.8237 | 27.8527 |
| 0.0409 | 24.0 | 32472 | 0.6930 | 16.0768 | 28.0101 |
| 0.0393 | 25.0 | 33825 | 0.6935 | 16.3697 | 28.2429 |
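Validation loss and the error rates point to different epochs in these results. The sketch below, in plain Python with the per-epoch values copied from the training results, picks the best epoch under each criterion. Whether the author actually selected the reported checkpoint by Wer is an assumption; the card only shows that the headline numbers equal the epoch 23 row.

```python
# (epoch, validation_loss, cer, wer), copied from the training results
ROWS = [
    (1, 0.6961, 33.6482, 54.1036), (2, 0.6405, 34.2321, 51.6942),
    (3, 0.5997, 28.1857, 44.6026), (4, 0.5919, 26.6651, 42.5012),
    (5, 0.5781, 26.4160, 40.6325), (6, 0.5792, 22.4282, 36.1832),
    (7, 0.5839, 19.7198, 32.6853), (8, 0.5972, 18.8569, 31.7544),
    (9, 0.6061, 19.4667, 32.7127), (10, 0.6073, 19.1717, 33.1166),
    (11, 0.6155, 16.8899, 29.5092), (12, 0.6273, 17.8584, 30.6044),
    (13, 0.6338, 17.9501, 30.4401), (14, 0.6458, 17.9700, 30.4128),
    (15, 0.6494, 17.1589, 29.6530), (16, 0.6551, 17.1569, 29.4065),
    (17, 0.6580, 15.9931, 27.9211), (18, 0.6706, 16.2163, 28.3045),
    (19, 0.6736, 16.3358, 28.3798), (20, 0.6802, 16.5411, 28.6125),
    (21, 0.6810, 16.0489, 28.0649), (22, 0.6884, 16.1525, 28.2223),
    (23, 0.6916, 15.8237, 27.8527), (24, 0.6930, 16.0768, 28.0101),
    (25, 0.6935, 16.3697, 28.2429),
]

# Lower is better for all three metrics.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_cer = min(ROWS, key=lambda row: row[2])
best_by_wer = min(ROWS, key=lambda row: row[3])

if __name__ == "__main__":
    print("lowest validation loss: epoch", best_by_loss[0])
    print("lowest Cer: epoch", best_by_cer[0])
    print("lowest Wer: epoch", best_by_wer[0])
```

Validation loss is lowest at epoch 5 and rises afterwards while training loss keeps falling, yet Cer and Wer continue to improve until epoch 23. Selecting on validation loss alone would therefore pick a checkpoint with a much higher Wer (40.6325 versus 27.8527).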

Framework versions

  • Transformers 4.53.3
  • Pytorch 2.7.1+cu118
  • Datasets 3.6.0
  • Tokenizers 0.21.2
