Whisper small zh - Song train

This model is a fine-tuned version of openai/whisper-small on the Chinese songs * 58 dataset. It achieves the following results on the evaluation set (these match the last logged evaluation, at step 4300):

  • Loss: 2.3571
  • Wer: 23.1198
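
For readers unfamiliar with the metric, the sketch below shows one common way a word error rate of this kind is computed: the edit distance between reference and hypothesis tokens, divided by the number of reference tokens and expressed as a percentage. The card does not say how the text was tokenized or normalized before scoring, so the `tokenize` parameter and both helper names are assumptions made for illustration, not the evaluation code that produced the numbers above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_percent(references, hypotheses, tokenize=str.split):
    """Corpus-level error rate in percent.

    `tokenize` is a hypothetical hook: whitespace splitting by default,
    or `list` to score per character, which is often used for Chinese.
    """
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = tokenize(ref)
        hyp_tokens = tokenize(hyp)
        errors += edit_distance(ref_tokens, hyp_tokens)
        total += len(ref_tokens)
    return 100.0 * errors / total
```

With whitespace tokenization, `wer_percent(["a b c d"], ["a x c"])` counts one substitution and one deletion against four reference tokens. Passing `tokenize=list` scores the same way per character.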

Model description

More information needed

Intended uses & limitations

More information needed
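
Until this section is filled in, the following is a minimal sketch of how a Whisper fine-tune like this one is typically run for transcription with the Transformers `pipeline` API. It assumes the checkpoint is published under the repository id `Zzzkay1/whisper-small-zh`, that `transformers`, `torch`, and ffmpeg are installed, and that the audio file path is supplied by the caller. The function name `transcribe` is illustrative and not part of the model.

```python
MODEL_ID = "Zzzkay1/whisper-small-zh"  # assumed repository id


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the module can be loaded without transformers.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # Whisper processes audio in 30-second windows
    )
    result = asr(
        audio_path,
        generate_kwargs={"language": "chinese", "task": "transcribe"},
    )
    return result["text"]
```

A call such as `transcribe("song.wav")` (a hypothetical file) would download the weights on first use. Since the model was fine-tuned on sung material, accuracy on ordinary speech has not been reported here.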

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • training_steps: 10000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
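
Two of the values above follow from the others, and a short sketch makes the arithmetic explicit. It assumes a single training device (which is what a total batch size of 16 from 4 x 4 implies) and the usual shape of a linear schedule with warmup: a ramp from zero to the peak rate over the warmup steps, then a linear decay to zero at the final step. The function names are illustrative.

```python
LEARNING_RATE = 1e-5
WARMUP_STEPS = 200
TRAINING_STEPS = 10_000
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4


def effective_batch_size(per_device, accumulation, num_devices=1):
    """Examples consumed per optimizer step."""
    return per_device * accumulation * num_devices


def linear_lr(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (100, 200, 4300, 10_000):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 1e-05 at step 200 and has decayed to about 5.8e-06 by step 4300.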

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 2.7182 | 0.9456 | 100 | 2.4663 | 74.5125 |
| 2.1609 | 1.8889 | 200 | 2.1933 | 29.2015 |
| 1.8165 | 2.8322 | 300 | 2.1198 | 25.9517 |
| 1.6089 | 3.7754 | 400 | 2.1005 | 24.7447 |
| 1.5458 | 4.7187 | 500 | 2.1080 | 24.4661 |
| 1.5067 | 5.6619 | 600 | 2.0911 | 24.4197 |
| 1.4875 | 6.6052 | 700 | 2.1234 | 23.8626 |
| 1.4783 | 7.5485 | 800 | 2.1006 | 26.7409 |
| 1.4637 | 8.4917 | 900 | 2.1462 | 23.7233 |
| 1.4595 | 9.4350 | 1000 | 2.1491 | 24.5590 |
| 1.4526 | 10.3783 | 1100 | 2.1464 | 24.2804 |
| 1.4499 | 11.3215 | 1200 | 2.1496 | 23.2591 |
| 1.4424 | 12.2648 | 1300 | 2.1723 | 25.4875 |
| 1.4432 | 13.2080 | 1400 | 2.1740 | 24.4197 |
| 1.4411 | 14.1513 | 1500 | 2.1619 | 23.3519 |
| 1.4364 | 15.0946 | 1600 | 2.1972 | 43.8254 |
| 1.4366 | 16.0378 | 1700 | 2.1931 | 22.9805 |
| 1.4353 | 16.9835 | 1800 | 2.2018 | 23.3983 |
| 1.4319 | 17.9267 | 1900 | 2.2067 | 23.3519 |
| 1.431 | 18.8700 | 2000 | 2.2079 | 22.5162 |
| 1.4303 | 19.8132 | 2100 | 2.2221 | 22.6555 |
| 1.4277 | 20.7565 | 2200 | 2.2354 | 22.5162 |
| 1.4266 | 21.6998 | 2300 | 2.2289 | 22.8877 |
| 1.4261 | 22.6430 | 2400 | 2.2336 | 22.7484 |
| 1.4251 | 23.5863 | 2500 | 2.2423 | 23.3519 |
| 1.4238 | 24.5296 | 2600 | 2.2548 | 23.0269 |
| 1.4224 | 25.4728 | 2700 | 2.2598 | 23.0269 |
| 1.4225 | 26.4161 | 2800 | 2.2682 | 22.6555 |
| 1.4214 | 27.3593 | 2900 | 2.2712 | 22.4234 |
| 1.4211 | 28.3026 | 3000 | 2.2869 | 22.6555 |
| 1.4207 | 29.2459 | 3100 | 2.2880 | 22.4234 |
| 1.4197 | 30.1891 | 3200 | 2.2848 | 22.4234 |
| 1.4207 | 31.1324 | 3300 | 2.3099 | 22.1913 |
| 1.4193 | 32.0757 | 3400 | 2.3111 | 22.2377 |
| 1.4192 | 33.0189 | 3500 | 2.3284 | 22.4698 |
| 1.4187 | 33.9645 | 3600 | 2.3349 | 22.8877 |
| 1.4187 | 34.9078 | 3700 | 2.3347 | 22.8877 |
| 1.4181 | 35.8511 | 3800 | 2.3441 | 22.6091 |
| 1.4188 | 36.7943 | 3900 | 2.3338 | 22.6091 |
| 1.4182 | 37.7376 | 4000 | 2.3462 | 22.2377 |
| 1.4183 | 38.6809 | 4100 | 2.3396 | 22.4698 |
| 1.4183 | 39.6241 | 4200 | 2.3441 | 22.3770 |
| 1.4179 | 40.5674 | 4300 | 2.3571 | 23.1198 |

Framework versions

  • Transformers 4.56.2
  • Pytorch 2.7.1+cu118
  • Datasets 4.1.1
  • Tokenizers 0.22.1

Model tree for Zzzkay1/whisper-small-zh: fine-tuned from openai/whisper-small.