Whisper small zh - Song train
This model is a fine-tuned version of openai/whisper-small on the Chinese songs * 58 dataset. It achieves the following results on the evaluation set (values from the last logged evaluation, step 4300):
- Loss: 2.3571
- Wer: 23.1198
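The Wer value reads as a percentage (23.1198 means roughly 23 errors per 100 reference units). The evaluation code is not part of this card, so the following is only a minimal sketch of how such an error rate is conventionally computed: the edit distance between reference and hypothesis tokens, divided by the reference length. The function names and the `unit` parameter are illustrative assumptions. For Chinese text the choice between whitespace-separated tokens and single characters changes the number considerably, and the card does not say which was used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # drop a reference token (deletion)
                curr[j - 1] + 1,     # extra hypothesis token (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate_percent(reference, hypothesis, unit="char"):
    """Error rate in percent; `unit` is "char" or "word" (hypothetical switch)."""
    if unit == "char":
        ref = [c for c in reference if not c.isspace()]
        hyp = [c for c in hypothesis if not c.isspace()]
    else:
        ref = reference.split()
        hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    return 100.0 * edit_distance(ref, hyp) / len(ref)
```

With `unit="char"`, a five-character reference transcribed with one character missing scores 20.0.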
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- training_steps: 10000
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
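A few quantities follow directly from the hyperparameters above. The sketch below recomputes them in plain Python under stated assumptions: a single training device, a linear schedule that warms up from zero and decays to zero at `training_steps`, and a loss equal to the cross-entropy against uniformly smoothed targets. The vocabulary size of 51,865 is an assumption based on the multilingual Whisper checkpoints; none of these function names come from the training script.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 200
TRAINING_STEPS = 10_000
LABEL_SMOOTHING = 0.1

# Assumption: vocabulary size of the multilingual Whisper checkpoints.
VOCAB_SIZE = 51_865


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum * num_devices


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


def smoothed_loss_floor(eps=LABEL_SMOOTHING, vocab=VOCAB_SIZE):
    """Lowest reachable per-token loss when targets are smoothed uniformly.

    The smoothed target puts 1 - eps + eps/vocab on the correct token and
    eps/vocab on every other token; the floor is the entropy of that target.
    """
    on_target = 1 - eps + eps / vocab
    off_target = eps / vocab
    return -(on_target * math.log(on_target)
             + (vocab - 1) * off_target * math.log(off_target))
```

Under these assumptions `effective_batch_size()` returns 16, matching total_train_batch_size, and `linear_lr(4300)` is roughly 5.8e-06. `smoothed_loss_floor()` comes out at about 1.41, which suggests that a training loss close to that value leaves little room for further decrease. That reading depends on the assumed vocabulary size and loss definition.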
Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 2.7182 | 0.9456 | 100 | 2.4663 | 74.5125 |
| 2.1609 | 1.8889 | 200 | 2.1933 | 29.2015 |
| 1.8165 | 2.8322 | 300 | 2.1198 | 25.9517 |
| 1.6089 | 3.7754 | 400 | 2.1005 | 24.7447 |
| 1.5458 | 4.7187 | 500 | 2.1080 | 24.4661 |
| 1.5067 | 5.6619 | 600 | 2.0911 | 24.4197 |
| 1.4875 | 6.6052 | 700 | 2.1234 | 23.8626 |
| 1.4783 | 7.5485 | 800 | 2.1006 | 26.7409 |
| 1.4637 | 8.4917 | 900 | 2.1462 | 23.7233 |
| 1.4595 | 9.4350 | 1000 | 2.1491 | 24.5590 |
| 1.4526 | 10.3783 | 1100 | 2.1464 | 24.2804 |
| 1.4499 | 11.3215 | 1200 | 2.1496 | 23.2591 |
| 1.4424 | 12.2648 | 1300 | 2.1723 | 25.4875 |
| 1.4432 | 13.2080 | 1400 | 2.1740 | 24.4197 |
| 1.4411 | 14.1513 | 1500 | 2.1619 | 23.3519 |
| 1.4364 | 15.0946 | 1600 | 2.1972 | 43.8254 |
| 1.4366 | 16.0378 | 1700 | 2.1931 | 22.9805 |
| 1.4353 | 16.9835 | 1800 | 2.2018 | 23.3983 |
| 1.4319 | 17.9267 | 1900 | 2.2067 | 23.3519 |
| 1.4310 | 18.8700 | 2000 | 2.2079 | 22.5162 |
| 1.4303 | 19.8132 | 2100 | 2.2221 | 22.6555 |
| 1.4277 | 20.7565 | 2200 | 2.2354 | 22.5162 |
| 1.4266 | 21.6998 | 2300 | 2.2289 | 22.8877 |
| 1.4261 | 22.6430 | 2400 | 2.2336 | 22.7484 |
| 1.4251 | 23.5863 | 2500 | 2.2423 | 23.3519 |
| 1.4238 | 24.5296 | 2600 | 2.2548 | 23.0269 |
| 1.4224 | 25.4728 | 2700 | 2.2598 | 23.0269 |
| 1.4225 | 26.4161 | 2800 | 2.2682 | 22.6555 |
| 1.4214 | 27.3593 | 2900 | 2.2712 | 22.4234 |
| 1.4211 | 28.3026 | 3000 | 2.2869 | 22.6555 |
| 1.4207 | 29.2459 | 3100 | 2.2880 | 22.4234 |
| 1.4197 | 30.1891 | 3200 | 2.2848 | 22.4234 |
| 1.4207 | 31.1324 | 3300 | 2.3099 | 22.1913 |
| 1.4193 | 32.0757 | 3400 | 2.3111 | 22.2377 |
| 1.4192 | 33.0189 | 3500 | 2.3284 | 22.4698 |
| 1.4187 | 33.9645 | 3600 | 2.3349 | 22.8877 |
| 1.4187 | 34.9078 | 3700 | 2.3347 | 22.8877 |
| 1.4181 | 35.8511 | 3800 | 2.3441 | 22.6091 |
| 1.4188 | 36.7943 | 3900 | 2.3338 | 22.6091 |
| 1.4182 | 37.7376 | 4000 | 2.3462 | 22.2377 |
| 1.4183 | 38.6809 | 4100 | 2.3396 | 22.4698 |
| 1.4183 | 39.6241 | 4200 | 2.3441 | 22.3770 |
| 1.4179 | 40.5674 | 4300 | 2.3571 | 23.1198 |
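Validation loss and Wer do not bottom out at the same step in this table, and the last logged row (step 4300, well short of the configured 10000 training steps) is not the best on either metric. The card does not say which checkpoint was published. The sketch below only shows how the logged rows could be compared; the `ROWS` tuples are (step, validation loss, Wer) copied from the table, and the helper names are illustrative.

```python
# (step, validation_loss, wer) copied from the training results table.
ROWS = [
    (100, 2.4663, 74.5125), (200, 2.1933, 29.2015), (300, 2.1198, 25.9517),
    (400, 2.1005, 24.7447), (500, 2.1080, 24.4661), (600, 2.0911, 24.4197),
    (700, 2.1234, 23.8626), (800, 2.1006, 26.7409), (900, 2.1462, 23.7233),
    (1000, 2.1491, 24.5590), (1100, 2.1464, 24.2804), (1200, 2.1496, 23.2591),
    (1300, 2.1723, 25.4875), (1400, 2.1740, 24.4197), (1500, 2.1619, 23.3519),
    (1600, 2.1972, 43.8254), (1700, 2.1931, 22.9805), (1800, 2.2018, 23.3983),
    (1900, 2.2067, 23.3519), (2000, 2.2079, 22.5162), (2100, 2.2221, 22.6555),
    (2200, 2.2354, 22.5162), (2300, 2.2289, 22.8877), (2400, 2.2336, 22.7484),
    (2500, 2.2423, 23.3519), (2600, 2.2548, 23.0269), (2700, 2.2598, 23.0269),
    (2800, 2.2682, 22.6555), (2900, 2.2712, 22.4234), (3000, 2.2869, 22.6555),
    (3100, 2.2880, 22.4234), (3200, 2.2848, 22.4234), (3300, 2.3099, 22.1913),
    (3400, 2.3111, 22.2377), (3500, 2.3284, 22.4698), (3600, 2.3349, 22.8877),
    (3700, 2.3347, 22.8877), (3800, 2.3441, 22.6091), (3900, 2.3338, 22.6091),
    (4000, 2.3462, 22.2377), (4100, 2.3396, 22.4698), (4200, 2.3441, 22.3770),
    (4300, 2.3571, 23.1198),
]


def best_by_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_by_wer(rows):
    """Row with the lowest Wer (earliest step wins ties)."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss(ROWS))
    print("lowest Wer:", best_by_wer(ROWS))
    print("last logged row:", ROWS[-1])
```

On these rows the lowest validation loss is at step 600 (2.0911) and the lowest Wer at step 3300 (22.1913), so selecting a checkpoint by Wer rather than by loss would pick a much later step.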
Framework versions
- Transformers 4.56.2
- Pytorch 2.7.1+cu118
- Datasets 4.1.1
- Tokenizers 0.22.1
Model tree for Zzzkay1/whisper-small-zh
- Base model: openai/whisper-small
Evaluation results
- Wer on Chinese songs * 58 (self-reported): 23.120