# Whisper Tiny - Toki Pona - Synthetic Test 1
This experimental model is a fine-tuned version of openai/whisper-tiny on a mix of custom synthetic data and Common Voice 23.0 - Toki Pona.
Unlike the whisper-small-based model, the evaluation set here contains synthetic data, so evaluation values are not provided.
## Model description
This is an experimental model trained for speech recognition for Toki Pona.
Since the original model is multilingual and uses explicit language-specification tokens, we repurposed the Czech (cs) language token for Toki Pona, which we determined to have the closest phonetics.
The model's performance on other languages may have been at least partially preserved, but this has not been tested.
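As a usage sketch (an assumption, not verified against the published checkpoint: the model id `Tomeno/whisper-tiny-tok-synth-1` and the `transformers` ASR pipeline are taken as given): because Toki Pona was trained in place of the Czech language token, transcription should request `"czech"` as the language.

```python
# Hedged sketch: load the fine-tuned checkpoint and transcribe a clip.
# The checkpoint id and the reassigned Czech ("cs") language token follow
# the model description above; the audio path is a placeholder.
MODEL_ID = "Tomeno/whisper-tiny-tok-synth-1"

def transcribe(audio_path: str) -> str:
    """Transcribe an audio file to Toki Pona text."""
    from transformers import pipeline  # deferred import keeps the sketch importable

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # Toki Pona occupies the Czech token slot, so select "czech".
    result = asr(audio_path, generate_kwargs={"language": "czech", "task": "transcribe"})
    return result["text"]
```

Note that without the explicit language argument, Whisper's language auto-detection may mislabel Toki Pona speech and route it through the wrong token.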
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 128
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 1000
- mixed_precision_training: Native AMP
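The derived quantities above can be sketched in a few lines: the total train batch size is the per-device batch size times the gradient accumulation steps, and the linear scheduler warms up over 100 steps then decays to zero at step 1000 (a sketch of the standard linear warmup/decay rule, matching the listed hyperparameters; the exact scheduler implementation in the training run is assumed).

```python
# Derived quantities implied by the hyperparameters above.
train_batch_size = 64
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 128

learning_rate = 1e-5
warmup_steps = 100
training_steps = 1000

def lr_at(step: int) -> float:
    """Linear warmup to `learning_rate`, then linear decay to 0 at `training_steps`."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    return learning_rate * max(0.0, (training_steps - step) / (training_steps - warmup_steps))
```

For example, `lr_at(50)` gives half the peak rate, `lr_at(100)` the full 1e-05, and `lr_at(1000)` returns 0.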
## Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 0.2699 | 0.5155 | 100 | 0.3009 | 16.1318 |
| 0.0985 | 1.0309 | 200 | 0.1711 | 9.6271 |
| 0.0728 | 1.5464 | 300 | 0.1388 | 7.9503 |
| 0.0547 | 2.0619 | 400 | 0.1254 | 7.4299 |
| 0.0474 | 2.5773 | 500 | 0.1178 | 6.6493 |
| 0.039 | 3.0928 | 600 | 0.1128 | 6.5337 |
| 0.0364 | 3.6082 | 700 | 0.1091 | 6.3024 |
| 0.0314 | 4.1237 | 800 | 0.1072 | 6.2735 |
| 0.0304 | 4.6392 | 900 | 0.1059 | 6.3024 |
| 0.0288 | 5.1546 | 1000 | 0.1051 | 6.3891 |
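The Wer column above is word error rate in percent: word-level edit distance between reference and hypothesis, divided by the reference word count. A minimal sketch of the metric (the actual evaluation presumably used a library such as `evaluate` or `jiwer`; this standalone version is for illustration):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[j] holds the edit distance between ref[:i] and hyp[:j].
    dp = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev_diag, dp[0] = dp[0], i
        for j in range(1, len(hyp) + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                                   # deletion
                        dp[j - 1] + 1,                               # insertion
                        prev_diag + (ref[i - 1] != hyp[j - 1]))      # substitution/match
            prev_diag = cur
    return 100.0 * dp[-1] / max(1, len(ref))
```

For instance, `wer("mi moku e kili", "mi moku kili")` is 25.0 (one deleted word out of four).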
## Framework versions
- Transformers 4.50.3
- Pytorch 2.9.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.4