# Whisper Tiny - Toki Pona - Synthetic Test 1
This experimental model is a fine-tuned version of openai/whisper-tiny on a mix of custom synthetic data and Common Voice 23.0 - Toki Pona.
Unlike the whisper-small-based model, the evaluation set here contains synthetic data, so evaluation values are not provided.
## Model description
This is an experimental model trained for speech recognition for Toki Pona.
Since the original model is multilingual and uses explicit language-specification tokens, we repurposed the Czech (cs) language token for Toki Pona, which we determined to have the closest phonetics.
The model's performance on other languages may have been at least partially preserved, but this has not been tested.
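As a usage sketch (an assumption, not verified against the published checkpoint: the model id `Tomeno/whisper-tiny-tok-synth-1` and the `transformers` ASR pipeline are taken as given): because Toki Pona was trained in place of the Czech language token, transcription should request `"czech"` as the language.

```python
# Hedged sketch: load the fine-tuned checkpoint and transcribe a clip.
# The checkpoint id and the reassigned Czech ("cs") language token follow
# the model description above; the audio path is a placeholder.
MODEL_ID = "Tomeno/whisper-tiny-tok-synth-1"

def transcribe(audio_path: str) -> str:
    """Transcribe an audio file to Toki Pona text."""
    from transformers import pipeline  # deferred import keeps the sketch importable

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # Toki Pona occupies the Czech token slot, so select "czech".
    result = asr(audio_path, generate_kwargs={"language": "czech", "task": "transcribe"})
    return result["text"]
```

Note that without the explicit language argument, Whisper's language auto-detection may mislabel Toki Pona speech and route it through the wrong token.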
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 128
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 1000
- mixed_precision_training: Native AMP
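The derived quantities above can be sketched in a few lines: the total train batch size is the per-device batch size times the gradient accumulation steps, and the linear scheduler warms up over 100 steps then decays to zero at step 1000 (a sketch of the standard linear warmup/decay rule, matching the listed hyperparameters; the exact scheduler implementation in the training run is assumed).

```python
# Derived quantities implied by the hyperparameters above.
train_batch_size = 64
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 128

learning_rate = 1e-5
warmup_steps = 100
training_steps = 1000

def lr_at(step: int) -> float:
    """Linear warmup to `learning_rate`, then linear decay to 0 at `training_steps`."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    return learning_rate * max(0.0, (training_steps - step) / (training_steps - warmup_steps))
```

For example, `lr_at(50)` gives half the peak rate, `lr_at(100)` the full 1e-05, and `lr_at(1000)` returns 0.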
## Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 0.2699 | 0.5155 | 100 | 0.3009 | 16.1318 |
| 0.0985 | 1.0309 | 200 | 0.1711 | 9.6271 |
| 0.0728 | 1.5464 | 300 | 0.1388 | 7.9503 |
| 0.0547 | 2.0619 | 400 | 0.1254 | 7.4299 |
| 0.0474 | 2.5773 | 500 | 0.1178 | 6.6493 |
| 0.039 | 3.0928 | 600 | 0.1128 | 6.5337 |
| 0.0364 | 3.6082 | 700 | 0.1091 | 6.3024 |
| 0.0314 | 4.1237 | 800 | 0.1072 | 6.2735 |
| 0.0304 | 4.6392 | 900 | 0.1059 | 6.3024 |
| 0.0288 | 5.1546 | 1000 | 0.1051 | 6.3891 |
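The Wer column above is word error rate in percent: word-level edit distance between reference and hypothesis, divided by the reference word count. A minimal sketch of the metric (the actual evaluation presumably used a library such as `evaluate` or `jiwer`; this standalone version is for illustration):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[j] holds the edit distance between ref[:i] and hyp[:j].
    dp = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev_diag, dp[0] = dp[0], i
        for j in range(1, len(hyp) + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                                   # deletion
                        dp[j - 1] + 1,                               # insertion
                        prev_diag + (ref[i - 1] != hyp[j - 1]))      # substitution/match
            prev_diag = cur
    return 100.0 * dp[-1] / max(1, len(ref))
```

For instance, `wer("mi moku e kili", "mi moku kili")` is 25.0 (one deleted word out of four).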
## Framework versions
- Transformers 4.50.3
- Pytorch 2.9.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.4