Whisper medium Ps - ZFA

This model is a fine-tuned version of openai/whisper-medium on a custom Pashto speech dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the results below):

  • Loss: 0.7001
  • WER: 27.124
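The sketch below shows one way to run inference with this checkpoint through the transformers pipeline API. It is an illustrative example, not taken from the card itself: the Hub repo id Zarnabh/whisper-medium-pashto and the audio file name are assumptions.

```python
# Minimal inference sketch (assumed usage, not from the model card).
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Zarnabh/whisper-medium-pashto",  # assumed Hub repo id
)

# The pipeline resamples file inputs to the 16 kHz rate Whisper expects.
result = asr("sample_pashto.wav")  # hypothetical audio file
print(result["text"])
```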

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • training_steps: 2000
  • mixed_precision_training: Native AMP
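For reference, here is a sketch of Seq2SeqTrainingArguments that mirrors the hyperparameters listed above. It is a reconstruction under the assumption that the run used the standard transformers Trainer; the output directory is a placeholder, and this is not the author's actual training script.

```python
# Sketch of training arguments matching the listed hyperparameters (assumed setup).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-medium-pashto",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective batch size: 1 * 16 = 16
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=300,
    max_steps=2000,
    fp16=True,  # native AMP mixed precision
)
```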

Training results

Training Loss   Epoch     Step   Validation Loss   WER
No log          5.1297     400   0.5575            30.9354
11.8956         10.2593    800   0.623             28.5533
0.2548          15.389    1200   0.67              27.124
0.0164          20.5186   1600   0.6933            27.0446
0.0046          25.6483   2000   0.7001            27.124
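The WER values above are reported as percentages. A minimal sketch of how such a score is typically computed with the evaluate library follows; this is an assumption, since the card does not state which WER implementation was used, and the prediction/reference strings are placeholders.

```python
# Sketch of WER computation with the `evaluate` library (assumed implementation).
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["example transcription"]       # hypothetical decoded outputs
references = ["example transcription here"]   # hypothetical ground-truth text

# evaluate's "wer" returns a fraction; model cards usually report it * 100.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.3f}")
```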

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2