whisper-small-ks / README.md

muneebharoon

Training in progress, step 1000

27ec361 verified 10 months ago

metadata

library_name: transformers
language:
  - ks
base_model: openai/whisper-small
tags:
  - generated_from_trainer
datasets:
  - muneebharoon/whisper-kashmiri
metrics:
  - wer
model-index:
  - name: Whisper Small ks - Muneeb Haroon
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: whisper-kashmiri
          type: muneebharoon/whisper-kashmiri
          args: 'config: ks, split: test'
        metrics:
          - name: Wer
            type: wer
            value: 39.80769230769231

Whisper Small ks - Muneeb Haroon

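This model is a fine-tuned version of openai/whisper-small on the muneebharoon/whisper-kashmiri dataset. The results reported on the evaluation set match the step-5000 row of the training results table (the lowest WER reached), not the final step-10000 row:

Loss: 1.1578
WER: 39.8077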
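
To make the WER figure concrete, here is a minimal sketch of a corpus-level word error rate in percent: total word-level edits (substitutions, insertions, deletions) divided by total reference words. This card does not say which tool or text normalization produced the number above, so `word_error_rate` is an illustrative helper, not the evaluation code used for this model, and the sample strings are placeholders.

```python
def word_error_rate(references, hypotheses):
    """Corpus-level WER in percent: total word edits / total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        # Word-level Levenshtein distance, keeping one row of the table at a time.
        prev = list(range(len(hyp_words) + 1))
        for i, ref_word in enumerate(ref_words, start=1):
            curr = [i]
            for j, hyp_word in enumerate(hyp_words, start=1):
                cost = 0 if ref_word == hyp_word else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # match or substitution
            prev = curr
        total_edits += prev[-1]
        total_words += len(ref_words)
    return 100.0 * total_edits / total_words


# Placeholder strings: one substitution plus one deletion over four reference words.
print(word_error_rate(["a b c d"], ["a x c"]))
```

Read this way, a WER of 39.8077 means roughly 40 word-level edits per 100 reference words.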

Model description

More information needed

Intended uses & limitations

More information needed
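
Until this section is filled in, the following is a minimal inference sketch using the `transformers` ASR pipeline. The repository id `muneebharoon/whisper-small-ks` is an assumption inferred from the page header and author name, and `transcribe` is a hypothetical helper; neither is documented by this card.

```python
# Assumption: Hub repo id inferred from the page header and author name.
MODEL_ID = "muneebharoon/whisper-small-ks"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with the fine-tuned checkpoint."""
    # Imported lazily so that importing this module does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The pipeline returns a dict; the transcription is under the "text" key.
    return asr(audio_path)["text"]
```

Calling `transcribe("sample.wav")` downloads the weights on first use; decoding audio from a file path typically requires ffmpeg to be installed.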

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 2
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 8
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 10000
mixed_precision_training: Native AMP
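
The relationship between these values can be sketched in a few lines. The effective batch size is the per-device batch size times the gradient accumulation steps (2 × 4 = 8, which matches total_train_batch_size on a single device), and a linear schedule with warmup ramps the learning rate from 0 to its peak over the warmup steps, then decays it linearly to 0 at the last training step. The function names below are illustrative, and the schedule follows the usual linear-with-warmup definition rather than code taken from this training run.

```python
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 500
TRAINING_STEPS = 10000


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Samples contributing to each optimizer step (one device assumed)."""
    return per_device * accum * num_devices


def linear_lr(step):
    """Learning rate at an optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


print(effective_batch_size())
for step in (0, 250, 500, 5250, 10000):
    print(step, linear_lr(step))
```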

Training results

Training Loss	Epoch	Step	Validation Loss	WER (%)
0.0123	21.2811	1000	0.9382	48.125
0.0051	42.5622	2000	0.9946	42.4519
0.0032	63.8432	3000	1.0278	41.3942
0.0	85.1081	4000	1.1138	40.5288
0.0	106.3892	5000	1.1578	39.8077
0.0	127.6703	6000	1.1869	39.8077
0.0	148.9514	7000	1.2211	40.0
0.0	170.2162	8000	1.2430	40.2404
0.0	191.4973	9000	1.2679	40.2885
0.0	212.7784	10000	1.2762	40.3365
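
A short sketch for reading this table: it picks the row with the lowest WER (ties broken by validation loss), picks the row with the lowest validation loss, and estimates optimizer steps per epoch from the first row. The values are copied from the table above; `Row` and the variable names are illustrative.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step val_loss wer")

ROWS = [
    Row(0.0123, 21.2811, 1000, 0.9382, 48.125),
    Row(0.0051, 42.5622, 2000, 0.9946, 42.4519),
    Row(0.0032, 63.8432, 3000, 1.0278, 41.3942),
    Row(0.0, 85.1081, 4000, 1.1138, 40.5288),
    Row(0.0, 106.3892, 5000, 1.1578, 39.8077),
    Row(0.0, 127.6703, 6000, 1.1869, 39.8077),
    Row(0.0, 148.9514, 7000, 1.2211, 40.0),
    Row(0.0, 170.2162, 8000, 1.2430, 40.2404),
    Row(0.0, 191.4973, 9000, 1.2679, 40.2885),
    Row(0.0, 212.7784, 10000, 1.2762, 40.3365),
]

# Lowest WER; ties are broken by the lower validation loss.
best_wer = min(ROWS, key=lambda r: (r.wer, r.val_loss))
# Lowest validation loss, which occurs much earlier in training.
best_val = min(ROWS, key=lambda r: r.val_loss)
# Rough estimate of optimizer steps per epoch from the first row.
steps_per_epoch = ROWS[0].step / ROWS[0].epoch

print("lowest WER at step", best_wer.step, best_wer.wer)
print("lowest validation loss at step", best_val.step, best_val.val_loss)
print("approx. optimizer steps per epoch", round(steps_per_epoch, 2))
```

The table shows validation loss rising from step 1000 onward while training loss falls to 0.0, and WER stops improving after step 5000, so the later checkpoints do not appear to add accuracy on this evaluation set.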

Framework versions

Transformers 4.49.0
Pytorch 2.5.1+cu124
Datasets 3.2.0
Tokenizers 0.21.0
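
To check a local environment against these versions, a small sketch using the standard library's `importlib.metadata` follows. It assumes the usual PyPI distribution names (`torch` for Pytorch); `compare_versions` is an illustrative helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by PyPI distribution name (assumed).
EXPECTED = {
    "transformers": "4.49.0",
    "torch": "2.5.1+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}


def compare_versions(expected=EXPECTED):
    """Return {name: (expected, installed_or_None, matches)} for each package."""
    report = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None  # package is not installed in this environment
        report[name] = (want, have, have == want)
    return report


for name, (want, have, ok) in compare_versions().items():
    print(f"{name}: expected {want}, installed {have}, match={ok}")
```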