Whisper Small: SPRINGLab Hindi Fine-tuned 🎙️

A fine-tuned version of openai/whisper-small for Hindi automatic speech recognition (ASR), trained with LoRA via the PEFT library.

Model Details

Parameter             Value
Base Model            openai/whisper-small
Dataset               SPRINGLab/IndicVoices-R_Hindi
Train Samples         25,002
Eval Samples          1,316
Training Epochs       3
Training Steps        2,346
Best Checkpoint       checkpoint-2346
Best Eval Loss        0.2637
Best Eval WER         26.52
20-sample Base WER    59.44
20-sample FT WER      20.85
LoRA Rank             16
LoRA Alpha            32
LoRA Dropout          0.05
LoRA Targets          q_proj, v_proj
Learning Rate         5e-5
Train Batch Size      8
Grad Accumulation     4
Effective Batch       32
Precision             bfloat16
Hardware              Google Colab A100
Method                LoRA fine-tuning with PEFT
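
To give a sense of what the LoRA settings in Model Details imply, the sketch below estimates the adapter's trainable parameter count and its scaling factor. The architecture constants (hidden size 768, 12 encoder and 12 decoder layers) come from the public whisper-small configuration rather than from this card. The count also assumes that the q_proj and v_proj targets match every attention block, including decoder cross-attention, and that the conventional alpha / rank scaling applies.

```python
# Rough estimate of LoRA adapter size for whisper-small.
# Architecture constants are assumptions taken from the public
# whisper-small configuration, not values stated in this card.
D_MODEL = 768          # hidden size (assumed)
ENCODER_LAYERS = 12    # encoder depth (assumed)
DECODER_LAYERS = 12    # decoder depth (assumed)

LORA_RANK = 16         # LoRA Rank from Model Details
LORA_ALPHA = 32        # LoRA Alpha from Model Details
TARGETS_PER_BLOCK = 2  # q_proj and v_proj

# Each encoder layer has one attention block; each decoder layer has two
# (self-attention and cross-attention).
attention_blocks = ENCODER_LAYERS + 2 * DECODER_LAYERS
adapted_layers = attention_blocks * TARGETS_PER_BLOCK

# A rank-r adapter on a d_in x d_out linear layer adds r * (d_in + d_out) weights.
params_per_layer = LORA_RANK * (D_MODEL + D_MODEL)
trainable_params = adapted_layers * params_per_layer

# Conventional LoRA scaling applied to the low-rank update.
scaling = LORA_ALPHA / LORA_RANK

print(f"adapted linear layers: {adapted_layers}")
print(f"estimated trainable parameters: {trainable_params:,}")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the adapter trains under 1% of the base model's weights, which is the main reason LoRA fits comfortably on a single GPU.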

Validation Summary

The final selected checkpoint was checkpoint-2346, which was manually evaluated after training and slightly outperformed checkpoint-2000 on the full validation split.

  • checkpoint-2000 eval WER: 26.6488
  • checkpoint-2346 eval WER: 26.52
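
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between reference and hypothesis, divided by the number of reference words. The following is a minimal pure-Python sketch of that definition. It is not the evaluation code used for this model, and the card does not state which WER implementation or text normalization produced the figures above, which appear to be percentages.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref_words)


# Illustrative sentence pair (not from the dataset): the last word differs,
# so there is one substitution in five reference words.
reference = "मैं घर जा रहा हूँ"
hypothesis = "मैं घर जा रहा हूं"
print(word_error_rate(reference, hypothesis))  # 0.2
```

Multiplying the result by 100 gives a percentage comparable in scale to the checkpoint figures above. As the example shows, the metric treats spelling variants as full errors, so results depend heavily on how transcripts are normalized first.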

Usage

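import torch
from transformers import pipeline

# Use the first GPU when available; otherwise fall back to CPU (-1).
device = 0 if torch.cuda.is_available() else -1

asr = pipeline(
    task='automatic-speech-recognition',
    model='Sa1Krishna/sema-whisper-small-springlab-hindi-finetuned',
    device=device
)

# Force Hindi transcription rather than relying on language auto-detection.
result = asr(
    'hindi_audio.wav',  # path to a local audio file
    generate_kwargs={
        'language': 'hindi',
        'task': 'transcribe'
    }
)

print(result['text'])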

Training Details

The model was trained on SPRINGLab/IndicVoices-R_Hindi with a 95/5 train/validation split, using normalized Hindi transcripts as targets.
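
As a quick consistency check, the sample counts reported in Model Details (25,002 train and 1,316 eval) can be compared against the stated 95/5 split. The sketch below only does the arithmetic. How the split was actually produced, including any seed or shuffling, is not stated in this card.

```python
import math

TRAIN_SAMPLES = 25_002  # Train Samples from Model Details
EVAL_SAMPLES = 1_316    # Eval Samples from Model Details

total_samples = TRAIN_SAMPLES + EVAL_SAMPLES
eval_fraction = EVAL_SAMPLES / total_samples

# A 5% validation size rounded up to a whole sample.
expected_eval = math.ceil(0.05 * total_samples)

print(f"total samples: {total_samples}")
print(f"validation fraction: {eval_fraction:.4f}")
print(f"5% of total, rounded up: {expected_eval}")
```

The counts are consistent with holding out 5% of the data and rounding the validation size up to the next whole sample.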

Training Config

  • Framework: Hugging Face Transformers + PEFT
  • Fine-tuning method: LoRA
  • Precision: bfloat16
  • Learning rate: 5e-5
  • Batch size: 8
  • Gradient accumulation: 4 (effective batch size 32)
  • Evaluation cadence: every 500 steps
  • Sanity check: 20-sample qualitative comparison at step 200
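
The settings above can be related to the reported 2,346 training steps with a little arithmetic. This sketch assumes a single device and that the final partial batch of each epoch counts as an optimizer step. The exact rounding depends on the trainer version, so treat it as one plausible derivation rather than the author's own calculation.

```python
import math

TRAIN_SAMPLES = 25_002   # Train Samples from Model Details
BATCH_SIZE = 8           # per-step batch size
GRAD_ACCUMULATION = 4    # micro-batches per optimizer step
EPOCHS = 3
EVAL_EVERY = 500         # evaluation cadence in steps

effective_batch = BATCH_SIZE * GRAD_ACCUMULATION

# Assumption: a trailing partial batch still triggers an optimizer step.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS

# Steps at which a scheduled evaluation would fire during training.
scheduled_evals = list(range(EVAL_EVERY, total_steps + 1, EVAL_EVERY))

print(f"effective batch size: {effective_batch}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps}")
print(f"scheduled evaluation steps: {scheduled_evals}")
```

Under these assumptions the last scheduled evaluation falls at step 2000, so the final step 2346 is not covered by the 500-step cadence and needs a separate evaluation pass after training.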

Notes

  • Fine-tuned for Hindi speech recognition.
  • Uses the multilingual Whisper tokenizer and decoder for Hindi transcription.
  • Final checkpoint was chosen using full-validation WER plus qualitative review.

Limitations

  • Optimized for Hindi ASR only.
  • May still struggle with heavy accents, rare proper nouns, and unusual numerals.
  • Performance may vary on domains very different from IndicVoices-R.