whisper-tiny-ne-en-finetuned-v2

This model is a fine-tuned version of openai/whisper-tiny for Nepali and English code-switched speech recognition.

Model Details

  • Base Model: openai/whisper-tiny
  • Language: Nepali-English (code-switched)
  • Task: transcribe
  • Parameters: 37.8M (F32)
  • Fine-tuned on: custom Nepali ASR dataset

Training Configuration

  • Epochs: 3
  • Batch Size: 32 effective (16 × 2)
  • Learning Rate: 1e-05
  • Mixed Precision: FP16
  • Training Samples: 7576
  • Validation Samples: 945
  • Test Samples: 948
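The numbers above pin down the training schedule. At an effective batch size of 32, one pass over the 7,576 training samples takes ceil(7576 / 32) = 237 optimizer steps, so 3 epochs run about 711 steps. A quick check (note the "16 × 2" breakdown is assumed here to mean per-device batch size times a second factor such as gradient accumulation; the card does not say which):

```python
import math

# Training schedule implied by the configuration above
train_samples = 7576
effective_batch = 16 * 2  # assumed: per-device batch x accumulation factor
epochs = 3

steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 237 711
```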

Evaluation Results

| Split | WER    |
|-------|--------|
| Test  | 78.50% |
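WER (word error rate) counts the word-level substitutions, insertions, and deletions needed to turn the hypothesis into the reference transcript, divided by the number of reference words; a 78.50% WER means roughly four out of five reference words need an edit. A minimal, dependency-free sketch of the metric (in practice a library such as jiwer is typically used):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance."""
    r, h = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words
    prev = list(range(len(h) + 1))
    for i, rw in enumerate(r, 1):
        cur = [i] + [0] * len(h)
        for j, hw in enumerate(h, 1):
            cur[j] = min(prev[j] + 1,            # deletion
                         cur[j - 1] + 1,          # insertion
                         prev[j - 1] + (rw != hw))  # substitution / match
        prev = cur
    return prev[len(h)] / len(r)

# One substitution out of three reference words -> WER = 1/3
print(wer("a b c", "a x c"))
```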

Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa

# Load model and processor
processor = WhisperProcessor.from_pretrained("Bijay13/whisper-tiny-ne-en-finetuned-v2")
model = WhisperForConditionalGeneration.from_pretrained("Bijay13/whisper-tiny-ne-en-finetuned-v2")

# Load audio
audio, sr = librosa.load("your_audio.wav", sr=16000)

# Transcribe
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
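Whisper's feature extractor pads or truncates input to 30-second windows, so the snippet above effectively transcribes only the first 30 seconds of a longer file. One simple workaround (a sketch; the chunk length and the lack of overlap are choices made here, not part of this model card) is to split the waveform and transcribe each chunk in turn:

```python
def chunk_audio(samples, sr=16000, window_s=30):
    """Split a 1-D sample sequence into consecutive windows of window_s seconds."""
    size = sr * window_s
    return [samples[i:i + size] for i in range(0, len(samples), size)]

# e.g. a 65-second clip at 16 kHz becomes three chunks: 30 s, 30 s, and 5 s
chunks = chunk_audio(list(range(16000 * 65)))
print([len(c) // 16000 for c in chunks])  # [30, 30, 5]
```

Each chunk can then be passed through the processor and model.generate exactly as above, and the per-chunk transcriptions concatenated.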

License

This model is released under the Apache 2.0 license.
