# Whisper Small – Nagamese ASR
Fine-tuned from openai/whisper-small on a Nagamese speech corpus using LoRA (r=32, alpha=64) with an 8-bit quantized base model, trained on a Kaggle T4 GPU.
The LoRA adapter has been merged into the base model, so the checkpoint works out of the box without PEFT installed.
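For reference, a merge of this kind is typically done with PEFT's `merge_and_unload`. The sketch below is illustrative only; the adapter path is a placeholder, not the actual training artifact:

```python
# Illustrative sketch of merging a LoRA adapter into the base model.
# "path/to/lora-adapter" is a placeholder for a trained LoRA checkpoint.
import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration

base = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-small", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = merged.merge_and_unload()  # folds LoRA weights into the base layers
merged.save_pretrained("whisper-small-nagamese-merged")
```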
## Quick Start
```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor
import torch
import librosa

model = WhisperForConditionalGeneration.from_pretrained("Kenei/whisper-small-nagamese-v2")
processor = WhisperProcessor.from_pretrained("Kenei/whisper-small-nagamese-v2")

# Whisper expects 16 kHz mono audio; librosa resamples on load.
audio, _ = librosa.load("your_audio.wav", sr=16000, mono=True)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)

print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```
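If ffmpeg is available for audio decoding, the high-level `pipeline` API is a shorter alternative that handles resampling and long-form chunking; this is a sketch of the same model through that route, not a different method:

```python
# Alternative: the same checkpoint through the ASR pipeline, which
# handles resampling and chunking of long audio automatically.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Kenei/whisper-small-nagamese-v2",
    chunk_length_s=30,  # Whisper's native window; enables long-form audio
)
print(asr("your_audio.wav")["text"])
```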
## Training Details
| Setting | Value |
|---|---|
| Base model | openai/whisper-small |
| Language | Nagamese (Roman script) |
| LoRA rank / alpha | 32 / 64 |
| LoRA target modules | q_proj, v_proj, k_proj, out_proj, fc1, fc2 |
| Effective batch size | 8 |
| Max steps | 1200 |
| Learning rate | 0.001 |
| Precision | fp16 + 8-bit base |
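
A hypothetical sketch of how these settings map onto PEFT and transformers. Dataset loading, the data collator, and the `Seq2SeqTrainer` call are omitted; the per-device batch size / gradient-accumulation split and the LoRA dropout value are assumptions not stated in the table:

```python
# Hypothetical training setup matching the table above (not the exact script).
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    BitsAndBytesConfig,
    Seq2SeqTrainingArguments,
    WhisperForConditionalGeneration,
)

# 8-bit base model, as listed under "Precision".
base = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-small",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
base = prepare_model_for_kbit_training(base)

lora_cfg = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "v_proj", "k_proj", "out_proj", "fc1", "fc2"],
    lora_dropout=0.05,  # assumed; the card does not state a dropout value
)
model = get_peft_model(base, lora_cfg)

args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-nagamese",
    per_device_train_batch_size=2,    # split with accumulation is assumed;
    gradient_accumulation_steps=4,    # effective batch size 8, per the table
    max_steps=1200,
    learning_rate=1e-3,
    fp16=True,
)
```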