# Whisper Small – Nagamese ASR
Fine-tuned from openai/whisper-small on a Nagamese speech corpus using LoRA (r=32, alpha=64) with an 8-bit quantized base model, trained on a Kaggle T4 GPU.
The LoRA adapter has been merged into the base model, so the checkpoint works out of the box without PEFT installed.
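For reference, a merge of this kind is typically done with PEFT's `merge_and_unload`. The sketch below is illustrative only; the adapter path is a placeholder, not the actual training artifact:

```python
# Illustrative sketch of merging a LoRA adapter into the base model.
# "path/to/lora-adapter" is a placeholder for a trained LoRA checkpoint.
import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration

base = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-small", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = merged.merge_and_unload()  # folds LoRA weights into the base layers
merged.save_pretrained("whisper-small-nagamese-merged")
```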
## Quick Start
```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor
import torch
import librosa

model = WhisperForConditionalGeneration.from_pretrained("Kenei/whisper-small-nagamese-v2")
processor = WhisperProcessor.from_pretrained("Kenei/whisper-small-nagamese-v2")

# Whisper expects 16 kHz mono audio; librosa resamples on load.
audio, _ = librosa.load("your_audio.wav", sr=16000, mono=True)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)

print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```
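If ffmpeg is available for audio decoding, the high-level `pipeline` API is a shorter alternative that handles resampling and long-form chunking; this is a sketch of the same model through that route, not a different method:

```python
# Alternative: the same checkpoint through the ASR pipeline, which
# handles resampling and chunking of long audio automatically.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Kenei/whisper-small-nagamese-v2",
    chunk_length_s=30,  # Whisper's native window; enables long-form audio
)
print(asr("your_audio.wav")["text"])
```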
## Training Details
| Setting | Value |
|---|---|
| Base model | openai/whisper-small |
| Language | Nagamese (Roman script) |
| LoRA rank / alpha | 32 / 64 |
| LoRA target modules | q_proj, v_proj, k_proj, out_proj, fc1, fc2 |
| Effective batch size | 8 |
| Max steps | 1200 |
| Learning rate | 0.001 |
| Precision | fp16 + 8-bit base |
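
A hypothetical sketch of how these settings map onto PEFT and transformers. Dataset loading, the data collator, and the `Seq2SeqTrainer` call are omitted; the per-device batch size / gradient-accumulation split and the LoRA dropout value are assumptions not stated in the table:

```python
# Hypothetical training setup matching the table above (not the exact script).
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    BitsAndBytesConfig,
    Seq2SeqTrainingArguments,
    WhisperForConditionalGeneration,
)

# 8-bit base model, as listed under "Precision".
base = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-small",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
base = prepare_model_for_kbit_training(base)

lora_cfg = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "v_proj", "k_proj", "out_proj", "fc1", "fc2"],
    lora_dropout=0.05,  # assumed; the card does not state a dropout value
)
model = get_peft_model(base, lora_cfg)

args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-nagamese",
    per_device_train_batch_size=2,    # split with accumulation is assumed;
    gradient_accumulation_steps=4,    # effective batch size 8, per the table
    max_steps=1200,
    learning_rate=1e-3,
    fp16=True,
)
```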