MWirelabs/ne-asr

A multilingual automatic speech recognition (ASR) model for eight Northeast Indian languages, fine-tuned from openai/whisper-medium.

Languages

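| Language | Code | Test WER |
|----------|------|----------|
| Khasi    | kha  | 16.89%   |
| Garo     | grt  | 9.31%    |
| Mizo     | lus  | 23.85%   |
| Nagamese | nag  | 49.13%   |
| Kokborok | trp  | 44.79%   |
| Assamese | asm  | 20.98%   |
| Chakma   | ccp  | 54.25%   |
| Wancho   | wao  | 68.37%   |
| Overall  |      | 36.06%   |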
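
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition. The function name `word_error_rate` is hypothetical, and the text normalization and scoring tool behind the figures above are not stated in this card, so this is not necessarily how those numbers were produced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against four reference words.
print(f"{word_error_rate('the cat sat down', 'the cat sit'):.2%}")  # 50.00%
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.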

Training Data

  • Vaani (ARTPARK-IISc/Vaani-transcription-part): 121,960 training samples across 7 languages
  • Proprietary MWire corpus: 28,524 training samples across 5 languages (Khasi, Garo, Mizo, Nagamese, Kokborok)
  • Total: 150,483 training samples

Training Details

  • Base model: openai/whisper-medium
  • Learning rate: 1e-5 with 500 warmup steps
  • Steps: 8,000
  • Batch size: 16 (gradient accumulation 2, effective 32)
  • Mixed precision: fp16
  • Language token: Whisper's Welsh token ("welsh") is used as a proxy for every language except Assamese, which has its own Whisper language token
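
The arithmetic implied by these settings can be sketched as follows. The numbers are copied from this card; the variable names are illustrative, and the linear shape of the warmup is an assumption, since the card states only the peak learning rate and the number of warmup steps.

```python
# Values copied from the Training Data and Training Details sections.
batch_size = 16
gradient_accumulation_steps = 2
max_steps = 8_000
warmup_steps = 500
peak_learning_rate = 1e-5
total_training_samples = 150_483

# Each optimizer step consumes batch_size * gradient_accumulation_steps samples.
effective_batch_size = batch_size * gradient_accumulation_steps
samples_seen = effective_batch_size * max_steps
approx_epochs = samples_seen / total_training_samples


def warmup_lr(step: int) -> float:
    """Learning rate during warmup, ASSUMING a linear ramp from 0 to the peak."""
    return peak_learning_rate * min(step, warmup_steps) / warmup_steps


print(effective_batch_size)     # 32
print(samples_seen)             # 256000
print(round(approx_epochs, 2))  # 1.7
```

Under these figures, the 8,000 steps correspond to roughly 1.7 passes over the combined training set.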

Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import soundfile as sf
import torch

processor = WhisperProcessor.from_pretrained("MWirelabs/ne-asr")
model = WhisperForConditionalGeneration.from_pretrained("MWirelabs/ne-asr")

# Whisper expects 16 kHz mono audio; resample beforehand if the file differs
audio, sr = sf.read("audio.wav", dtype="float32")
if audio.ndim > 1:
    audio = audio.mean(axis=1)  # downmix multi-channel audio to mono
if sr != 16000:
    raise ValueError(f"Expected 16 kHz audio, got {sr} Hz; resample first")
inputs = processor(audio, sampling_rate=sr, return_tensors="pt")

# Force the language token (use "welsh" for every language except Assamese)
forced_ids = processor.get_decoder_prompt_ids(language="welsh", task="transcribe")
with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features, forced_decoder_ids=forced_ids)

transcription = processor.decode(predicted_ids[0], skip_special_tokens=True)
print(transcription)
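
When transcribing several languages, the choice of language token can be wrapped in a small helper. The sketch below is illustrative only: `SUPPORTED_CODES` and `whisper_language_for` are hypothetical names, the codes are those listed in the Languages table, and mapping Assamese to Whisper's "assamese" token is an assumption based on the note that Assamese is the one language not using the Welsh proxy.

```python
# Language codes as listed in the Languages table of this card.
SUPPORTED_CODES = {"kha", "grt", "lus", "nag", "trp", "asm", "ccp", "wao"}


def whisper_language_for(code: str) -> str:
    """Return the Whisper language name to force for a given language code.

    ASSUMPTION: Assamese uses Whisper's own "assamese" token; every other
    supported language uses the "welsh" proxy described in this card.
    """
    if code not in SUPPORTED_CODES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return "assamese" if code == "asm" else "welsh"


print(whisper_language_for("kha"))  # welsh
print(whisper_language_for("asm"))  # assamese
```

The returned string can be passed as the `language` argument of `processor.get_decoder_prompt_ids` in the usage snippet in place of the hard-coded "welsh".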

Citation

Paper forthcoming. If you use this model, please cite:

@misc{mwirelabs2026nearsr,
  title={NE-MultiSpeech: Multilingual ASR for Northeast Indian Languages},
  author={MWire Labs},
  year={2026}
}

License

CC-BY-4.0. Developed by MWire Labs, Shillong, Meghalaya.
