Ethio-ASR 🇪🇹 💬
Ethio-ASR is a suite of multilingual Automatic Speech Recognition (ASR) models that support five Ethiopian languages: Amharic, Tigrinya, Afaan Oromo, Sidama, and Wolaytta. The ASR model in this repo was obtained by fine-tuning the pre-trained mms-1b model on the WAXAL Speech Dataset.
📌 marks the ASR model in this HF repo
| Model | # Params | Amharic | Tigrinya | Oromo | Wolaytta | Sidaama | Avg. |
|---|---|---|---|---|---|---|---|
| Ethio-ASR (afrihubert) | 92M | 30.95 | 42.42 | 27.57 | 40.44 | 34.02 | 35.08 |
| Ethio-ASR (mms-300) | 300M | 30.19 | 41.62 | 26.41 | 39.10 | 32.66 | 33.99 |
| Ethio-ASR (mms-1b) 📌 | 1B | 26.14 | 37.63 | 23.69 | 37.51 | 31.02 | 31.20 |
| Ethio-ASR (w2v-bert-2.0) | 600M | 22.92 | 35.22 | 24.44 | 38.19 | 31.65 | 30.48 |
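The Avg. column appears to be the unweighted mean of the five per-language scores. That reading is an inference from the numbers, not something the card states. A minimal sketch that recomputes it from the table values (the dictionary keys are shorthand labels chosen for this example):

```python
# Per-language scores copied from the table, in column order:
# Amharic, Tigrinya, Oromo, Wolaytta, Sidaama. The last value is the listed Avg.
table = {
    "afrihubert":   ([30.95, 42.42, 27.57, 40.44, 34.02], 35.08),
    "mms-300":      ([30.19, 41.62, 26.41, 39.10, 32.66], 33.99),
    "mms-1b":       ([26.14, 37.63, 23.69, 37.51, 31.02], 31.20),
    "w2v-bert-2.0": ([22.92, 35.22, 24.44, 38.19, 31.65], 30.48),
}

# Unweighted (macro) average over the five languages for each model
means = {name: sum(scores) / len(scores) for name, (scores, _) in table.items()}

for name, (_, listed) in table.items():
    print(f"{name}: recomputed={means[name]:.3f} listed={listed:.2f}")
```

The recomputed means agree with the listed values to within 0.01. The small gap for mms-300 is consistent with the average having been taken over unrounded per-language scores.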
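```python
from transformers import AutoModelForCTC, AutoProcessor
import torchaudio, torch

# Load the processor (feature extractor + tokenizer) and the fine-tuned CTC model
processor = AutoProcessor.from_pretrained("badrex/Ethio-ASR-multilingual-1B")
model = AutoModelForCTC.from_pretrained("badrex/Ethio-ASR-multilingual-1B")

# MMS/wav2vec 2.0 models expect 16 kHz mono input: downmix and resample if needed
audio, sr = torchaudio.load("audio.wav")
audio = audio.mean(dim=0)
if sr != 16_000:
    audio = torchaudio.functional.resample(audio, sr, 16_000)
inputs = processor(audio.numpy(), sampling_rate=16_000, return_tensors="pt")

# Greedy CTC decoding: pick the most likely token per frame, then decode to text
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```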
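The decoding step in the snippet takes the argmax token for each frame and lets the processor turn those ids into text. For a CTC model this conventionally means merging consecutive repeated ids and then dropping the blank symbol. The sketch below illustrates that rule in plain Python. The toy vocabulary and `blank_id=0` are assumptions made for illustration and are not the model's actual vocabulary.

```python
from itertools import groupby

def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Turn per-frame argmax ids into a label sequence (greedy CTC rule)."""
    # Step 1: merge runs of identical consecutive ids into a single id
    merged = [token for token, _ in groupby(frame_ids)]
    # Step 2: drop the blank symbol, which only separates repeated labels
    return [token for token in merged if token != blank_id]

# Hypothetical toy vocabulary; id 0 plays the role of the CTC blank
vocab = {0: "<blank>", 1: "s", 2: "e", 3: "l", 4: "a", 5: "m"}

frames = [1, 1, 0, 2, 2, 3, 0, 4, 4, 0, 5]
labels = ctc_greedy_collapse(frames)
text = "".join(vocab[i] for i in labels)
print(text)
```

A blank between two identical ids keeps them as separate labels, so `[3, 0, 3]` collapses to two labels, not one.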
Performance might vary across dialects, genders, ages, and recording quality.
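One way to check how the model behaves on your own recordings is to score its transcriptions against reference transcripts. The sketch below computes word error rate (WER), a common ASR metric, using word-level edit distance. It is an illustrative helper, not the evaluation code behind the table above, and the example strings are placeholders.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

# Placeholder strings: one substituted word and one dropped word out of four
wer = word_error_rate("a b c d", "a x c")
print(f"WER: {wer:.2f}")
```

Averaging this score separately per dialect, speaker group, or recording condition gives a rough picture of where performance differs.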
@misc{ethio_asr_2026,
author = {
Abdullah, Badr M. and
Azime, Israel Abebe and
Tonja, Atnafu Lambebo and
Alabi, Jesujoba O. and
Alemu, Abel Mulat and
Hagos, Eyob G. and
Balcha, Bontu Fufa and
Nerea, Mulubrhan A. and
Yadeta, Debela Desalegn and
Marilign, Dagnachew Mekonnen and
Fentahun, Amanuel Temesgen and
Kebede, Tadesse and
Gebru, Israel D. and
Woldeyohannis, Michael Melese and
Sewunetie, Walelign Tewabe and
Möbius, Bernd and
Klakow, Dietrich
},
title = {Ethio-ASR: Joint Multilingual Speech Recognition and Language Identification for Ethiopian Languages},
year = {2026},
howpublished = {\url{https://huggingface.co/badrex/Ethio-ASR-multilingual-1B}}
}
Base model: facebook/mms-1b