google/WaxalNLP
How to use badrex/Ethio-ASR-multilingual-600M with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="badrex/Ethio-ASR-multilingual-600M")
# Load model directly
from transformers import AutoProcessor, AutoModelForCTC
processor = AutoProcessor.from_pretrained("badrex/Ethio-ASR-multilingual-600M")
model = AutoModelForCTC.from_pretrained("badrex/Ethio-ASR-multilingual-600M")
Ethio-ASR is a suite of multilingual Automatic Speech Recognition (ASR) models that support five Ethiopian languages: Amharic, Tigrinya, Afaan Oromo, Sidaama, and Wolaytta. The ASR model in this repo was obtained by fine-tuning the pre-trained wav2vec2-bert-2.0 model on the WAXAL Speech Dataset.
★ ASR model in this HF repo
| Model | # Params | Amharic | Tigrinya | Oromo | Wolaytta | Sidaama | Avg. |
|---|---|---|---|---|---|---|---|
| Ethio-ASR (afrihubert) | 92M | 30.95 | 42.42 | 27.57 | 40.44 | 34.02 | 35.08 |
| Ethio-ASR (mms-300) | 300M | 30.19 | 41.62 | 26.41 | 39.10 | 32.66 | 33.99 |
| Ethio-ASR (mms-1b) | 1B | 26.14 | 37.63 | 23.69 | 37.51 | 31.02 | 31.20 |
| Ethio-ASR (w2v-bert-2.0) ★ | 600M | 22.92 | 35.22 | 24.44 | 38.19 | 31.65 | 30.48 |
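The Avg. column appears to be the unweighted mean of the five per-language scores. The sketch below recomputes it from the table values under that assumption; the section does not state how the average was computed or name the metric, and `macro_average` and `max_avg_gap` are illustrative helper names.

```python
# Per-language scores and the reported Avg. value, copied from the table above.
# Assumption: Avg. is the unweighted mean over the five languages.
scores = {
    "afrihubert": ([30.95, 42.42, 27.57, 40.44, 34.02], 35.08),
    "mms-300": ([30.19, 41.62, 26.41, 39.10, 32.66], 33.99),
    "mms-1b": ([26.14, 37.63, 23.69, 37.51, 31.02], 31.20),
    "w2v-bert-2.0": ([22.92, 35.22, 24.44, 38.19, 31.65], 30.48),
}


def macro_average(values):
    """Unweighted mean of the per-language scores."""
    return sum(values) / len(values)


def max_avg_gap(table):
    """Largest absolute gap between the recomputed and the reported average."""
    return max(abs(macro_average(values) - reported) for values, reported in table.values())


for name, (values, reported) in scores.items():
    print(f"{name}: recomputed {macro_average(values):.3f}, reported {reported:.2f}")
```

The recomputed means agree with the reported column to within 0.01, which is consistent with the averages having been taken over unrounded per-language scores.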
| # | Language | Human Transcription | ASR Transcription | |
|---|---|---|---|---|
| 1 | Oromo | Suuraan asii gaditti argaa jirru kun lafa gurgurtaa kuduraa fi muduraa dha. Kuduraa fi muduraan nyaataaf kan baay'ee namatti toluudha. Nyaachuudhaaf illee kuduraa fi muduraan baay'ee filataamaa dha. Kanaaf, kuduraa fi muduraa kana baay'een jaalladha. Baay'ee nyaachuuf illee baay'een fedha guddaa qaba. | [ORM] suurran asii gaditti argaa jirru kun lafa gurgurtaa kuduraafi muduraadha. kuduraaf muduraa nyaataaf kan baay'ee namatti toludha. nyaachuudhaaf illee kuduraaf muduraan baay'ee filatamaadha. kanaaf kuduraaf muduraa kan baay'een jaalladha baay'ee nyaachuuf illee baay'een fedha guddaa qaba. | |
| 2 | Amharic | (Amharic text in Ge'ez script, not recoverable here) | [AMH] (Amharic text in Ge'ez script, not recoverable here) | |
| 3 | Wolaytta | Issi heeraani asaa naatussi dumma dumma ayfiyan be'iyobati de'oososna. Hegeetuppekka meretaanne merettaa heeraara gayttiyabata be'iyaba gidikko issi heeraani mitatta woykko dumma dumma adil'e ciishshata be'iyode keehippe lo'oosonanne ubbakka ufaysoosona. | [WAL] issi heeran asaa naatussi dumma dumma ayfiyan be'iyobati de'oosona. hegeetuppekka meretaanne meretaa heeraara gayttiyaabata be'iyaaba gidikko issi heeran mittata woykko dumma dumma adile ciishshabata be'iyode keehippe lo'oosonanne ubbakka ufayssoosona. | |
| 4 | Tigrinya | (Tigrinya text in Ge'ez script, not recoverable here) | [TIR] (Tigrinya text in Ge'ez script, not recoverable here) | |
| 5 | Oromo | Fakkii kanarraa kan arginu hangafa Oromoo kan ta'e godina Booranaa keessatti uffata naannoo godina Booranaatiin faayamanii abbootiin bokkuu isaanii qabatanii, haadholiin immoo siinqee isaanii qabatanii kan dhaabbachaa jiranidha. | [ORM] fakkii kanarraa kan arginu hangafa oromoo kan ta'e godina booranaa keessatti uffata naannoo godina booranaatiin faayamanii abbootiin bokkuu isaanii qabatanii haadholiin immoo siiqee isaanii qabatanii kan dhaabbachaa jiranidha. | |
| 6 | Sidaama | Daganna maate yaa mayyaate? Daganna maate yee su'ma fushshihu ayeeti? Daganna maate hiiko heedhanno? Daganna maate yinihu mayiraati? Daganna maate ayira horo uyitanno? | [SID] daganna maate yaa mayyaate? daganna maate yee su'ma fushshihu ayeeti? daganna maate hiikko heedhanno? daganna maate yinihu mayraati? daganna maate ayira horo uyitanno |
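In the examples above, each ASR transcription starts with a bracketed language tag such as `[ORM]` or `[SID]`, followed by lowercased text. A minimal sketch for separating the tag from the text, assuming the tag always takes this leading `[XXX]` form; `split_language_tag` and `KNOWN_TAGS` are illustrative names, and the tag set is only what is visible in the table.

```python
import re

# Tags observed in the example table; assumed (not confirmed) to be the full set.
KNOWN_TAGS = {"AMH", "TIR", "ORM", "WAL", "SID"}

# A leading three-letter tag in square brackets, followed by the transcription text.
_TAG_RE = re.compile(r"^\s*\[([A-Z]{3})\]\s*(.*)$", re.DOTALL)


def split_language_tag(transcription):
    """Return (tag, text); tag is None when no known leading tag is found."""
    match = _TAG_RE.match(transcription)
    if match is None or match.group(1) not in KNOWN_TAGS:
        return None, transcription.strip()
    return match.group(1), match.group(2).strip()


tag, text = split_language_tag("[SID] daganna maate yaa mayyaate?")
print(tag, "|", text)
```

Returning `None` for unknown or missing tags keeps the text intact, so downstream code can decide how to handle outputs without a recognizable language label.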
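from transformers import AutoModelForCTC, AutoProcessor
import torch
import torchaudio
# Load the processor (feature extractor + tokenizer) and the CTC model
processor = AutoProcessor.from_pretrained("badrex/Ethio-ASR-multilingual-600M")
model = AutoModelForCTC.from_pretrained("badrex/Ethio-ASR-multilingual-600M")
# Load the audio, downmix to mono, and resample to the rate the feature extractor expects
audio, sr = torchaudio.load("audio.wav")
target_sr = processor.feature_extractor.sampling_rate
audio = torchaudio.functional.resample(audio.mean(dim=0), sr, target_sr)
inputs = processor(audio.numpy(), sampling_rate=target_sr, return_tensors="pt")
# Greedy decoding: pick the most likely token per frame, then decode to text
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)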
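The `argmax` and `batch_decode` steps above amount to greedy CTC decoding: take the most likely token for each frame, merge consecutive repeats, and drop the blank token. The sketch below illustrates that collapse on a toy sequence; the vocabulary and `blank_id=0` are hypothetical and do not reflect the model's actual tokenizer.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Merge consecutive duplicate ids, then drop the blank id."""
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    return [token_id for token_id in merged if token_id != blank_id]


# Hypothetical toy vocabulary; the real model's vocabulary and blank id differ.
toy_vocab = {0: "<blank>", 1: "a", 2: "b", 3: "c"}

# One id per audio frame, as produced by an argmax over the logits.
frame_ids = [0, 1, 1, 0, 1, 2, 2, 0, 0, 3]
token_ids = ctc_greedy_collapse(frame_ids)
print("".join(toy_vocab[i] for i in token_ids))  # prints "aabc"
```

The blank between the two runs of `1` is what lets the output keep a genuine double letter: repeats are merged before blanks are removed, not after.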
Performance might vary across dialects, genders, ages, and recording quality.
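One way to check for such variation on your own data is to break the error rate down by speaker or recording group. The sketch below assumes word error rate (WER) as the metric, which this section does not confirm, and uses placeholder strings and group names rather than real evaluation data; `word_errors` and `wer_by_group` are illustrative helpers.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Word-level Levenshtein distance between two whitespace-tokenized strings."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """samples: iterable of (group, reference, hypothesis). Returns {group: WER in %}."""
    errors, words = defaultdict(int), defaultdict(int)
    for group, reference, hypothesis in samples:
        errors[group] += word_errors(reference, hypothesis)
        words[group] += len(reference.split())
    return {g: 100.0 * errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical placeholder data, not real evaluation results.
toy_samples = [
    ("group_a", "one two three four", "one two three four"),
    ("group_a", "five six", "five sick"),
    ("group_b", "one two three four", "one three four five"),
]
print(wer_by_group(toy_samples))
```

Errors and reference words are pooled per group before dividing, so long utterances weigh more than short ones within each group.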
@misc{ethio_asr_2026,
author = {
Abdullah, Badr M. and
Azime, Israel Abebe and
Tonja, Atnafu Lambebo and
Alabi, Jesujoba O. and
Alemu, Abel Mulat and
Hagos, Eyob G. and
Balcha, Bontu Fufa and
Nerea, Mulubrhan A. and
Yadeta, Debela Desalegn and
Marilign, Dagnachew Mekonnen and
Fentahun, Amanuel Temesgen and
Kebede, Tadesse and
Gebru, Israel D. and
Woldeyohannis, Michael Melese and
Sewunetie, Walelign Tewabe and
Möbius, Bernd and
Klakow, Dietrich
},
title = {Ethio-ASR: Joint Multilingual Speech Recognition and Language Identification for Ethiopian Languages},
year = {2026},
howpublished = {\url{https://huggingface.co/badrex/Ethio-ASR-multilingual-600M}}
}
Base model
facebook/w2v-bert-2.0