---
language:
- ak
license: cc-by-nc-4.0
base_model: katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment
tags:
- whisper
- lora
- asante-twi
- akan
- speech-recognition
- peft
datasets:
- mozilla-foundation/common_voice_11_0
- michsethowusu/twi_multispeaker_audio_transcribed
---
# Whisper Large v3 Turbo — Asante Twi LoRA Adapter (R14)
Fine-tuned LoRA adapter for Asante Twi automatic speech recognition, built on top of
`katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment`.
**WER: 17.5%** on the LVP held-out eval set (pilot-ready threshold: < 22%)
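The reported score is standard word error rate; a minimal check with the `jiwer` package looks like the sketch below (placeholder strings; the project's evaluation script and text normalization may differ):

```python
from jiwer import wer

references = ["reference transcript one", "reference transcript two"]  # ground-truth transcripts
hypotheses = ["model output one", "model output two"]                   # ASR predictions

print(f"WER: {wer(references, hypotheses):.1%}")  # fraction of word-level edits
```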
## Training Data
| Dataset | Role | Notes |
|---|---|---|
| LVP real recordings (private) | Training + eval | Collected via the Rootal Audio Annotation Platform (rootal.ai); available on request |
| LVP synthetic QA (private) | Training | TTS-generated Twi Q&A pairs |
| Common Voice Akan | Training | Mozilla CC0 |
| Financial Inclusion Speech Dataset (Ashesi) | Training (200 samples) | See citation below |
| michsethowusu/twi_multispeaker_audio_transcribed | Eval-only diagnostic | Excluded from training — transcription style mismatch |
## Training Configuration
- **Base model**: `katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment`
- **LoRA**: rank=32, alpha=64, targets: q/k/v/out_proj + fc1/fc2 (see the configuration sketch below)
- **Language**: `None` (Twi is not among Whisper's language tokens, so no language prefix token is used)
- **Anti-hallucination**: `condition_on_prev_tokens=False`, `repetition_penalty=1.2`
- **Quantization**: 8-bit (BitsAndBytes)
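
For reference, a minimal sketch of the LoRA and quantization setup described in the bullets above (the `task_type` value is an assumption; this is not the exact training script):

```python
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# LoRA rank 32, alpha 64, applied to the attention and MLP projections
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"],
    task_type="SEQ_2_SEQ_LM",  # assumption: not stated in the card
)

# 8-bit base-model quantization via BitsAndBytes
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
```

## Usage

Load the fine-tuned base model, attach this adapter with PEFT, and transcribe 16 kHz audio. This is a sketch: the processor is loaded from the base repository (an assumption; fall back to `openai/whisper-large-v3-turbo` if processor files are missing there), the audio path is a placeholder, and the generation arguments mirror the anti-hallucination settings above.

```python
import librosa
import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor

BASE = "katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment"
ADAPTER = "rootabytes/Rootal-Twi-ASR"

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Load the fine-tuned base model and attach the LoRA adapter
base_model = WhisperForConditionalGeneration.from_pretrained(BASE, torch_dtype=dtype).to(device)
model = PeftModel.from_pretrained(base_model, ADAPTER)
processor = WhisperProcessor.from_pretrained(BASE)

# 16 kHz mono audio; the file name is a placeholder
audio, _ = librosa.load("twi_sample.wav", sr=16000)
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
input_features = input_features.to(device, dtype=dtype)

# No language prefix token (Twi is not a Whisper language); anti-hallucination settings as above
generated_ids = model.generate(
    input_features=input_features,
    repetition_penalty=1.2,
    condition_on_prev_tokens=False,
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```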
## Citation
If you use this adapter, please cite:
```bibtex
@misc{aguyatimothy2025asantetwi,
author = {Timothy Aguya, Akasiya},
title = {Whisper Large v3 Turbo — Asante Twi LoRA Adapter},
year = {2026},
publisher = {HuggingFace},
url = {https://huggingface.co/rootabytes/whisper-large-v3-turbo-asante-twi-lvp}
}
@misc{financialinclusion2022,
author = {Asamoah Owusu, D. and Korsah, A. and Quartey, B. and Nwolley Jnr., S.
and Sampah, D. and Adjepon-Yamoah, D. and Omane Boateng, L.},
title = {Financial Inclusion Speech Dataset},
year = {2022},
publisher = {Ashesi University and Nokwary Technologies},
url = {https://github.com/Ashesi-Org/Financial-Inclusion-Speech-Dataset}
}
@inproceedings{ardila2020common,
title = {Common Voice: A Massively-Multilingual Speech Corpus},
author = {Ardila, Rosana and others},
booktitle = {LREC},
year = {2020}
}
@article{radford2022robust,
title = {Robust Speech Recognition via Large-Scale Weak Supervision},
author = {Radford, Alec and others},
journal = {arXiv:2212.04356},
year = {2022}
}
@article{hu2021lora,
title = {LoRA: Low-Rank Adaptation of Large Language Models},
author = {Hu, Edward J and others},
journal = {arXiv:2106.09685},
year = {2021}
}
```