---
language:
- ak
license: cc-by-nc-4.0
base_model: katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment
tags:
- whisper
- lora
- asante-twi
- akan
- speech-recognition
- peft
datasets:
- mozilla-foundation/common_voice_11_0
- michsethowusu/twi_multispeaker_audio_transcribed
---

# Whisper Large v3 Turbo — Asante Twi LoRA Adapter (R14)

Fine-tuned LoRA adapter for Asante Twi automatic speech recognition, built on top of `katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment`.

**WER: 17.5%** on LVP held-out eval set (Pilot-ready threshold: <22%)

## Training Data

| Dataset | Role | Notes |
|---|---|---|
| LVP real recordings (private) | Training + eval | Collected via Rootal Audio Annotation Platform @rootal.ai; available on request |
| LVP synthetic QA (private) | Training | TTS-generated Twi Q&A pairs |
| Common Voice Akan | Training | Mozilla CC0 |
| Financial Inclusion Speech Dataset (Ashesi) | Training (200 samples) | See citation below |
| michsethowusu/twi_multispeaker_audio_transcribed | Eval-only diagnostic | Excluded from training: transcription style mismatch |

## Training Configuration

- **Base model**: `katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment`
- **LoRA**: rank=32, alpha=64, targets: q/k/v/out_proj + fc1/fc2
- **Language**: `None` (Twi not in Whisper vocab, so no language prefix token)
- **Anti-hallucination**: `condition_on_prev_tokens=False`, `repetition_penalty=1.2`
- **Quantization**: 8-bit (BitsAndBytes)

## Citation

If you use this adapter, please cite:

```bibtex
@misc{aguyatimothy2025asantetwi,
  author    = {Timothy Aguya, Akasiya},
  title     = {Whisper Large v3 Turbo — Asante Twi LoRA Adapter},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/rootabytes/whisper-large-v3-turbo-asante-twi-lvp}
}

@misc{financialinclusion2022,
  author    = {Asamoah Owusu, D. and Korsah, A. and Quartey, B. and Nwolley Jnr., S. and Sampah, D. and Adjepon-Yamoah, D. and Omane Boateng, L.},
  title     = {Financial Inclusion Speech Dataset},
  year      = {2022},
  publisher = {Ashesi University and Nokwary Technologies},
  url       = {https://github.com/Ashesi-Org/Financial-Inclusion-Speech-Dataset}
}

@inproceedings{ardila2020common,
  title     = {Common Voice: A Massively-Multilingual Speech Corpus},
  author    = {Ardila, Rosana and others},
  booktitle = {LREC},
  year      = {2020}
}

@article{radford2022robust,
  title   = {Robust Speech Recognition via Large-Scale Weak Supervision},
  author  = {Radford, Alec and others},
  journal = {arXiv:2212.04356},
  year    = {2022}
}

@article{hu2021lora,
  title   = {LoRA: Low-Rank Adaptation of Large Language Models},
  author  = {Hu, Edward J and others},
  journal = {arXiv:2106.09685},
  year    = {2021}
}
```
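
## Configuration Sketch

The settings listed under Training Configuration could be expressed with `peft` and `transformers` roughly as follows. This is a minimal sketch, not the actual training script. The rank, alpha, target modules, generation settings, and 8-bit loading come from this card. The helper names (`lora_scaling`, `build_peft_model`), the `device_map="auto"` choice, and the use of library defaults for anything the card does not state (dropout, bias, and so on) are assumptions made for illustration.

```python
# Values taken from the Training Configuration section of this card.
BASE_MODEL = "katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment"

LORA_KWARGS = {
    "r": 32,
    "lora_alpha": 64,
    # Whisper attention projections plus the feed-forward layers.
    "target_modules": ["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"],
}

# Anti-hallucination settings from the card, intended to be passed to `model.generate(...)`.
GENERATE_KWARGS = {
    "condition_on_prev_tokens": False,
    "repetition_penalty": 1.2,
}


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update (alpha / r)."""
    return lora_alpha / r


def build_peft_model():
    """Hypothetical helper: wrap the 8-bit base model with LoRA adapters.

    Requires transformers, peft, and bitsandbytes. Hyperparameters not listed
    on the card are left at library defaults (an assumption).
    """
    # Local imports so the constants above can be inspected without these packages.
    from peft import LoraConfig, get_peft_model
    from transformers import BitsAndBytesConfig, WhisperForConditionalGeneration

    base = WhisperForConditionalGeneration.from_pretrained(
        BASE_MODEL,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",  # assumption: let accelerate place the weights
    )
    return get_peft_model(base, LoraConfig(**LORA_KWARGS))


if __name__ == "__main__":
    print(lora_scaling(LORA_KWARGS["r"], LORA_KWARGS["lora_alpha"]))  # 2.0
```

With rank 32 and alpha 64, the low-rank update is scaled by 2.0 under the default LoRA scaling rule.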