---
language:
- ak
license: cc-by-nc-4.0
base_model: katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment
tags:
- whisper
- lora
- asante-twi
- akan
- speech-recognition
- peft
datasets:
- mozilla-foundation/common_voice_11_0
- michsethowusu/twi_multispeaker_audio_transcribed
---

# Whisper Large v3 Turbo — Asante Twi LoRA Adapter (R14)

Fine-tuned LoRA adapter for Asante Twi automatic speech recognition, built on top of
`katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment`.

**WER: 17.5%** on LVP held-out eval set (Pilot-ready threshold: <22%)

## Training Data

| Dataset | Role | Notes |
|---|---|---|
| LVP real recordings (private) | Training + eval | Collected via Rootal Audio Annotation Platform @rootal.ai; available on request |
| LVP synthetic QA (private) | Training | TTS-generated Twi Q&A pairs |
| Common Voice Akan | Training | Mozilla CC0 |
| Financial Inclusion Speech Dataset (Ashesi) | Training (200 samples) | See citation below |
| michsethowusu/twi_multispeaker_audio_transcribed | Eval-only diagnostic | Excluded from training — transcription style mismatch |

## Training Configuration

- **Base model**: `katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment`
- **LoRA**: rank=32, alpha=64, target modules: `q_proj`, `k_proj`, `v_proj`, `out_proj`, `fc1`, `fc2`
- **Language**: `None` (Twi is not in Whisper's language set, so no language prefix token is used)
- **Anti-hallucination**: `condition_on_prev_tokens=False`, `repetition_penalty=1.2`
- **Quantization**: 8-bit (BitsAndBytes)

## Citation

If you use this adapter, please cite:

```bibtex
@misc{aguyatimothy2025asantetwi,
  author    = {Timothy Aguya, Akasiya},
  title     = {Whisper Large v3 Turbo — Asante Twi LoRA Adapter},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/rootabytes/whisper-large-v3-turbo-asante-twi-lvp}
}

@misc{financialinclusion2022,
  author    = {Asamoah Owusu, D. and Korsah, A. and Quartey, B. and Nwolley Jnr., S.
               and Sampah, D. and Adjepon-Yamoah, D. and Omane Boateng, L.},
  title     = {Financial Inclusion Speech Dataset},
  year      = {2022},
  publisher = {Ashesi University and Nokwary Technologies},
  url       = {https://github.com/Ashesi-Org/Financial-Inclusion-Speech-Dataset}
}

@inproceedings{ardila2020common,
  title     = {Common Voice: A Massively-Multilingual Speech Corpus},
  author    = {Ardila, Rosana and others},
  booktitle = {LREC},
  year      = {2020}
}

@article{radford2022robust,
  title   = {Robust Speech Recognition via Large-Scale Weak Supervision},
  author  = {Radford, Alec and others},
  journal = {arXiv:2212.04356},
  year    = {2022}
}

@article{hu2021lora,
  title   = {LoRA: Low-Rank Adaptation of Large Language Models},
  author  = {Hu, Edward J and others},
  journal = {arXiv:2106.09685},
  year    = {2021}
}