---
license: apache-2.0
datasets:
- Kennethdot/Ghana_English-Twi_Code-switching_ASR
language:
- en
- tw
base_model:
- GiftMark/akan-whisper-model
---

# English–Twi Code-Switching ASR Model - Kasanoma

## Model Overview

This model is a fine-tuned Automatic Speech Recognition (ASR) system designed for English–Twi code-switching speech transcription. It is built on a pretrained Akan-adapted Whisper model and further fine-tuned on a curated bilingual dataset containing English, Twi, and mixed-language utterances.

The model supports natural bilingual speech, including intra-sentential and inter-sentential code-switching.

---

## Base Model

- `GiftMark/akan-whisper-model`

---

## Task

- Automatic Speech Recognition (ASR)
- Code-switching speech transcription
- English and Twi bilingual speech recognition

---

## Dataset

- `Kennethdot/Ghana_English-Twi_Code-switching_ASR`

The dataset contains:

- Code-switched English–Twi speech
- Monolingual English and Twi speech
- Read and semi-spontaneous utterances
- Carefully transcribed bilingual speech with preserved linguistic structure

---

## Evaluation Setup

Evaluation was performed using **Word Error Rate (WER)** without text normalization. This means:

- No lowercasing
- No punctuation removal
- No orthographic normalization applied

WER reflects raw transcription fidelity.
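As a rough illustration of what unnormalized WER measures, the sketch below computes word-level edit distance over whitespace-split tokens, so differences in case or punctuation count as errors. The function name `word_error_rate` is illustrative, and the evaluation tooling actually used for this model is not specified here; this is only a minimal reference implementation of the standard definition. Because insertions are counted against the reference length, WER can exceed 100, as the last call shows.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent, splitting on whitespace only (no normalization)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


reference = "Wo nim sɛ I almost forgot to buy the food?"

# Identical strings: no errors.
exact = word_error_rate(reference, reference)

# Same words, but lowercased "wo" and a missing "?" count as two substitutions.
raw = word_error_rate(reference, "wo nim sɛ I almost forgot to buy the food")

# Three inserted words against a two-word reference push WER above 100.
inflated = word_error_rate("the food", "the the the food food")

print(exact, raw, inflated)
```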

---

## Results

| Model | CS WER (%) | Twi WER (%) | English WER (%) |
|-------|------------|-------------|-----------------|
| Zero-shot Akan Whisper Small | 127.08 | 116.08 | 110.26 |
| Fine-tuned Model | **6.58** | 99.44 | 100.43 |

CS denotes code-switched utterances. Lower is better.

---

## Key Findings

- Fine-tuning reduces code-switching WER from 127.08 to **6.58**
- The model achieves strong performance on bilingual utterances after adaptation
- Monolingual WER changes only modestly (Twi 116.08 to 99.44, English 110.26 to 100.43) and stays near 100, indicating limited cross-language transfer gain
- Code-switching appears to be the most learnable and most improved component of the task

---

## Qualitative Examples

The model can produce fluent bilingual output with preserved punctuation and natural speech patterns:

**Example 1**

Twitwa enam no into small pieces for the light soup.

**Example 2**

just realized that w'abusua yɛ Ɔyoko, so you are royalty.

**Example 3**

Wo nim sɛ I almost forgot to buy the food?

---

## Limitations

- The model is sensitive to orthographic variation and punctuation
- Some degradation occurs on highly monolingual segments after fine-tuning
- The training data needs further balancing across languages

---

## Intended Use

- Code-switching ASR research
- Low-resource African language speech recognition
- Bilingual speech transcription systems
- Linguistic analysis of English–Twi speech patterns

---

## Ethical Considerations

- The model is intended for research and educational use only
- It should not be used for surveillance or unauthorized speech monitoring
- Bias may exist due to dataset imbalance between languages