---
library_name: transformers
pipeline_tag: automatic-speech-recognition
---
# NextInnoMind / next_bemba_ai
**Bemba Whisper ASR (Automatic Speech Recognition)**
Fine-tuned Whisper model for the Bemba language only.
Developed and maintained by **NextInnoMind**, led by **Chalwe Silas**.
---
### Model Type
`WhisperForConditionalGeneration`, fine-tuned from [openai/whisper-small](https://huggingface.co/openai/whisper-small)
Framework: `Transformers`
Checkpoint Format: `Safetensors`
Language: `Bemba`
---
## Model Description
This model is a Whisper Small variant fine-tuned exclusively for **Bemba**, a major Zambian language. It is designed to enhance local language ASR performance and promote indigenous language technology.
---
## Training Details
* **Base Model**: [`openai/whisper-small`](https://huggingface.co/openai/whisper-small)
* **Dataset**:
* BembaSpeech (curated dataset of Bemba audio + transcripts)
* **Training**: 8 epochs (~45 hours on an A100 GPU)
* **Learning Rate**: 1e-5
* **Batch Size**: 16
* **Framework**: Transformers + Accelerate
* **Tokenizer**: WhisperProcessor with `task="transcribe"` (no language token used)
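For reference, the hyperparameters above might map onto a `Seq2SeqTrainingArguments` configuration roughly like this. This is a sketch only; the exact training script is not published, and the `output_dir` and `fp16` values are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the configuration implied by the
# hyperparameters listed above (not the actual training script).
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-bemba",  # assumed output path
    per_device_train_batch_size=16,      # batch size from the card
    learning_rate=1e-5,                  # learning rate from the card
    num_train_epochs=8,                  # 8 epochs from the card
    fp16=True,                           # assumed; typical on an A100
    predict_with_generate=True,          # decode with generate() for eval
    report_to="none",
)
```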
---
## Usage
```python
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="NextInnoMind/next_bemba_ai",
    chunk_length_s=30,       # process long audio in 30 s chunks
    return_timestamps=True,
)

# Transcribe a Bemba audio file
result = pipe("path_to_audio.wav")
print(result["text"])
```
> Tip: No language token is required; the model is fine-tuned for Bemba only.
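Whisper checkpoints expect 16 kHz mono input. The pipeline resamples file inputs automatically, but if you pass raw sample arrays you must resample them yourself. A minimal linear-interpolation sketch is shown below for illustration; in practice a library such as `librosa` or `torchaudio` is the better choice.

```python
def resample(samples, src_rate, dst_rate):
    """Resample a mono float sequence by linear interpolation."""
    if src_rate == dst_rate:
        return list(samples)
    ratio = src_rate / dst_rate          # input samples per output sample
    n_out = int(len(samples) / ratio)
    out = []
    for i in range(n_out):
        pos = i * ratio                  # fractional position in the input
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

# e.g. downsample 44.1 kHz audio to the 16 kHz Whisper expects:
# audio_16k = resample(audio_44k, 44100, 16000)
```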
---
## Applications
* **Education**: Local language transcriptions and learning tools
* **Broadcast & Media**: Transcribe Bemba radio and TV shows
* **Research**: Bantu language documentation and analysis
* **Accessibility**: Voice-to-text systems in local apps and platforms
---
## Limitations & Biases
* Trained only on Bemba: does not support English or other languages.
* Accuracy may drop with heavy background noise or strong dialectal variation.
* Not optimized for code-switching or informal speech styles.
---
## Evaluation
| Language | WER (Word Error Rate) | Dataset |
| -------- | --------------------- | -------------------- |
| Bemba | ~16.7% | BembaSpeech Eval Set |
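WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. A self-contained sketch of the metric follows; in practice libraries such as `jiwer` or `evaluate` are normally used.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words via dynamic programming
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```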
---
## Environmental Impact
* **Hardware**: 1× NVIDIA A100 (40 GB)
* **Training Time**: ~45 hours
* **Carbon Emissions**: Estimated ~20.4 kg CO₂
*(via [ML CO2 Impact](https://mlco2.github.io/impact))*
---
## Citation
```bibtex
@misc{nextbembaai2025,
  title={NextInnoMind next_bemba_ai: Whisper-based ASR model for Bemba},
  author={Silas Chalwe and NextInnoMind},
  year={2025},
  howpublished={\url{https://huggingface.co/NextInnoMind/next_bemba_ai}},
}
```
---
## Maintainers
* **Chalwe Silas** (Lead Developer & Dataset Curator)
* Team **NextInnoMind**
Contact:
* [silaschalwe@outlook.com](mailto:silaschalwe@outlook.com)
* [mchalwesilas@gmail.com](mailto:mchalwesilas@gmail.com)
GitHub: [SilasChalwe](https://github.com/SilasChalwe)
---
## Related Resources
* [BembaSpeech Dataset](https://huggingface.co/datasets/NextInnoMind/BembaSpeech)
* [NextInnoMind on GitHub](https://github.com/SilasChalwe)
---
Fine-tuned in Zambia.