---
library_name: transformers
pipeline_tag: automatic-speech-recognition
language:
- bem
base_model: openai/whisper-small
---

<p align="center">
<img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="80" />
</p>
|
|
# 🎙️ NextInnoMind / next\_bemba\_ai
|
|
**Bemba Whisper ASR (Automatic Speech Recognition)**
Fine-tuned Whisper model for the Bemba language only.
Developed and maintained by **NextInnoMind**, led by **Chalwe Silas**.
|
|
---

## 🧪 Model Type

* **Architecture**: `WhisperForConditionalGeneration`, fine-tuned from [openai/whisper-small](https://huggingface.co/openai/whisper-small)
* **Framework**: Transformers
* **Checkpoint format**: Safetensors
* **Language**: Bemba
|
|
---

## 📝 Model Description

This model is a Whisper Small variant fine-tuned exclusively for **Bemba**, a major Zambian language. It is designed to improve local-language ASR performance and promote indigenous language technology.
|
|
---

## 🏋️ Training Details

* **Base Model**: [`openai/whisper-small`](https://huggingface.co/openai/whisper-small)
* **Dataset**: BembaSpeech (curated dataset of Bemba audio + transcripts)
* **Training Time**: 8 epochs (~45 hours on an A100 GPU)
* **Learning Rate**: 1e-5
* **Batch Size**: 16
* **Framework**: Transformers + Accelerate
* **Tokenizer**: WhisperProcessor with `task="transcribe"` (no language token used)
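
Because decoding uses `task="transcribe"` with no language token, the model can also be called below the pipeline level without any language hint. A minimal sketch, assuming a 16 kHz mono float waveform (the `transcribe` helper is illustrative, not part of the released API):

```python
# Low-level inference matching the training setup: task="transcribe",
# no language token. Assumes `waveform` is a 16 kHz mono float array.
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

MODEL_ID = "NextInnoMind/next_bemba_ai"

def transcribe(waveform, sampling_rate=16_000, model_id=MODEL_ID):
    processor = WhisperProcessor.from_pretrained(model_id)
    model = WhisperForConditionalGeneration.from_pretrained(model_id)
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        generated = model.generate(inputs.input_features, task="transcribe")
    return processor.batch_decode(generated, skip_special_tokens=True)[0]
```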
|
|
---

## 🚀 Usage

```python
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="NextInnoMind/next_bemba_ai",
    chunk_length_s=30,
    return_timestamps=True,
)

# Example
result = pipe("path_to_audio.wav")
print(result["text"])
```
|
|
> 💡 Tip: No language token is required. The model is fine-tuned for Bemba only.
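
Whisper models expect 16 kHz mono input. The pipeline resamples files it loads itself, but if you pass a raw array you must match that rate yourself. A minimal NumPy sketch, assuming linear interpolation is acceptable (the helper name is illustrative; `librosa` or `torchaudio` offer higher-quality resampling):

```python
import numpy as np

def to_whisper_input(waveform: np.ndarray, orig_sr: int, target_sr: int = 16_000) -> np.ndarray:
    """Downmix to mono and linearly resample to Whisper's expected 16 kHz."""
    if waveform.ndim == 2:  # (channels, samples) -> mono
        waveform = waveform.mean(axis=0)
    if orig_sr == target_sr:
        return waveform.astype(np.float32)
    n_out = int(round(len(waveform) * target_sr / orig_sr))
    # Sample positions of the old and new grids on a common [0, 1) axis
    x_old = np.linspace(0.0, 1.0, num=len(waveform), endpoint=False)
    x_new = np.linspace(0.0, 1.0, num=n_out, endpoint=False)
    return np.interp(x_new, x_old, waveform).astype(np.float32)
```

A resampled array can then be passed to the pipeline directly as `pipe({"raw": to_whisper_input(wav, sr), "sampling_rate": 16_000})`.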
|
|
---

## 🌍 Applications

* **Education**: Local-language transcription and learning tools
* **Broadcast & Media**: Transcribing Bemba radio and TV shows
* **Research**: Bantu language documentation and analysis
* **Accessibility**: Voice-to-text systems in local apps and platforms
|
|
---

## ⚠️ Limitations & Biases

* Trained only on Bemba: the model does not support English or other languages.
* Accuracy may drop with heavy background noise or strong dialectal variation.
* Not optimized for code-switching or informal speech styles.
|
|
---

## 📊 Evaluation

| Language | WER (Word Error Rate) | Dataset              |
| -------- | --------------------- | -------------------- |
| Bemba    | ~16.7%                | BembaSpeech Eval Set |
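
For reference, WER counts the word-level substitutions, insertions, and deletions needed to turn a hypothesis into the reference, divided by the reference length. A minimal sketch for spot-checking transcripts locally (assumes whitespace tokenization; the `jiwer` package is the usual production choice):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein (edit) distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # DP table: d[i][j] = edits to turn hyp[:j] into ref[:i]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1  # substitution
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(ref)][len(hyp)] / len(ref)
```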
|
|
---

## 🌱 Environmental Impact

* **Hardware**: 1× A100 40GB
* **Training Time**: ~45 hours
* **Carbon Emissions**: estimated ~20.4 kg CO₂
  *(via [ML CO2 Impact](https://mlco2.github.io/impact))*
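
Estimates like the one above follow the standard formula: energy (average power × runtime) multiplied by the grid's carbon intensity. A sketch with illustrative, assumed inputs (single-GPU board power only, which is one reason it lands below the whole-run ~20.4 kg figure, since full-node power and the local grid mix push the estimate higher):

```python
# Back-of-envelope CO2 estimate: power (kW) x time (h) x grid intensity
# (kg CO2 per kWh). All three inputs below are assumptions for illustration.
avg_power_kw = 0.4     # rough A100 board draw under load (assumed)
runtime_hours = 45     # from the training details above
grid_intensity = 0.43  # kg CO2 per kWh, an illustrative grid average (assumed)

energy_kwh = avg_power_kw * runtime_hours    # 18.0 kWh
emissions_kg = energy_kwh * grid_intensity   # ~7.7 kg CO2 for the GPU alone
print(f"{emissions_kg:.1f} kg CO2")
```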
|
|
---

## 📖 Citation

```bibtex
@misc{nextbembaai2025,
  title={NextInnoMind next_bemba_ai: Whisper-based ASR model for Bemba},
  author={Silas Chalwe and NextInnoMind},
  year={2025},
  howpublished={\url{https://huggingface.co/NextInnoMind/next_bemba_ai}},
}
```
|
|
---

## 🧑‍💻 Maintainers

* **Chalwe Silas** (Lead Developer & Dataset Curator)
* Team **NextInnoMind**

💬 Contact:

* [silaschalwe@outlook.com](mailto:silaschalwe@outlook.com)
* [mchalwesilas@gmail.com](mailto:mchalwesilas@gmail.com)

🔗 GitHub: [SilasChalwe](https://github.com/SilasChalwe)
|
|
---

## 🔗 Related Resources

* [BembaSpeech Dataset](https://huggingface.co/datasets/NextInnoMind/BembaSpeech)
* [NextInnoMind on GitHub](https://github.com/SilasChalwe)
|
|
---

Fine-tuned in Zambia.