Bambara TTS

A text-to-speech (TTS) synthesis model for Bambara (Bamanankan), a language spoken by more than 14 million people, primarily in Mali.

Technical Specifications

Architecture: VITS (Variational Inference with adversarial learning for end-to-end TTS)
Base Model: Meta MMS (facebook/mms-tts)
Size: 145 MB (36.3M parameters, F32)
Format: PyTorch (Safetensors weights)
Sampling Rate: 16 kHz
Language: Bambara (bm-ML)
Performance: Optimized for CPU (4 GB RAM recommended)
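
As a quick sanity check on the size figure, the sketch below relates parameter count to checkpoint size. It assumes the weights are stored as 32-bit floats (4 bytes each) with about 36.3M parameters, as reported on the model page, and uses decimal megabytes. The helper name `checkpoint_size_mb` is illustrative and not part of any library.

```python
def checkpoint_size_mb(n_params, bytes_per_param=4):
    """Approximate on-disk size in decimal megabytes (1 MB = 1,000,000 bytes)."""
    return n_params * bytes_per_param / 1_000_000


# Assumption: ~36.3M parameters stored as float32 (4 bytes per parameter).
size_mb = checkpoint_size_mb(36_300_000)
print(f"{size_mb:.1f} MB")
```

File metadata adds a small overhead, so the actual download may differ slightly from this estimate.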

Installation

pip install transformers torch soundfile

Usage

from transformers import VitsModel, AutoTokenizer
import soundfile as sf
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = VitsModel.from_pretrained("sudoping01/bambara-tts")
tokenizer = AutoTokenizer.from_pretrained("sudoping01/bambara-tts")

# Tokenize the input text and generate speech (no gradients are needed for inference)
text = "An filɛ ni ye yɔrɔ minna ni an ye an sigi ka a layɛ yala an bɛ ka baara min kɛ ɛsike a kɛlen don ka Ɲɛ wa ?"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    output = model(**inputs).waveform

# Drop the batch dimension, convert to a NumPy array, and save as a WAV file
waveform = output.squeeze().cpu().numpy()
sample_rate = model.config.sampling_rate
sf.write("bambara_output.wav", waveform, sample_rate)
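
For longer passages, one possible approach is to synthesize sentence by sentence and join the results with short pauses. The sketch below is illustrative rather than part of the model's API: `split_sentences` and `synthesize_long` are hypothetical helpers, the punctuation-based splitting rule and the 0.25 s pause are assumptions, and `model` and `tokenizer` are expected to be loaded as in the example above.

```python
import re


def split_sentences(text):
    """Split text after '.', '!' or '?' followed by whitespace, keeping the punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def synthesize_long(text, model, tokenizer, pause_s=0.25):
    """Synthesize each sentence separately and join the waveforms with short pauses.

    Hypothetical helper: `model` is a loaded VitsModel and `tokenizer` its tokenizer.
    """
    # Imported here so the text helper above works without the ML stack installed.
    import numpy as np
    import torch

    sample_rate = model.config.sampling_rate
    pause = np.zeros(int(sample_rate * pause_s), dtype=np.float32)
    pieces = []
    for sentence in split_sentences(text):
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            waveform = model(**inputs).waveform
        if pieces:
            pieces.append(pause)  # silence between consecutive sentences
        pieces.append(waveform.squeeze().cpu().numpy())
    if not pieces:
        return np.zeros(0, dtype=np.float32)
    return np.concatenate(pieces)
```

The returned array can be written with `sf.write` exactly as in the example above, using `model.config.sampling_rate` as the sample rate.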

Limitations

Limited handling of loanwords and code-switching with French
Variable performance across regional dialects
Requires input written in standard Bambara orthography
Limited prosody and emotional expression
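
Because of the orthography and loanword limitations above, it can help to check input text before synthesis. The sketch below flags characters that have no entry in a vocabulary. It assumes a character-level vocabulary, as is typical for MMS-based VITS checkpoints, and that text is lowercased before tokenization. The helper `unsupported_chars` is hypothetical and not part of `transformers`.

```python
def unsupported_chars(text, vocab):
    """Return characters of `text` missing from `vocab`, in order of first appearance.

    `vocab` can be any container of single characters, for example the dict
    returned by `tokenizer.get_vocab()` (membership is checked against its keys).
    """
    missing = []
    for ch in text.lower():
        if ch.isspace():
            continue  # whitespace is not treated as a problem here
        if ch not in vocab and ch not in missing:
            missing.append(ch)
    return missing
```

With a loaded tokenizer, a call such as `unsupported_chars(text, tokenizer.get_vocab())` lists the characters worth spelling out or normalizing by hand (digits or French accented letters, for instance) before generating audio.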

License

CC BY-NC 4.0 (Attribution-NonCommercial)

Non-commercial use only
Attribution required for model authors and Meta
Use must respect Bambara language and culture

References

@misc{bambara-tts,
  author = {sudoping01},
  title = {Text-to-Speech Model for Bambara},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/sudoping01/bambara-tts}}
}
