metadata
library_name: transformers
base_model: facebook/mms-tts
tags:
  - text-to-speech
  - vits
  - mms
  - multilingual
  - Open-Source
  - Mali
  - MALIBA-AI
language:
  - bm
  - son
  - dgc
  - fuf
  - bbo
  - tmh
language_bcp47:
  - bm-ML
  - son-ML
  - dgc-ML
  - fuf-ML
  - bbo-ML
  - tmh-ML
model-index:
  - name: malian-tts
    results:
      - task:
          name: text-to-speech
          type: speech-synthesis
        metrics:
          - name: Subjective Quality
            type: MOS
            value: N/A
pipeline_tag: text-to-speech
license: cc-by-nc-4.0

MALIBA-TTS: Revolutionizing Speech Synthesis for Malian Languages 🇲🇱

MALIBA-TTS represents a breakthrough in African language technology, offering high-quality text-to-speech synthesis for six Malian languages. These models bridge a critical gap in speech technology, bringing voice synthesis capabilities to languages spoken by millions yet historically underserved by technology.

Try It Out

Experience MALIBA-TTS directly in your browser: Live Demo on Hugging Face Spaces

Bridging the Digital Language Divide

Despite being spoken by over 20 million people combined, Malian languages have remained severely underrepresented in speech technology. MALIBA-TTS directly addresses this critical gap, making digital speech interfaces accessible to speakers of Bambara, Boomu, Dogon, Pular, Songhoy, and Tamasheq for the first time. This work represents a crucial step toward digital language equality.

Technical Specifications

Model Specifications

  • Architecture: VITS (Variational Inference with adversarial learning for end-to-end TTS)
  • Base Model: Meta's MMS (Massively Multilingual Speech)
  • Model Size: 145 MB per language
  • Format: PyTorch
  • Sampling Rate: 16 kHz
  • Audio Encoding: 16-bit PCM
  • Languages: Bambara, Boomu, Dogon, Pular, Songhoy, and Tamasheq
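The sampling rate and audio encoding listed above determine how a synthesized waveform is stored on disk. As a minimal sketch using only the Python standard library, and assuming the model output has already been converted to a plain sequence of floats in [-1.0, 1.0], the hypothetical helpers `float_to_pcm16` and `write_wav_16k` illustrate 16-bit PCM quantization at 16 kHz. The usage example in this README relies on `soundfile` instead; this block only illustrates the format.

```python
import struct
import wave

SAMPLE_RATE = 16_000  # 16 kHz, as listed in the specifications above


def float_to_pcm16(samples):
    """Quantize floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip out-of-range values
        ints.append(int(round(s * 32767)))
    return struct.pack(f"<{len(ints)}h", *ints)


def write_wav_16k(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit PCM audio to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(float_to_pcm16(samples))
```

For example, `write_wav_16k("silence.wav", [0.0] * 16_000)` would write one second of silence.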

Performance

  • Inference: Optimized to run on CPU
  • Real-time Capability: Generates speech with minimal latency
  • Memory Footprint: ~4 GB RAM recommended for optimal performance
  • Deployment Flexibility: Works on standard hardware without specialized accelerators
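One way to make "real-time capability" concrete on your own hardware is the real-time factor (RTF): synthesis time divided by the duration of the audio produced, where values below 1.0 mean faster than real time. The sketch below is an illustration, not a benchmark published by MALIBA-AI. `real_time_factor` and `measure_rtf` are hypothetical helper names, and `synthesize` stands in for any callable that takes text and returns a sequence of audio samples.

```python
import time


def real_time_factor(synthesis_seconds, num_samples, sample_rate=16_000):
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize, text, sample_rate=16_000):
    """Time one call to `synthesize(text)` and return its real-time factor."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples), sample_rate)
```

A single timed call is noisy; averaging over several sentences after a warm-up run gives a more stable figure.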

Transforming Access to Technology in Mali

MALIBA-TTS enables numerous applications previously unavailable to speakers of Malian languages:

  • Education: Audio-based learning tools for literacy and education in mother tongues
  • Accessibility: Making digital content accessible to visually impaired users
  • Healthcare: Voice interfaces for health information in local languages
  • Cultural Preservation: Digital narration of stories and cultural heritage
  • Mobile Access: Voice responses for smartphone users with limited literacy
  • Public Service: Automated voice announcements and information systems

Installation

pip install transformers torch soundfile

Usage

import torch
import soundfile as sf
from transformers import VitsModel, AutoTokenizer

# Available languages: bambara, boomu, dogon, pular, songhoy, tamasheq
language = "bambara"
model_id = "MALIBA-AI/malian-tts"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder=f"models/{language}")
model = VitsModel.from_pretrained(model_id, subfolder=f"models/{language}")

# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Synthesize speech
text = "An filɛ ni ye yɔrɔ minna ni an ye an sigi ka a layɛ yala an bɛ ka baara min kɛ ɛsike a kɛlen don ka Ɲɛ wa ?"
inputs = tokenizer(text, return_tensors="pt").to(device)

with torch.no_grad():
    output = model(**inputs).waveform

waveform = output.squeeze().cpu().numpy()
sample_rate = model.config.sampling_rate

# Save to file
sf.write("output.wav", waveform, sample_rate)
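Because each language lives in its own subfolder, switching languages means reloading both the tokenizer and the model. A small wrapper can validate the language name before attempting a download, as sketched below. `model_subfolder` and `load_tts` are hypothetical helper names, not part of the repository; the language list and the `models/{language}` layout are taken from the example above.

```python
MODEL_ID = "MALIBA-AI/malian-tts"
LANGUAGES = ("bambara", "boomu", "dogon", "pular", "songhoy", "tamasheq")


def model_subfolder(language):
    """Return the repository subfolder for a supported language name."""
    key = language.strip().lower()
    if key not in LANGUAGES:
        raise ValueError(
            f"Unsupported language {language!r}; choose from {', '.join(LANGUAGES)}"
        )
    return f"models/{key}"


def load_tts(language, device="cpu"):
    """Load the tokenizer and model for one language (downloads on first use)."""
    # Imported here so the validation helper works without transformers installed.
    from transformers import AutoTokenizer, VitsModel

    subfolder = model_subfolder(language)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, subfolder=subfolder)
    model = VitsModel.from_pretrained(MODEL_ID, subfolder=subfolder).to(device)
    return tokenizer, model
```

With this in place, `tokenizer, model = load_tts("songhoy")` replaces the two `from_pretrained` calls, and a misspelled language name fails immediately with a readable error.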

Language Examples

# Bambara
text = "An filɛ ni ye yɔrɔ minna ni an ye an sigi ka a layɛ yala an bɛ ka baara min kɛ ɛsike a kɛlen don ka Ɲɛ wa ?"

# Boomu
text = "Vunurobe wozomɛ pɛɛ, Poli we zo woro han Deeɓenu wara li Deeɓenu faralo zuun. Lo we baba a lo wara yi see ɓa Zuwifera ma ɓa Gɛrɛkela wa."

# Dogon
text = "Pɔɔlɔ, kubɔ lugo joo le, bana dɛin dɛin le, inɛw Ama titiyaanw le digɛu, Ama, emɛ babe bɛrɛ sɔɔ sɔi."

# Pular
text = "Miɗo ndaarde saabe Laamɗo e saabe Iisaa Almasiihu caroyoowo wuurɓe e maayɓe oo, miɗo ndaardire saabe gartol makko ka num e Laamu makko"

# Songhoy
text = "Haya ka se beenediyo kokoyteraydi go hima nda huukoy foo ka fatta ja subaahi ka taasi goykoyyo ngu rezẽ faridi se"

# Tamasheq
text = "Toḍă tăfukt ɣas, issăɣră-dd măssi-s n-ašĕkrĕš ănaẓraf-net, inn'-as: 'Ǝɣĕr-dd inaxdimăn, tĕẓlĕd-asăn, sănt s-wi dd-ĕšrăynen har tĕkkĕd wi dd-ăzzarnen."

The MALIBA-AI Impact

MALIBA-TTS is part of MALIBA-AI's broader mission to ensure "No Malian Language Left Behind." This initiative is actively transforming Mali's digital landscape by:

  1. Breaking Language Barriers: Providing technology in languages that Malians actually speak
  2. Enabling Local Innovation: Allowing Malian developers to build voice-based applications
  3. Preserving Cultural Heritage: Digitizing and preserving Mali's rich oral traditions
  4. Democratizing AI: Making cutting-edge technology accessible to all Malians regardless of literacy level
  5. Building Local Expertise: Training Malian AI practitioners and researchers

Limitations

[coming soon]

Future Development

MALIBA-AI is committed to continuing this work with:

  • Expansion to more Malian languages and dialects

References

@misc{malian-tts,
  author = {MALIBA-AI},
  title = {Text-to-Speech Models for Six Malian Languages},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/MALIBA-AI/malian-tts}}
}
@inproceedings{kim2021conditional,
  title={Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech},
  author={Kim, Jaehyeon and Kong, Jungil and Son, Juhee},
  booktitle={International Conference on Machine Learning},
  year={2021}
}
@article{meta2023mms,
  title={Scaling Speech Technology to 1,000+ Languages},
  author={Pratap, Vineel and others},
  journal={arXiv preprint arXiv:2305.13516},
  year={2023}
}

License

This project is licensed under CC BY-NC 4.0 (Attribution-NonCommercial).

Terms of Use

  • Users agree to use the model in a way that respects Malian languages and culture
  • We encourage the use of these models to develop solutions that improve digital accessibility for speakers of Malian languages
  • Any use of the models must acknowledge MALIBA-AI and Meta
  • Commercial usage is not allowed

Contributing

MALIBA-TTS is part of the MALIBA-AI initiative, whose mission is "No Malian Language Left Behind." We welcome contributions from:

  • Language Experts: To improve the quality and accuracy of the models
  • Developers: To create applications using these models
  • Researchers: To explore technical improvements and optimizations
  • Data Contributors: To enrich the TTS training data
  • Community Members: To provide feedback and testing across dialects

To contribute, please visit MALIBA-AI or contact [coming soon] directly.


MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation

"No Malian Language Left Behind"