metadata
license: apache-2.0
language:
- en
- hi
- bn
- te
- ta
- kn
- mr
- gu
tags:
- text-to-speech
- tts
- voice-cloning
- multilingual
library_name: coqui-tts
TruthShield VoiceGen
Model Description
TruthShield VoiceGen is a multi-speaker, multilingual text-to-speech model with accent and style transfer capabilities. Built for the Voice Tech For All Challenge, it supports 11 languages (ten Indian languages and English) and includes forensic speaker verification.
Supported Languages
- Bhojpuri, Bengali, English, Gujarati, Hindi
- Chhattisgarhi, Kannada, Magahi, Maithili, Marathi, Telugu
Model Architecture
- Core: VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech)
- Speaker Encoder: ECAPA-TDNN
- Vocoder: HiFi-GAN
Intended Use
- Accessibility applications
- Educational content
- Regional language content creation
- Voice assistants
Limitations
- Requires speaker reference audio (WAV format)
- English text must be lowercase
- Maximum text length: 5000 characters
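The constraints above can be checked before any text reaches the model. The following is a minimal sketch, not part of the released code: the helper names `prepare_text` and `check_reference_wav` are hypothetical, and the language code `"en"` is assumed from the metadata list. It uses only the Python standard library, whose `wave` module reads uncompressed PCM WAV files only.

```python
import wave
from pathlib import Path

MAX_TEXT_CHARS = 5000  # maximum text length stated in this model card


def prepare_text(text: str, language: str) -> str:
    """Validate input text and lowercase it when the language is English.

    Hypothetical helper; `language` is assumed to be a code such as "en" or "hi".
    """
    text = text.strip()
    if not text:
        raise ValueError("text is empty")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            f"text has {len(text)} characters; the limit is {MAX_TEXT_CHARS}"
        )
    # The card requires lowercase English; other languages are left untouched.
    if language == "en":
        text = text.lower()
    return text


def check_reference_wav(path: str) -> float:
    """Confirm the speaker reference is a readable WAV file.

    Returns the clip duration in seconds. Hypothetical helper.
    """
    p = Path(path)
    if p.suffix.lower() != ".wav":
        raise ValueError(f"speaker reference must be a WAV file, got {p.suffix!r}")
    # wave.open raises wave.Error if the file is not valid PCM WAV data.
    with wave.open(str(p), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
    if frames == 0:
        raise ValueError("speaker reference contains no audio frames")
    return frames / rate
```

Running both checks up front means an over-long input or a non-WAV reference fails with a clear message rather than partway through synthesis.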
Ethical Considerations
- Built-in safety verification prevents unauthorized cloning
- All generated audio includes forensic watermarking
- Consent required for voice cloning
Training Data
- SYSPIN Indian Languages Dataset
- SpiCor Indian English Accents
- See datasets.csv for supplementary data
Citation
@misc{truthshield2024voicegen,
  title={TruthShield VoiceGen: Multi-Speaker Multilingual TTS},
  author={TruthShield Team},
  year={2024},
  publisher={HuggingFace}
}