---
license: apache-2.0
language:
- en
- hi
- bn
- te
- ta
- kn
- mr
- gu
tags:
- text-to-speech
- tts
- voice-cloning
- multilingual
library_name: coqui-tts
---

# TruthShield VoiceGen

## Model Description

TruthShield VoiceGen is a multi-speaker, multilingual text-to-speech model with accent and style transfer capabilities. Built for the Voice Tech For All Challenge, it supports 11 languages (English and ten Indian languages) and includes forensic speaker verification.

## Supported Languages

- Bhojpuri, Bengali, English, Gujarati, Hindi
- Chhattisgarhi, Kannada, Magahi, Maithili, Marathi, Telugu

## Model Architecture

- **Core**: VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech)
- **Speaker Encoder**: ECAPA-TDNN
- **Vocoder**: HiFi-GAN

## Intended Use

- Accessibility applications
- Educational content
- Regional language content creation
- Voice assistants

## Limitations

- Requires speaker reference audio (WAV format)
- English text must be lowercase
- Maximum text length: 5000 characters

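The constraints listed above can be checked before any text or audio reaches the model. The following is a minimal sketch that uses only the Python standard library. The function names `prepare_text` and `is_wav_file` and the language code `"en"` are assumptions made for illustration and are not part of the model's published API. The standard-library `wave` module reads uncompressed PCM WAV files, so other WAV encodings may be rejected by this check even if the model accepts them.

```python
import wave
from pathlib import Path

MAX_TEXT_CHARS = 5000  # limit stated in this model card


def prepare_text(text: str, language: str) -> str:
    """Return text ready for synthesis, or raise ValueError if it is too long.

    `language` is assumed to be a short code such as "en" (hypothetical).
    """
    if language == "en":
        # The card states that English text must be lowercase.
        text = text.lower()
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            f"text has {len(text)} characters; the limit is {MAX_TEXT_CHARS}"
        )
    return text


def is_wav_file(path: str) -> bool:
    """Return True if the file opens as a non-empty PCM WAV file."""
    try:
        with wave.open(str(Path(path)), "rb") as wav:
            return wav.getnframes() > 0
    except (wave.Error, EOFError, OSError):
        # Not a readable WAV file, truncated, or missing.
        return False
```

Calling `prepare_text` and `is_wav_file` before synthesis lets an application reject bad input with a clear message instead of failing inside the model.
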
## Ethical Considerations

- Built-in safety verification prevents unauthorized cloning
- All generated audio includes forensic watermarking
- Consent from the speaker is required for voice cloning

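This card does not document how the safety verification works internally. One common approach to speaker verification is to compare two speaker embeddings, such as those an ECAPA-TDNN encoder produces, by cosine similarity. The sketch below illustrates that general technique only. The function names and the `0.75` threshold are hypothetical and should not be read as the model's actual method or settings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_speaker(
    reference: Sequence[float],
    candidate: Sequence[float],
    threshold: float = 0.75,  # hypothetical value; tune on held-out data
) -> bool:
    """Accept the candidate if it is close enough to the enrolled reference."""
    return cosine_similarity(reference, candidate) >= threshold
```

In a consent workflow of this kind, the reference embedding would come from a speaker who has agreed to cloning, and a request would proceed only if the submitted audio matches it.
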
## Training Data

- SYSPIN Indian Languages Dataset
- SpiCor Indian English Accents
- See `datasets.csv` for supplementary data

## Citation

    @misc{truthshield2024voicegen,
      title={TruthShield VoiceGen: Multi-Speaker Multilingual TTS},
      author={TruthShield Team},
      year={2024},
      publisher={HuggingFace}
    }