Hexa TTS 5B
Hexa TTS is a roughly 5-billion-parameter Text-to-Speech system designed for high-fidelity, emotional, multi-speaker speech synthesis across 15 languages.
Model Details
- Architecture: Transformer-based (Flow Matching / Autoregressive)
- Parameters: ~4.92 Billion
- Languages: 15 (English, Bangla, Chinese, Spanish, French, German, Japanese, Korean, Russian, Portuguese, Italian, Hindi, Arabic, Turkish, Dutch)
- Features: Emotion control, multi-speaker embeddings
Usage
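This repository contains only the untrained model definition and configuration; no pretrained weights are included. The snippet below builds the model from its default configuration.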
# Coming soon: AutoModel support
from src.hf_model import HexaModel
from src.config import HexaConfig
config = HexaConfig() # Defaults to 5B
model = HexaModel(config)
Training
To train this model, run the provided src/train_hf.py script on a multi-GPU cluster (A100 or H100 GPUs recommended).
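As a rough illustration of why a multi-GPU setup is recommended, the sketch below estimates memory from the ~4.92 billion parameter count listed under Model Details. The byte counts are assumptions, not values taken from this repository: 4 bytes per parameter for fp32 weights, 2 for bf16, and about 16 bytes per parameter as a common rule of thumb for mixed-precision training with Adam (weights, gradients, and optimizer state, excluding activations). The helper name estimate_gb is hypothetical.

```python
# Back-of-the-envelope memory estimate; all byte counts are assumptions.
PARAMS = 4.92e9  # parameter count from Model Details


def estimate_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory footprint in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


fp32_weights_gb = estimate_gb(PARAMS, 4)    # fp32 weights only
bf16_weights_gb = estimate_gb(PARAMS, 2)    # bf16 weights only
adam_training_gb = estimate_gb(PARAMS, 16)  # rule of thumb: weights + grads + Adam state

print(f"fp32 weights:        {fp32_weights_gb:.2f} GB")
print(f"bf16 weights:        {bf16_weights_gb:.2f} GB")
print(f"mixed-precision Adam: {adam_training_gb:.2f} GB (before activations)")
```

Under these assumptions, the training state alone comes to roughly 79 GB, which nearly fills a single 80 GB A100 or H100 before any activations are stored. This is why sharding the model or optimizer state across several GPUs is the practical route.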