Hexa TTS 5B

Hexa TTS is a text-to-speech system with roughly 5 billion parameters (about 4.92 billion), designed for high-fidelity, emotional, multi-speaker speech synthesis across 15 languages.

Model Details

  • Architecture: Transformer-based (Flow Matching / Autoregressive)
  • Parameters: ~4.92 Billion
  • Languages: 15 (English, Bangla, Chinese, Spanish, French, German, Japanese, Korean, Russian, Portuguese, Italian, Hindi, Arabic, Turkish, Dutch)
  • Features: Emotion control, multi-speaker embeddings

Usage

This repository contains the untrained model definition and configuration. Until AutoModel support lands, instantiate the model directly from the source tree:

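# Coming soon: AutoModel support
# Run from the repository root so that the src package is importable.
from src.hf_model import HexaModel
from src.config import HexaConfig

config = HexaConfig()      # Defaults to the 5B configuration
model = HexaModel(config)  # Builds the untrained model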
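
To check that an instantiated model matches the ~4.92 billion figure above, a small helper along these lines may be useful. It is a minimal sketch that assumes HexaModel behaves like a PyTorch module, that is, it exposes parameters() yielding tensors with numel() and requires_grad. The name count_parameters is hypothetical and not part of this repository.

```python
def count_parameters(model, trainable_only=False):
    """Sum the number of scalar parameters in a PyTorch-style module.

    Assumes `model.parameters()` yields objects that provide `numel()`
    and `requires_grad`, as torch.nn.Module parameters do.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


# Hypothetical usage with the model built above:
#   total = count_parameters(model)
#   print(f"{total / 1e9:.2f}B parameters")
```

Building the full default configuration allocates every weight in memory, so on a small machine it may be safer to try the helper on a reduced configuration first.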

Training

To train this model, use the provided src/train_hf.py script on a multi-GPU cluster (A100/H100 recommended).
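
The multi-GPU recommendation follows from simple arithmetic on the parameter count. The sketch below estimates memory for the weights alone and for mixed-precision training with Adam. The 16 bytes per parameter figure is a common rule of thumb (2 for half-precision weights, 2 for gradients, 12 for fp32 master weights and the two Adam moments), not a measurement of src/train_hf.py, and it excludes activations, which depend on batch size and sequence length. The function names are hypothetical.

```python
# Parameter count taken from the Model Details section (~4.92 billion).
NUM_PARAMS = 4_920_000_000

# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2}


def weights_gb(num_params, dtype="fp32"):
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def adam_mixed_precision_gb(num_params):
    """Rule-of-thumb memory for weights, gradients, and Adam state.

    Assumes 16 bytes per parameter; activations are not included.
    """
    return num_params * 16 / 1e9


if __name__ == "__main__":
    print(f"fp32 weights: {weights_gb(NUM_PARAMS, 'fp32'):.2f} GB")
    print(f"bf16 weights: {weights_gb(NUM_PARAMS, 'bf16'):.2f} GB")
    print(f"Adam training state: {adam_mixed_precision_gb(NUM_PARAMS):.2f} GB")
```

Under these assumptions the training state alone comes to roughly 79 GB, which already nearly fills a single 80 GB A100 or H100 before any activations are stored. Sharding the model or optimizer state across several GPUs is therefore the practical route.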
