# mlx-community/supertonic-2
This model was converted to MLX format from [`Supertone/supertonic-2`](https://huggingface.co/Supertone/supertonic-2) using mlx-audio version **0.2.8**.
SuperTonic 2 is a high-quality text-to-speech model with voice style control.
## Use with mlx-audio
```bash
pip install -U mlx-audio
```
### CLI Example:
```bash
mlx_audio.tts.generate --model mlx-community/supertonic-2 --text "Hello, this is a test." --voice M1
```
### Python Example:
```python
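from mlx_audio.tts.utils import load_model

# Load the converted MLX weights from the Hugging Face Hub
model = load_model("mlx-community/supertonic-2")

# generate() is iterated; each result describes a piece of generated audio
for result in model.generate("Hello, this is a test.", voice="M1"):
    print(f"Generated {result.audio_duration} of audio")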
```
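To keep the generated audio, the samples can be written to a WAV file at the model's 44100 Hz sample rate. The sketch below uses only the Python standard library and takes a plain sequence of floats in [-1, 1]. How the samples are read from a `result` object is not shown in this README, so that step is left out, and `write_wav` is a hypothetical helper, not part of mlx-audio.

```python
import math
import struct
import wave


def write_wav(path, samples, sample_rate=44100):
    """Write mono float samples in [-1, 1] to a 16-bit PCM WAV file.

    Hypothetical helper for illustration; not part of mlx-audio.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)        # mono
        f.setsampwidth(2)        # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in for model output: one second of a 440 Hz tone at 44100 Hz
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]

if __name__ == "__main__":
    write_wav("tone.wav", tone)
```

In practice, `tone` would be replaced by the samples taken from each generation result, converted to a Python list or any iterable of floats.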
## Model Details
- **Architecture**: Text encoder + Duration predictor + Flow matching (vector field) + Vocoder
- **Sample rate**: 44100 Hz
- **Voices**: M1-M5, F1-F5 (10 built-in voice styles)
- **Latent dim**: 24 per frame (144 after chunking, i.e. 6 × 24)
- **Flow matching steps**: 10 (configurable)
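The latent chunking and the fixed number of flow matching steps listed above can be illustrated with a small pure-Python sketch. It rests on assumptions rather than the model's actual code: the chunk size of 6 is inferred from 144 / 24, the chunk layout (consecutive frames concatenated) is a guess, and a plain Euler solver is used because the README does not say which solver the model uses. The names `chunk_latents`, `euler_sample`, and the toy vector field are hypothetical.

```python
LATENT_DIM = 24
CHUNK = 6  # assumption: inferred from 144 / 24


def chunk_latents(frames, chunk=CHUNK):
    """Concatenate every `chunk` consecutive frames into one longer vector.

    `frames` is a list of equal-length lists; trailing frames that do not
    fill a whole chunk are dropped.
    """
    usable = (len(frames) // chunk) * chunk
    out = []
    for start in range(0, usable, chunk):
        merged = []
        for frame in frames[start:start + chunk]:
            merged.extend(frame)
        out.append(merged)
    return out


def euler_sample(vector_field, x0, steps=10):
    """Integrate dx/dt = vector_field(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = vector_field(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# 12 frames of 24 values become 2 chunks of 144 values
frames = [[float(t * LATENT_DIM + d) for d in range(LATENT_DIM)] for t in range(12)]
chunks = chunk_latents(frames)

# Toy vector field: a straight line from the start point to a fixed target.
# A real model would predict the field with a neural network.
start = [0.0, 0.0, 0.0]
target = [1.0, -2.0, 0.5]
sample = euler_sample(lambda x, t: [b - a for a, b in zip(start, target)], start)
```

With this constant toy field, the 10 Euler steps land on `target` up to floating-point rounding. Using more steps generally trades speed for accuracy when the learned field is not a straight line, which is presumably why the step count is configurable.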