# mlx-community/supertonic-2

This model was converted to MLX format from [`Supertone/supertonic-2`](https://huggingface.co/Supertone/supertonic-2) using mlx-audio version **0.2.8**.

SuperTonic 2 is a high-quality text-to-speech model with voice style control.

## Use with mlx-audio

```bash
pip install -U mlx-audio
```

### CLI Example:
```bash
mlx_audio.tts.generate --model mlx-community/supertonic-2 --text "Hello, this is a test." --voice M1
```

### Python Example:
```python
from mlx_audio.tts.utils import load_model

model = load_model("mlx-community/supertonic-2")
for result in model.generate("Hello, this is a test.", voice="M1"):
    print(f"Generated {result.audio_duration} of audio")
```

## Model Details

- **Architecture**: Text encoder + Duration predictor + Flow matching (vector field) + Vocoder
- **Sample rate**: 44100 Hz
- **Voices**: M1-M5, F1-F5 (10 built-in voice styles)
- **Latent dim**: 24 per frame (consecutive frames are chunked into 144-dim vectors, i.e. 144 / 24 = 6 frames per chunk, which shortens the sequence)
- **Flow matching steps**: 10 (configurable)
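
The latent chunking and the fixed number of flow matching steps listed above can be illustrated with a small, self-contained sketch. This is not the mlx-audio implementation: `chunk_latents`, `euler_sample`, and the toy vector field are hypothetical names made up for illustration, the chunk size of 6 is inferred from 144 / 24, and plain Euler integration is assumed as the solver.

```python
from typing import Callable, List

Vector = List[float]

LATENT_DIM = 24    # per-frame latent size, from the model details
CHUNKED_DIM = 144  # size after chunking, from the model details
CHUNK = CHUNKED_DIM // LATENT_DIM  # assumption: 6 consecutive frames per chunk


def chunk_latents(frames: List[Vector], chunk: int = CHUNK) -> List[Vector]:
    """Concatenate every `chunk` consecutive frames into one longer vector."""
    if len(frames) % chunk != 0:
        raise ValueError("number of frames must be a multiple of the chunk size")
    return [
        [value for frame in frames[i:i + chunk] for value in frame]
        for i in range(0, len(frames), chunk)
    ]


def euler_sample(
    x0: Vector,
    field: Callable[[Vector, float], Vector],
    steps: int = 10,
) -> Vector:
    """Integrate dx/dt = field(x, t) from t=0 to t=1 with `steps` Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for n in range(steps):
        t = n * dt
        v = field(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# 12 toy frames of 24 values each become 2 chunks of 144 values each.
frames = [[float(i)] * LATENT_DIM for i in range(12)]
chunks = chunk_latents(frames)

# Toy straight-line vector field: it always points from the start to the target.
# A real model would predict this field with a neural network instead.
start = [0.0] * CHUNKED_DIM
target = chunks[0]


def toy_field(x: Vector, t: float) -> Vector:
    return [b - a for a, b in zip(start, target)]


sample = euler_sample(start, toy_field, steps=10)
print(len(chunks), len(chunks[0]))
```

More steps generally trade speed for a more accurate integration of the learned field; in this toy case the field is constant, so any step count lands on the target.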