# mlx-community/supertonic-2

This model was converted to MLX format from [`Supertone/supertonic-2`](https://huggingface.co/Supertone/supertonic-2) using mlx-audio version 0.2.8.

SuperTonic 2 is a high-quality text-to-speech model with voice style control.

## Use with mlx-audio

```bash
pip install -U mlx-audio
```

### CLI Example:
```bash
mlx_audio.tts.generate --model mlx-community/supertonic-2 --text "Hello, this is a test." --voice M1
```

### Python Example:
```python
from mlx_audio.tts.utils import load_model

model = load_model("mlx-community/supertonic-2")
# Iterate over the generated results and report each one's audio duration
for result in model.generate("Hello, this is a test.", voice="M1"):
    print(f"Generated {result.audio_duration} of audio")
```
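
To keep the generated audio, the samples need to be written at the model's 44100 Hz sample rate. The following is a minimal sketch using only the Python standard library. The helper name `write_wav` is hypothetical and not part of mlx-audio, and it assumes the waveform is available as a flat sequence of mono float samples in the range [-1.0, 1.0].

```python
import struct
import wave

SAMPLE_RATE = 44100  # sample rate listed for this model


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file.

    Hypothetical helper for illustration; not part of mlx-audio.
    """
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range, then scale to signed 16-bit integers.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))

    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
```

If the result object exposes the waveform as an array (for example a `result.audio` attribute, which is an assumption not shown in this card), it would first need to be converted to a plain list of floats before being passed to `write_wav`.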

## Model Details

- Architecture: Text encoder + Duration predictor + Flow matching (vector field) + Vocoder
- Sample rate: 44100 Hz
- Voices: M1-M5, F1-F5 (10 built-in voice styles)
- Latent dim: 24 (compressed to 144 via chunking)
- Flow matching steps: 10 (configurable)
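
The latent chunking and the flow matching steps listed above can be illustrated with a small pure-Python sketch. It is not the mlx-audio implementation. The chunk size of 6 is inferred from 144 / 24, the frame-major packing order is an assumption, and a fixed-step Euler solver is assumed only for illustration, since this card does not say which solver the model uses. The names `chunk_latents` and `euler_sample` are hypothetical.

```python
LATENT_DIM = 24
CHUNKED_DIM = 144
CHUNK = CHUNKED_DIM // LATENT_DIM  # 6, inferred from 144 / 24


def chunk_latents(frames):
    """Pack every CHUNK consecutive 24-dim frames into one 144-dim vector.

    Assumes frame-major packing; the real layout may differ.
    """
    if len(frames) % CHUNK != 0:
        raise ValueError("frame count must be a multiple of the chunk size")
    packed = []
    for start in range(0, len(frames), CHUNK):
        group = frames[start:start + CHUNK]
        packed.append([value for frame in group for value in frame])
    return packed


def euler_sample(x, vector_field, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-size Euler steps.

    `vector_field` stands in for the learned flow matching network.
    """
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = vector_field(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

Under these assumptions, 12 latent frames of 24 values become 2 vectors of 144 values, so chunking shortens the sequence the flow matching stage has to process. In a fixed-step solver like this one, the number of steps trades speed against how closely the integration follows the vector field.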