Instructions to use ambassadia/supertonic-3-mlx with the MLX and Supertonic libraries.
- Libraries
- MLX
How to use ambassadia/supertonic-3-mlx with MLX:
# Download the model from the Hub
# (the extra is quoted so that shells such as zsh do not treat the brackets as a glob)
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir supertonic-3-mlx ambassadia/supertonic-3-mlx
- Supertonic
How to use ambassadia/supertonic-3-mlx with Supertonic:
from supertonic import TTS

tts = TTS(auto_download=True)
style = tts.get_voice_style(voice_name="M1")
text = "The train delay was announced at 4:45 PM on Wed, Apr 3, 2024 due to track maintenance."
wav, duration = tts.synthesize(text, voice_style=style)
tts.save_audio(wav, "output.wav")
File size: 856 Bytes
"""Minimal Supertonic 3 MLX usage: 5 lines, no fluff.

Run from anywhere AFTER ``pip install supertonic-3-mlx`` (or from inside
this directory after ``pip install ./``):

    python examples/quickstart.py
"""
from supertonic_3_mlx import Pipeline
import soundfile as sf
# When the package has been pip-installed, this auto-downloads from the Hub
# (~ 400 MB) into the standard Hugging Face cache. After the first run, the
# weights are reused from cache and cold start is ~ 11 ms on M4.
pipe = Pipeline.from_pretrained("ambassadia/supertonic-3-mlx")
wav = pipe.generate(
    "Hello world from Apple Silicon. Supertonic 3 runs at one hundred times realtime.",
    voice="F1",  # one of F1..F5, M1..M5
    lang="en",   # ISO 639-1
)
sf.write("hello.wav", wav, pipe.sample_rate)
print(f"wrote hello.wav — {len(wav) / pipe.sample_rate:.2f}s of audio")
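
The quickstart prints the duration of the generated audio, and its sample sentence mentions a speed of about one hundred times realtime. One way to check that figure on your own machine is to time a call and divide the audio duration by the wall-clock time. The sketch below is only an illustration: `realtime_factor` and `timed_generate` are hypothetical helper names, not part of `supertonic_3_mlx`, and the code assumes the returned waveform is a one-dimensional sequence whose length is the number of samples.

```python
import time
from typing import Callable, Sequence, Tuple


def realtime_factor(num_samples: int, sample_rate: int, elapsed_s: float) -> float:
    """Return seconds of audio produced per second of wall-clock time."""
    if sample_rate <= 0 or elapsed_s <= 0:
        raise ValueError("sample_rate and elapsed_s must be positive")
    audio_seconds = num_samples / sample_rate
    return audio_seconds / elapsed_s


def timed_generate(
    generate: Callable[[], Sequence[float]], sample_rate: int
) -> Tuple[Sequence[float], float]:
    """Call a zero-argument synthesis function once; return (wav, speed-up)."""
    start = time.perf_counter()
    wav = generate()  # hypothetical: any callable that returns the waveform
    elapsed = time.perf_counter() - start
    return wav, realtime_factor(len(wav), sample_rate, elapsed)
```

With the quickstart objects this could be used as `wav, rtf = timed_generate(lambda: pipe.generate("Hello", voice="F1", lang="en"), pipe.sample_rate)`. The first call may include one-time warm-up, so timing a second call is usually more representative. If the pipeline hands back a lazily evaluated array, the callable would also need to force evaluation for the timing to be meaningful.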
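
The comments in the quickstart state that `voice` is one of F1..F5 or M1..M5 and that `lang` is an ISO 639-1 code. A small guard in front of the call can catch typos before any model work starts. This is a minimal sketch under those two stated constraints only: `VALID_VOICES`, `normalize_voice`, and `normalize_lang` are hypothetical names, the package may already perform its own validation, and the language check only verifies the two-letter shape, not whether the model supports that language.

```python
# Voice names listed in the quickstart comment: F1..F5 and M1..M5.
VALID_VOICES = tuple(f"{gender}{index}" for gender in ("F", "M") for index in range(1, 6))


def normalize_voice(voice: str) -> str:
    """Return the canonical upper-case voice name or raise ValueError."""
    name = voice.strip().upper()
    if name not in VALID_VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {', '.join(VALID_VOICES)}")
    return name


def normalize_lang(lang: str) -> str:
    """Return a lower-case two-letter code; checks only the ISO 639-1 shape."""
    code = lang.strip().lower()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected a two-letter ISO 639-1 code, got {lang!r}")
    return code
```

A call such as `pipe.generate(text, voice=normalize_voice("f1"), lang=normalize_lang("EN"))` would then pass `"F1"` and `"en"` through, while `normalize_voice("F6")` raises early with a readable message.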