How to use aufklarer/Fish-Audio-S2-Pro-MLX-fp16 with MLX:

    # Download the model from the Hub
    pip install "huggingface_hub[hf_xet]"
    huggingface-cli download --local-dir Fish-Audio-S2-Pro-MLX-fp16 aufklarer/Fish-Audio-S2-Pro-MLX-fp16

This is an fp16 artifact export of fishaudio/s2-pro for Apple Silicon runtime evaluation.
Runtime support is not implied by this model card. The bundle preserves upstream
key names and is intended for a speech-swift Swift/MLX port.
| Field | Value |
|---|---|
| Source | fishaudio/s2-pro |
| Source revision | 1de9996b6be38b745688de084d87a5633f714e4e |
| Format | MLX fp16 safetensors |
| License posture | research/non-commercial |
| Readiness | benchmark-only |
| Sample rate | See upstream/runtime implementation. |
| Voice conditioning | Fish Speech reference / speaker conditioning stack |
| Runtime status | benchmark artifact; not for default product integration |

| Field | Value |
|---|---|
| Marker syntax | free-form inline bracket tags |
| Supported markers | [pause], [emphasis], [laughing], [excited], [angry], [whisper], [screaming], [shouting], [surprised], [sad] |
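
The marker table above can be exercised with a small text-side helper. The following is a minimal sketch in Python, not part of the bundle or of any Fish Audio API: the function names (`find_markers`, `unlisted_markers`, `strip_markers`) are hypothetical, and lowercasing tag names is an assumption, since the card does not say whether markers are case-sensitive. Because the syntax is described as free-form, a tag outside the listed set is only reported, not treated as an error.

```python
import re

# Marker names copied from the "Supported markers" row of the table above.
SUPPORTED_MARKERS = frozenset({
    "pause", "emphasis", "laughing", "excited", "angry",
    "whisper", "screaming", "shouting", "surprised", "sad",
})

# Matches one inline bracket tag such as "[pause]"; the inner text
# may not itself contain brackets.
_TAG_RE = re.compile(r"\[([^\[\]]+)\]")


def find_markers(text):
    """Return (name, offset) for every bracket tag in text, in order.

    Names are stripped and lowercased (an assumption, see the lead-in).
    """
    return [(m.group(1).strip().lower(), m.start())
            for m in _TAG_RE.finditer(text)]


def unlisted_markers(text):
    """Return sorted tag names that appear in text but not in the table."""
    found = {name for name, _ in find_markers(text)}
    return sorted(found - SUPPORTED_MARKERS)


def strip_markers(text):
    """Remove all bracket tags and collapse the whitespace they leave."""
    return re.sub(r"\s+", " ", _TAG_RE.sub(" ", text)).strip()


prompt = "Wait [pause] did you hear that? [whisper] I think it's the [giggle] cat."
print(find_markers(prompt))
print(unlisted_markers(prompt))
print(strip_markers(prompt))
```

A runtime port could use something like `unlisted_markers` to warn about tags the table does not list, and `strip_markers` to recover the plain transcript for logging or evaluation.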
- config.json - root config for Hugging Face download tracking and runtime metadata
- soniqo_manifest.json - export manifest with source, marker, readiness, and file metadata
- *.safetensors - fp16-converted model weights with upstream key names preserved

These artifacts are for runtime implementation and evaluation. They are not a drop-in Transformers checkpoint.

    git clone https://huggingface.co/aufklarer/Fish-Audio-S2-Pro-MLX-fp16
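
Once the repository is on disk, the file layout described above can be sanity-checked without loading any weights. The sketch below is a hedged illustration in Python using only the standard library: it relies on the publicly documented safetensors layout (an 8-byte little-endian header length followed by a JSON header), and the helper names `read_safetensors_header` and `summarize_bundle` are hypothetical. It makes no assumption about the contents of config.json or soniqo_manifest.json and only checks that they exist.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Parse only the JSON header of a .safetensors file.

    No tensor data is read, so this is cheap even for large shards.
    """
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header size.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize_bundle(bundle_dir):
    """Report which expected files exist and which dtypes the weights use."""
    root = Path(bundle_dir)
    shards = sorted(root.glob("*.safetensors"))
    dtype_counts = {}
    keys = []
    for shard in shards:
        for name, info in read_safetensors_header(shard).items():
            keys.append(name)
            dtype_counts[info["dtype"]] = dtype_counts.get(info["dtype"], 0) + 1
    return {
        "has_config": (root / "config.json").is_file(),
        "has_manifest": (root / "soniqo_manifest.json").is_file(),
        "shards": [p.name for p in shards],
        "tensor_count": len(keys),
        "dtype_counts": dtype_counts,
        "sample_keys": keys[:5],  # a peek at the preserved upstream key names
    }
```

Calling `summarize_bundle("Fish-Audio-S2-Pro-MLX-fp16")` on the downloaded directory returns a small dictionary. For an fp16 export one would expect the dtype counts to be dominated by `F16`, though this card does not state whether every tensor was converted.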