Spaces: sonic-speech/README


flight505 committed about 1 month ago

Commit ecff087 (verified) · 1 parent: 5af3702

Update README.md

Files changed (1): README.md (+6 -7)

@@ -7,13 +7,13 @@ sdk: static
 pinned: false
 ---
-  # Sonic Speech
   Optimized speech models for Apple Silicon, powering [Sonic](https://github.com/flight505/sonic-workspace) — a local-first voice AI
   system. All models run entirely on-device using [MLX](https://github.com/ml-explore/mlx). No cloud, no API keys, no data leaves your
    Mac.
-  ## ASR — Parakeet TDT (NVIDIA, ported to MLX)
   SOTA English speech recognition with encoder-only mixed-precision quantization.
@@ -30,7 +30,7 @@ pinned: false
   **v3** supports 25 languages. **v2** is English-only. **INT8 recommended** — zero WER loss, 40% smaller, 30% faster.
-  ## TTS — Kokoro 82M (MLX)
   Fast text-to-speech with 32+ voices (American, British, Japanese, Chinese).
@@ -38,7 +38,7 @@ pinned: false
   |-------|------|------------|-------------|------------------|------|
   | [kokoro-82m-bf16](https://huggingface.co/sonic-speech/kokoro-82m-bf16) | ~170 MB | 47 ms | 224 ms | 126 ms | 41x |
-  ## Quantization Strategy
   Only the Conformer encoder (~85% of params) is quantized — the decoder stays BF16 for token precision.
@@ -47,7 +47,7 @@ pinned: false
   | INT8 | -40% | +30% | -58% | None |
   | INT4 | -61% | +34% | -67% | +0.4pp on real speech |
-  ## Quick Start
   ```python
   # ASR
@@ -59,5 +59,4 @@ pinned: false
   tts = SonicTTS(voice="af_heart")
   All benchmarks: Apple M3 Max 64 GB, macOS Sequoia, MLX 0.30.4. Built by https://huggingface.co/flight505.
-Edit this `README.md` markdown file to author your organization card.

 pinned: false
 ---
+# Sonic Speech
   Optimized speech models for Apple Silicon, powering [Sonic](https://github.com/flight505/sonic-workspace) — a local-first voice AI
   system. All models run entirely on-device using [MLX](https://github.com/ml-explore/mlx). No cloud, no API keys, no data leaves your
    Mac.
+## ASR — Parakeet TDT (NVIDIA, ported to MLX)
   SOTA English speech recognition with encoder-only mixed-precision quantization.
   **v3** supports 25 languages. **v2** is English-only. **INT8 recommended** — zero WER loss, 40% smaller, 30% faster.
+## TTS — Kokoro 82M (MLX)
   Fast text-to-speech with 32+ voices (American, British, Japanese, Chinese).
   |-------|------|------------|-------------|------------------|------|
   | [kokoro-82m-bf16](https://huggingface.co/sonic-speech/kokoro-82m-bf16) | ~170 MB | 47 ms | 224 ms | 126 ms | 41x |
+## Quantization Strategy
   Only the Conformer encoder (~85% of params) is quantized — the decoder stays BF16 for token precision.
   | INT8 | -40% | +30% | -58% | None |
   | INT4 | -61% | +34% | -67% | +0.4pp on real speech |
+## Quick Start
   ```python
   # ASR
   tts = SonicTTS(voice="af_heart")
   All benchmarks: Apple M3 Max 64 GB, macOS Sequoia, MLX 0.30.4. Built by https://huggingface.co/flight505.
+  ```
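
The size figures in the Quantization Strategy table can be sanity-checked with a little arithmetic. The sketch below is not part of the README. It assumes affine group quantization with a group size of 64 and one 16-bit scale plus one 16-bit bias stored per group, and it takes the README's "~85% of params" figure at face value. The function name `size_change` and its parameters are hypothetical, chosen only for illustration.

```python
def size_change(encoder_frac: float, bits: int,
                group_size: int = 64, base_bits: int = 16,
                overhead_bits: int = 32) -> float:
    """Estimate the relative model-size change when only the encoder is quantized.

    encoder_frac  -- fraction of parameters in the encoder (README: ~0.85)
    bits          -- quantized weight width (8 for INT8, 4 for INT4)
    group_size    -- weights sharing one scale/bias pair (assumed: 64)
    base_bits     -- width of the unquantized weights (BF16 = 16)
    overhead_bits -- per-group scale + bias storage (assumed: 2 x 16 bits)
    """
    # Effective bits per quantized weight, including per-group metadata.
    effective_bits = bits + overhead_bits / group_size
    # The encoder shrinks; the decoder stays at full BF16 width.
    ratio = encoder_frac * effective_bits / base_bits + (1.0 - encoder_frac)
    return ratio - 1.0


for name, bits in [("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {size_change(0.85, bits):+.1%}")
```

Under these assumptions the estimate lands within a percentage point of the reported -40% (INT8) and -61% (INT4), which is consistent with only the encoder being quantized while the decoder stays in BF16.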