# ThreadCast: Android Production Zips
Distributed mirror for the Android app's neural TTS assets. The app downloads the eight zips here at runtime: the first install of a neural engine pulls only what's needed (~74 MB for one Piper voice plus shared data, ~26 MB for the Plus bundle, or ~145 MB for the full Kokoro bundle), and the user can manage each model individually from inside the app.
> Sibling: **`../extension/`** holds the Chrome extension's neural models in raw HF format. Both subtrees share the same engine families (Piper VITS, KittenTTS-nano, Kokoro StyleTTS2); only the on-disk packaging differs.
---
## Layout
```
mobile-android/
└── v1/
    ├── threadcast-piper-shared-v1.zip                   (~11 MB)   # espeak phonemizer data, downloaded once
    ├── threadcast-piper-en_US-amy-medium-v1.zip         (~63 MB)   # Amy voice
    ├── threadcast-piper-en_US-lessac-medium-v1.zip      (~63 MB)   # Lessac voice
    ├── threadcast-piper-en_US-ryan-medium-v1.zip        (~63 MB)   # Ryan voice
    ├── threadcast-piper-en_US-hfc_female-medium-v1.zip  (~63 MB)   # HFC Female voice
    ├── threadcast-piper-en_US-hfc_male-medium-v1.zip    (~63 MB)   # HFC Male voice
    ├── threadcast-kitten-nano-en-v1.zip                 (~26 MB)   # KittenTTS nano v0.1 fp16 (all 8 voices; "Local AI Plus")
    └── threadcast-kokoro-int8-en-v1.zip                 (~145 MB)  # Kokoro int8 v0.19 (all 11 voices; "Local AI Studio")
```
**Versioning:** the `v1/` segment is part of the URL the runtime requests. Bumping to `v2/` lets future format changes ship without breaking older app builds: old apps keep pulling `v1/`, new apps pull `v2/`.
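The versioning scheme can be sketched as a small URL builder. This is an illustration only: `REPO_ROOT` and `assetUrl` are hypothetical names, not identifiers from the app, and the sketch assumes the version is always a single `v<N>/` path segment.

```typescript
// Hypothetical helper: names are illustrative, not taken from the app's source.
const REPO_ROOT =
  "https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android";

// Build the download URL for one zip under a given format version.
// The version is its own path segment, so v1 and v2 assets never collide.
function assetUrl(formatVersion: number, filename: string, root: string = REPO_ROOT): string {
  return `${root}/v${formatVersion}/${filename}`;
}
```

An app build pinned to format version 1 would always call `assetUrl(1, …)`, which is why publishing a `v2/` tree cannot break it.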
---
## What's inside each zip
The native [`AssetInstaller`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/mobile/modules/threadcast-neural/android/src/main/java/app/threadcast/neural/AssetInstaller.kt) extracts each zip directly under `filesDir/sherpa-piper/` (Piper) or `filesDir/sherpa-kokoro/` (Kokoro), so each zip's internal layout is exactly the on-device layout. No re-rooting, no per-file rules.
### Piper: shared espeak data
```
threadcast-piper-shared-v1.zip
└── espeak-ng-data/
    ├── phontab
    ├── phonindex
    ├── phondata
    ├── intonations
    ├── lang/
    ├── voices/
    └── … (full espeak-ng tree)
```
Downloaded once on the user's first Piper voice install. Skipped on every subsequent voice install.
### Piper: one zip per voice
```
threadcast-piper-en_US-amy-medium-v1.zip
└── en_US-amy-medium/
    ├── en_US-amy-medium.onnx  (~63 MB)
    └── tokens.txt
```
Five zips total, one per voice. Users download only the voices they want; selecting Amy doesn't pull Ryan.
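The "shared data once, then one zip per voice" rule could look like the following sketch. The function names and the `sharedInstalled` flag are assumptions made for illustration; only the zip filenames come from the layout above.

```typescript
// The five Piper voices listed in the layout above.
type PiperVoice = "amy" | "lessac" | "ryan" | "hfc_female" | "hfc_male";

const PIPER_SHARED_ZIP = "threadcast-piper-shared-v1.zip";

// Hypothetical helper: derive a voice's zip name from the naming pattern.
function piperVoiceZip(voice: PiperVoice): string {
  return `threadcast-piper-en_US-${voice}-medium-v1.zip`;
}

// Hypothetical helper: list the zips one Piper voice install must download.
// The shared espeak data is included only if it is not on the device yet.
function zipsForPiperInstall(voice: PiperVoice, sharedInstalled: boolean): string[] {
  const zips: string[] = [];
  if (!sharedInstalled) {
    zips.push(PIPER_SHARED_ZIP);
  }
  zips.push(piperVoiceZip(voice));
  return zips;
}
```

A first install of Amy yields two zips; a later install of Lessac yields only the Lessac zip.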
### Kitten (Local AI Plus): single bundle, all voices
```
threadcast-kitten-nano-en-v1.zip
├── espeak-ng-data/   (separate copy from Piper's / Kokoro's; different on-disk root)
│   └── …
├── model.fp16.onnx   (~24 MB)
├── voices.bin        (~30 KB; 8 speaker embeddings)
└── tokens.txt
```
One download serves all 8 Plus voices, using the same style-vector-lookup pattern as Kokoro. The 8 speakers are baked into `voices.bin` in the order documented in `packages/mobile/modules/threadcast-neural/index.ts::KITTEN_VOICES`. **Never write the upstream engine codename in user-facing surfaces**: the Android UI labels this engine as "Local AI Plus" everywhere.
### Kokoro (Local AI Studio): single bundle, all voices
```
threadcast-kokoro-int8-en-v1.zip
├── espeak-ng-data/   (separate copy from Piper's; different on-disk root)
│   └── …
├── model.int8.onnx   (~135 MB)
├── voices.bin        (~5.7 MB; concatenated speaker embeddings, all 11)
└── tokens.txt
```
One download serves every Kokoro voice; switching speakers is a free style-vector lookup at synth time.
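To show why a speaker switch is cheap, here is a minimal sketch of a lookup into concatenated embeddings. It assumes, without confirmation from the source, that `voices.bin` decodes to a flat float32 array of equal-sized blocks, one per speaker; the real file layout and the sherpa-onnx loading code may differ. `styleBlockForSpeaker` is a hypothetical name.

```typescript
// ASSUMPTION: `voices` is the decoded voices.bin, laid out as `speakerCount`
// contiguous float32 blocks of equal length. The real format may differ.
function styleBlockForSpeaker(
  voices: Float32Array,
  speakerCount: number,
  speakerIndex: number,
): Float32Array {
  if (!Number.isInteger(speakerIndex) || speakerIndex < 0 || speakerIndex >= speakerCount) {
    throw new RangeError(`speaker index ${speakerIndex} is out of range`);
  }
  if (voices.length % speakerCount !== 0) {
    throw new Error("voices length is not a multiple of the speaker count");
  }
  const blockLength = voices.length / speakerCount;
  // subarray() returns a view on the same buffer, so nothing is copied.
  return voices.subarray(speakerIndex * blockLength, (speakerIndex + 1) * blockLength);
}
```

Under that assumption, switching voices is just an index computation on data already in memory, with no model reload and no extra download.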
---
## How the runtime fetches these
**Mirror-and-fallback**: the same pattern the extension uses for its `Pixel-Labs/threadcast-neural-models/{hf-cpu-mirror,hf-gpu-mirror}/` paths (see [`extension/src/offscreen/mirror-fetch.ts`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/extension/src/offscreen/mirror-fetch.ts)). Each download is an ordered list of URLs; the installer moves to the next URL when one fails and surfaces an error only if every URL is unreachable.
```
urls[0] = PRIMARY_BASE = https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1
urls[1] = FALLBACK_BASE = (configurable: sibling HF repo, GitHub Release, private CDN, …)
```
Both bases need to serve the **same filenames** (the eight listed above). The fallback host is configured at app build time:
| Env var (set at build time) | Default | Purpose |
|---|---|---|
| `EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL` | `https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1` | Primary mirror |
| `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` | unset | Optional fallback (omit for primary-only) |
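As a sketch of how the two variables in the table could turn into the ordered URL list, consider the following. The `NeuralAssetEnv` interface and `downloadUrls` function are hypothetical; the env object is passed in explicitly rather than read from the build environment, and trailing-slash trimming is an added convenience, not documented behaviour.

```typescript
const DEFAULT_PRIMARY =
  "https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1";

// Hypothetical shape: the two build-time variables from the table above.
interface NeuralAssetEnv {
  EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL?: string;
  EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL?: string;
}

// Hypothetical helper: ordered candidate URLs for one zip, primary first.
// The fallback entry is present only when the fallback variable is set.
function downloadUrls(env: NeuralAssetEnv, filename: string): string[] {
  const bases: string[] = [env.EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL || DEFAULT_PRIMARY];
  if (env.EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL) {
    bases.push(env.EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL);
  }
  // Trim trailing slashes so the join never produces a double slash.
  return bases.map((base) => `${base.replace(/\/+$/, "")}/${filename}`);
}
```

With both variables unset the list has one entry, which matches the "omit for primary-only" case.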
The native installer:
- Tries `urls[0]` first. On HTTP 4xx/5xx or transport error, logs a warning and tries `urls[1]`.
- Only the LAST URL's failure surfaces to the UI as a download error, so a transient outage on an earlier mirror stays silent.
- Cancellation aborts immediately at any URL boundary.
- Verifies post-extract that every required file exists with non-zero size; rejects malformed zips even if the HTTP layer succeeded.
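The first three behaviours above can be sketched as a loop. This is not the native Kotlin installer: `downloadWithFallback`, the injected `download` callback, and the `isCancelled` and `warn` hooks are all hypothetical, and post-extract verification is left out.

```typescript
// Hypothetical callback: resolves on success, rejects on HTTP or transport error.
type Download = (url: string) => Promise<void>;

// Try each URL in order and return the one that succeeded.
// Only the last failure is thrown to the caller; earlier ones are just logged.
async function downloadWithFallback(
  urls: string[],
  download: Download,
  isCancelled: () => boolean = () => false,
  warn: (message: string) => void = () => {},
): Promise<string> {
  let lastError: unknown = new Error("no URLs configured");
  for (const url of urls) {
    // Cancellation is checked at every URL boundary.
    if (isCancelled()) {
      throw new Error("cancelled");
    }
    try {
      await download(url);
      return url;
    } catch (err) {
      lastError = err;
      warn(`download failed for ${url}`);
    }
  }
  throw lastError;
}
```

Injecting `download` keeps the loop independent of any particular HTTP client, so the ordering and error-surfacing rules can be checked in isolation.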
---
## Publishing workflow
1. **Produce the zips** from the local sherpa-onnx staging tree:
   ```sh
   pnpm --filter mobile produce:neural-zips
   ```
   Writes to `packages/mobile/dist/neural-assets/`. The script reads the locally staged sherpa-onnx upstream artifacts (Piper voice ONNX files + KittenTTS-nano v0.1 fp16 bundle + Kokoro int8 v0.19 bundle + shared espeak-ng-data) and emits the eight zips with the correct internal layouts.
2. **Upload to Hugging Face.** The user-facing primary mirror lives at:
   <https://huggingface.co/Pixel-Labs/threadcast-neural-models/tree/main/mobile-android/v1>
   Drop all eight zips into that tree. HF preserves filenames as-is. No build steps, no metadata munging: they're just static assets behind a CDN.
3. **(Optional) Mirror to a fallback host.** Same eight filenames at any HTTPS endpoint. Common picks:
   - Sibling HF repo (`Pixel-Labs/threadcast-neural-models-mirror/v1/...`)
   - GitHub Release with `gh release upload v1 dist/neural-assets/*.zip`
   - Private CDN (R2, S3, etc.)

   Then ship the next app build with `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` set to the mirror's base URL.
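Because both mirrors must serve the same eight filenames, a pre-upload check along these lines may be useful. It is a sketch, not part of the repo's tooling: `EXPECTED_ZIPS` is copied from the layout above and `missingZips` is a hypothetical helper that takes a directory listing as plain strings.

```typescript
// The eight filenames from the layout section.
const EXPECTED_ZIPS: readonly string[] = [
  "threadcast-piper-shared-v1.zip",
  "threadcast-piper-en_US-amy-medium-v1.zip",
  "threadcast-piper-en_US-lessac-medium-v1.zip",
  "threadcast-piper-en_US-ryan-medium-v1.zip",
  "threadcast-piper-en_US-hfc_female-medium-v1.zip",
  "threadcast-piper-en_US-hfc_male-medium-v1.zip",
  "threadcast-kitten-nano-en-v1.zip",
  "threadcast-kokoro-int8-en-v1.zip",
];

// Hypothetical helper: which expected zips are absent from a listing?
function missingZips(present: readonly string[]): string[] {
  const have = new Set(present);
  return EXPECTED_ZIPS.filter((name) => !have.has(name));
}
```

An empty result for the listing of `dist/neural-assets/` (or of a mirror) suggests the set is complete; it says nothing about whether each zip's contents are valid.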
---
## Per-engine download cost (user-facing)
| User intent | Files pulled | Network |
|---|---|---|
| Install **first** Local AI Lite voice (e.g. Amy) | `threadcast-piper-shared-v1.zip` + `threadcast-piper-en_US-amy-medium-v1.zip` | ~74 MB |
| Install **another** Local AI Lite voice (e.g. Lessac) | `threadcast-piper-en_US-lessac-medium-v1.zip` | ~63 MB |
| Install Local AI Plus | `threadcast-kitten-nano-en-v1.zip` | ~26 MB |
| Install Local AI Studio | `threadcast-kokoro-int8-en-v1.zip` | ~145 MB |
| Install every engine, all 5 Lite voices + Plus + Studio | every zip in `v1/` | ~496 MB |
The whole-bundle worst case is comparable to Spotify's "download an album for offline" workflow. Most users will pick one tier and stop there; Plus is the sweet spot at 26 MB for an 8-voice multi-speaker model.
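The table's totals can be reproduced from the rounded per-zip sizes. The sketch below uses the approximate sizes from the layout section; `ZIP_SIZE_MB` and `downloadCostMb` are hypothetical names. Note that the rounded figures sum to 497 MB for every zip, one more than the table's ~496 MB, presumably because the table was computed from unrounded sizes.

```typescript
// Approximate sizes in MB, as listed in the layout section.
const ZIP_SIZE_MB: Record<string, number> = {
  "threadcast-piper-shared-v1.zip": 11,
  "threadcast-piper-en_US-amy-medium-v1.zip": 63,
  "threadcast-piper-en_US-lessac-medium-v1.zip": 63,
  "threadcast-piper-en_US-ryan-medium-v1.zip": 63,
  "threadcast-piper-en_US-hfc_female-medium-v1.zip": 63,
  "threadcast-piper-en_US-hfc_male-medium-v1.zip": 63,
  "threadcast-kitten-nano-en-v1.zip": 26,
  "threadcast-kokoro-int8-en-v1.zip": 145,
};

// Hypothetical helper: sum the approximate download size of a set of zips.
function downloadCostMb(zips: readonly string[]): number {
  let total = 0;
  for (const zip of zips) {
    const size = ZIP_SIZE_MB[zip];
    if (size === undefined) {
      throw new Error(`unknown zip: ${zip}`);
    }
    total += size;
  }
  return total;
}

// First Lite voice: shared data plus Amy, 11 + 63 = 74 MB.
const firstLiteVoiceMb = downloadCostMb([
  "threadcast-piper-shared-v1.zip",
  "threadcast-piper-en_US-amy-medium-v1.zip",
]);
```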
---
## License
Per-project licenses are retained from upstream; see the [parent README](../README.md#license) for the consolidated summary.