Upload README.md
mobile-android/README.md
# ThreadCast: Android Production Zips

Distributed mirror for the Android app's neural TTS assets. The app downloads the seven zips here at runtime. The first install of a neural engine pulls only what's needed (~74 MB for one Piper voice plus shared data, or ~145 MB for the full Kokoro bundle), and the user can manage each model individually from inside the app.

> Sibling layout:
>
> - **`../extension/`**: Chrome extension models (raw HF format, published).
> - **`../android/`**: local dev staging for sherpa-onnx upstream artifacts (not published).
> - **`./mobile-android/` (you are here)**: the zips actually shipped to users (published to HF).

---

## Layout

```
mobile-android/
└── v1/
    ├── threadcast-piper-shared-v1.zip                    (~11 MB)  - espeak phonemizer data, downloaded once
    ├── threadcast-piper-en_US-amy-medium-v1.zip          (~63 MB)  - Amy voice
    ├── threadcast-piper-en_US-lessac-medium-v1.zip       (~63 MB)  - Lessac voice
    ├── threadcast-piper-en_US-ryan-medium-v1.zip         (~63 MB)  - Ryan voice
    ├── threadcast-piper-en_US-hfc_female-medium-v1.zip   (~63 MB)  - HFC Female voice
    ├── threadcast-piper-en_US-hfc_male-medium-v1.zip     (~63 MB)  - HFC Male voice
    └── threadcast-kokoro-int8-en-v1.zip                  (~145 MB) - Kokoro int8 v0.19 (all 11 voices)
```

**Versioning:** the `v1/` segment is part of the URL the runtime requests. Bumping to `v2/` lets future format changes ship without breaking older app builds: old apps keep pulling `v1/`, new apps pull `v2/`.
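As a rough illustration of how the version segment ends up in the request URL, here is a minimal Python sketch. The `asset_url` helper and the `REPO_ROOT` constant are hypothetical names introduced for this example; the app's real URL construction lives in its own code and may differ. Whether zip filenames would also change under `v2/` is not specified here.

```python
# Hypothetical helper: joins the mirror root, a version segment, and a filename.
REPO_ROOT = (
    "https://huggingface.co/Pixel-Labs/threadcast-neural-models"
    "/resolve/main/mobile-android"
)


def asset_url(filename: str, version: str = "v1", root: str = REPO_ROOT) -> str:
    """Return the download URL for one zip under the given version folder."""
    return f"{root.rstrip('/')}/{version}/{filename}"


# An older build keeps requesting v1/, a newer build could request v2/.
old_build = asset_url("threadcast-kokoro-int8-en-v1.zip")
new_build = asset_url("threadcast-kokoro-int8-en-v1.zip", version="v2")
```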

---

## What's inside each zip

The native [`AssetInstaller`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/mobile/modules/threadcast-neural/android/src/main/java/app/threadcast/neural/AssetInstaller.kt) extracts each zip directly under `filesDir/sherpa-piper/` (Piper) or `filesDir/sherpa-kokoro/` (Kokoro), so the zip's internal layout is the on-device layout. No re-rooting, no per-file rules.
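The "zip layout is the on-device layout" idea can be sketched in a few lines of Python. This is not the Kotlin `AssetInstaller`; the `install_zip` function and the `ENGINE_ROOTS` mapping are assumptions made for illustration, and only the two directory names come from the text above.

```python
import zipfile
from pathlib import Path

# Engine name -> extraction root under the app's files directory (names from this README).
ENGINE_ROOTS = {"piper": "sherpa-piper", "kokoro": "sherpa-kokoro"}


def install_zip(zip_path, files_dir, engine: str) -> Path:
    """Extract a zip as-is under the engine's root; no re-rooting or per-file rules."""
    root = Path(files_dir) / ENGINE_ROOTS[engine]
    root.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(root)  # entries keep the paths they have inside the zip
    return root
```

With this shape, a voice zip containing `en_US-amy-medium/tokens.txt` would land at `filesDir/sherpa-piper/en_US-amy-medium/tokens.txt`.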

### Piper: shared espeak data

```
threadcast-piper-shared-v1.zip
└── espeak-ng-data/
    ├── phontab
    ├── phonindex
    ├── phondata
    ├── intonations
    ├── lang/
    ├── voices/
    └── … (full espeak-ng tree)
```

Downloaded once on the user's first Piper voice install. Skipped on every subsequent voice install.

### Piper: one zip per voice

```
threadcast-piper-en_US-amy-medium-v1.zip
└── en_US-amy-medium/
    ├── en_US-amy-medium.onnx   (~63 MB)
    └── tokens.txt
```

Five zips total, one per voice. Users only download the voices they want; selecting Amy doesn't pull Ryan.
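The download rule for Piper (shared data once, then one zip per selected voice) could be expressed as below. This is a sketch under assumptions: `zips_to_download` and its `shared_installed` flag are hypothetical, and how the app actually tracks installed state is not described here.

```python
SHARED_ZIP = "threadcast-piper-shared-v1.zip"


def piper_voice_zip(voice: str) -> str:
    """Filename for one Piper voice, following the naming used in this folder."""
    return f"threadcast-piper-en_US-{voice}-medium-v1.zip"


def zips_to_download(voice: str, shared_installed: bool) -> list:
    """Shared espeak data is fetched only if it is not on the device yet."""
    zips = [] if shared_installed else [SHARED_ZIP]
    zips.append(piper_voice_zip(voice))
    return zips


first_install = zips_to_download("amy", shared_installed=False)
second_install = zips_to_download("lessac", shared_installed=True)
```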

### Kokoro: single bundle, all voices

```
threadcast-kokoro-int8-en-v1.zip
├── espeak-ng-data/     (separate copy from Piper's; different on-disk root)
│   └── …
├── model.int8.onnx     (~135 MB)
├── voices.bin          (~5.7 MB, concatenated speaker embeddings, all 11)
└── tokens.txt
```

One download serves every Kokoro voice: switching speakers is only a style-vector lookup at synthesis time and needs no extra download.

---

## How the runtime fetches these

**Mirror-and-fallback**: the same pattern the extension uses for its `Pixel-Labs/threadcast-neural-models/{hf-cpu-mirror,hf-gpu-mirror}/` paths (see [`extension/src/offscreen/mirror-fetch.ts`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/extension/src/offscreen/mirror-fetch.ts)). Each download is an ordered list of URLs; the installer moves to the next URL when one fails and only surfaces an error if every URL is unreachable.

```
urls[0] = PRIMARY_BASE  = https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1
urls[1] = FALLBACK_BASE = (configurable: sibling HF repo, GitHub Release, private CDN, …)
```

Both bases need to serve the **same filenames** (the seven listed above). The fallback host is configured at app build time:

| Env var (set at build time) | Default | Purpose |
|---|---|---|
| `EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL` | `https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1` | Primary mirror |
| `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` | unset | Optional fallback (omit for primary-only) |
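To make the table concrete, the following Python sketch resolves the ordered base-URL list from the two variables. In the real app these values are baked in at build time by Expo rather than read at runtime, so treat `resolve_base_urls` as a hypothetical stand-in that only mirrors the defaults shown above.

```python
import os

DEFAULT_PRIMARY = (
    "https://huggingface.co/Pixel-Labs/threadcast-neural-models"
    "/resolve/main/mobile-android/v1"
)


def resolve_base_urls(env=None) -> list:
    """Primary base first; the fallback is appended only when it is set."""
    env = os.environ if env is None else env
    primary = env.get("EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL") or DEFAULT_PRIMARY
    fallback = env.get("EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL")
    return [primary] + ([fallback] if fallback else [])
```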

The native installer:

- Tries `urls[0]` first. On HTTP 4xx/5xx or a transport error, it logs a warning and tries `urls[1]`.
- Surfaces only the last URL's failure to the UI as a download error, so transient outages of an earlier mirror stay silent.
- Aborts immediately on cancellation at any URL boundary.
- Verifies after extraction that every required file exists with non-zero size, and rejects malformed zips even if the HTTP layer succeeded.
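The behaviour in the list above can be sketched as follows. This is an illustrative Python version, not the Kotlin installer: `download_with_fallback`, `verify_extracted`, the injected `fetch` callable, and both exception classes are hypothetical names chosen for the example.

```python
from pathlib import Path


class DownloadError(Exception):
    """Raised only after every URL in the list has failed."""


class Cancelled(Exception):
    """Raised when cancellation is observed at a URL boundary."""


def download_with_fallback(urls, fetch, is_cancelled=lambda: False, log=print):
    """Try each URL in order; earlier failures are logged, the last one is raised."""
    last_error = None
    for index, url in enumerate(urls):
        if is_cancelled():  # checked before each attempt, i.e. at URL boundaries
            raise Cancelled(url)
        try:
            return fetch(url)
        except Exception as exc:  # HTTP 4xx/5xx or transport error in a real fetch
            last_error = exc
            if index < len(urls) - 1:
                log(f"warning: {url} failed ({exc}); trying next mirror")
    raise DownloadError(f"all mirrors failed: {last_error}") from last_error


def verify_extracted(root, required):
    """Return the required files that are missing or empty under root."""
    root = Path(root)
    return [
        name
        for name in required
        if not ((root / name).is_file() and (root / name).stat().st_size > 0)
    ]
```

Passing `fetch` in as a parameter keeps the retry logic independent of any particular HTTP client, which also makes the fallback order easy to exercise with a fake.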

---

## Publishing workflow

1. **Produce the zips** from `../android/`:

   ```sh
   pnpm --filter mobile produce:neural-zips
   ```

   Writes to `packages/mobile/dist/neural-assets/`. The script reads sherpa-onnx upstream artifacts in `../android/{neural-28m,neural-82m}/` and emits the seven zips with the correct internal layouts.

2. **Copy into this folder** (for local convenience and snapshotting):

   ```sh
   cp packages/mobile/dist/neural-assets/*.zip "AI Neural Models/mobile-android/v1/"
   ```

3. **Upload to Hugging Face.** The user-facing primary mirror lives at:

   <https://huggingface.co/Pixel-Labs/threadcast-neural-models/tree/main/mobile-android/v1>

   Drop all seven zips into that tree. HF preserves filenames as-is. No build steps, no metadata munging: they're just static assets behind a CDN.

4. **(Optional) Mirror to a fallback host.** Same seven filenames at any HTTPS endpoint. Common picks:
   - Sibling HF repo (`Pixel-Labs/threadcast-neural-models-mirror/v1/...`)
   - GitHub Release with `gh release upload v1 dist/neural-assets/*.zip`
   - Private CDN (R2, S3, etc.)

   Then ship the next app build with `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` set to the mirror's base URL.
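Because both mirrors must serve the same seven filenames, a quick pre-upload check of the output directory can catch a missing or misnamed zip. The script below is a hypothetical helper, not part of the repository's tooling; the expected names are taken from the Layout section.

```python
from pathlib import Path

PIPER_VOICES = ["amy", "lessac", "ryan", "hfc_female", "hfc_male"]
EXPECTED_ZIPS = sorted(
    ["threadcast-piper-shared-v1.zip", "threadcast-kokoro-int8-en-v1.zip"]
    + [f"threadcast-piper-en_US-{voice}-medium-v1.zip" for voice in PIPER_VOICES]
)


def check_release_dir(directory):
    """Return (missing, unexpected) zip filenames for a directory about to be uploaded."""
    present = {path.name for path in Path(directory).glob("*.zip")}
    missing = sorted(set(EXPECTED_ZIPS) - present)
    unexpected = sorted(present - set(EXPECTED_ZIPS))
    return missing, unexpected
```

Running it against `packages/mobile/dist/neural-assets/` before step 3 should return two empty lists when the build produced exactly the seven zips.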

---

## Per-engine download cost (user-facing)

| User intent | Files pulled | Network |
|---|---|---|
| Install **first** Piper voice (e.g. Amy) | `threadcast-piper-shared-v1.zip` + `threadcast-piper-en_US-amy-medium-v1.zip` | ~74 MB |
| Install **another** Piper voice (e.g. Lessac) | `threadcast-piper-en_US-lessac-medium-v1.zip` | ~63 MB |
| Install Local AI Studio (Kokoro) | `threadcast-kokoro-int8-en-v1.zip` | ~145 MB |
| Install both engines, all 5 Piper voices + Kokoro | every zip in `v1/` | ~470 MB |

The whole-bundle worst case is comparable to Spotify's "download an album for offline" workflow. Most users will probably pick one Piper voice OR Kokoro and stop there.
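The figures in the table follow directly from the approximate per-zip sizes in the Layout section. A small Python check of that arithmetic, with `download_cost_mb` as a hypothetical helper and all sizes being the rounded values quoted in this README:

```python
PIPER_VOICES = ["amy", "lessac", "ryan", "hfc_female", "hfc_male"]

# Approximate sizes in MB, as listed in the Layout section.
SIZES_MB = {"threadcast-piper-shared-v1.zip": 11, "threadcast-kokoro-int8-en-v1.zip": 145}
SIZES_MB.update({f"threadcast-piper-en_US-{v}-medium-v1.zip": 63 for v in PIPER_VOICES})


def download_cost_mb(filenames) -> int:
    """Sum the approximate sizes of the zips a given install would pull."""
    return sum(SIZES_MB[name] for name in filenames)


first_piper_voice = download_cost_mb(
    ["threadcast-piper-shared-v1.zip", "threadcast-piper-en_US-amy-medium-v1.zip"]
)  # 11 + 63 = 74
everything = download_cost_mb(SIZES_MB)  # 11 + 5 * 63 + 145 = 471, i.e. the "~470 MB" row
```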

---

## License

Per-project licenses are retained from upstream; see the [parent README](../README.md#license) for the consolidated summary.