Pixel-Labs/threadcast-neural-models

@@ -1,137 +1,94 @@
-# ThreadCast — Android Production Zips
-Distributed mirror for the Android app's neural TTS assets. The seven zips here are downloaded by the app at runtime — first install of a neural engine pulls only what's needed (~74 MB for one Piper voice + shared data, or ~145 MB for the full Kokoro bundle), and the user can manage each model individually from inside the app.
-> Sibling: **`../extension/`** holds the Chrome extension's neural models in raw HF format. Both subtrees share the same engine families (Piper VITS, Kokoro StyleTTS2) — only the on-disk packaging differs.
 ---
-## Layout
-```
-mobile-android/
-└── v1/
-    ├── threadcast-piper-shared-v1.zip                    (~11 MB) — espeak phonemizer data, downloaded once
-    ├── threadcast-piper-en_US-amy-medium-v1.zip          (~63 MB) — Amy voice
-    ├── threadcast-piper-en_US-lessac-medium-v1.zip       (~63 MB) — Lessac voice
-    ├── threadcast-piper-en_US-ryan-medium-v1.zip         (~63 MB) — Ryan voice
-    ├── threadcast-piper-en_US-hfc_female-medium-v1.zip   (~63 MB) — HFC Female voice
-    ├── threadcast-piper-en_US-hfc_male-medium-v1.zip     (~63 MB) — HFC Male voice
-    └── threadcast-kokoro-int8-en-v1.zip                  (~145 MB) — Kokoro int8 v0.19 (all 11 voices)
-```
-**Versioning:** the `v1/` segment is part of the URL the runtime requests. Bumping to `v2/` lets future format changes ship without breaking older app builds — old apps keep pulling `v1/`, new apps pull `v2/`.
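As a rough illustration of the versioning scheme above, the sketch below joins a mirror root, a version segment, and a zip filename into one download URL. The helper name `assetUrl`, the `MIRROR_ROOT` constant, and the `-v2` filename are hypothetical and not taken from the app's source; only the Hugging Face path and the `v1` filename come from the layout above.

```typescript
// Hypothetical helper (not from the ThreadCast codebase): joins a mirror root,
// a format-version segment, and a zip filename into one download URL.
const MIRROR_ROOT =
  "https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android";

function assetUrl(root: string, version: string, filename: string): string {
  // Strip trailing slashes so the join never yields "//".
  const trimmed = root.replace(/\/+$/, "");
  return `${trimmed}/${version}/${encodeURIComponent(filename)}`;
}

// An older build pinned to "v1" and a newer build pinned to "v2" request
// different paths, so both can be served side by side.
const v1Url = assetUrl(MIRROR_ROOT, "v1", "threadcast-piper-shared-v1.zip");
// The "-v2" filename is an assumption made purely for illustration.
const v2Url = assetUrl(MIRROR_ROOT, "v2", "threadcast-piper-shared-v2.zip");
```

Because the version lives in the path rather than in app logic, a format change only needs a new directory on the mirror; nothing under `v1/` has to move.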
 ---
-## What's inside each zip
-The native [`AssetInstaller`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/mobile/modules/threadcast-neural/android/src/main/java/app/threadcast/neural/AssetInstaller.kt) extracts each zip directly under `filesDir/sherpa-piper/` (Piper) or `filesDir/sherpa-kokoro/` (Kokoro), so the zip's internal layout = the on-device layout. No re-rooting, no per-file rules.
-### Piper — shared espeak data
-```
-threadcast-piper-shared-v1.zip
-└── espeak-ng-data/
-    ├── phontab
-    ├── phonindex
-    ├── phondata
-    ├── intonations
-    ├── lang/
-    ├── voices/
-    └── … (full espeak-ng tree)
-```
-Downloaded once on the user's first Piper voice install. Skipped on every subsequent voice install.
-### Piper — one zip per voice
-```
-threadcast-piper-en_US-amy-medium-v1.zip
-└── en_US-amy-medium/
-    ├── en_US-amy-medium.onnx        (~63 MB)
-    └── tokens.txt
-```
-Five zips total — one per voice. Users only download the voices they want; selecting Amy doesn't pull Ryan.
-### Kokoro — single bundle, all voices
-```
-threadcast-kokoro-int8-en-v1.zip
-├── espeak-ng-data/                  (separate copy from Piper's — different on-disk root)
-│   └── …
-├── model.int8.onnx                  (~135 MB)
-├── voices.bin                       (~5.7 MB — concatenated speaker embeddings, all 11)
-└── tokens.txt
-```
-One download serves every Kokoro voice — switching speakers is a free style-vector lookup at synth time.
 ---
-## How the runtime fetches these
-**Mirror-and-fallback**: the same pattern the extension uses for its `Pixel-Labs/threadcast-neural-models/{hf-cpu-mirror,hf-gpu-mirror}/` paths (see [`extension/src/offscreen/mirror-fetch.ts`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/extension/src/offscreen/mirror-fetch.ts)). Each download is an ordered list of URLs; the installer tries them in order, moving to the next only when the current one fails, and surfaces an error only if every URL fails.
 ```
-urls[0] = PRIMARY_BASE   = https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1
-urls[1] = FALLBACK_BASE  = (configurable — sibling HF repo, GitHub Release, private CDN, …)
 ```
-Both bases need to serve the **same filenames** (the seven listed above). The fallback host is configured at app build time:
-| Env var (set at build time) | Default | Purpose |
-|---|---|---|
-| `EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL`     | `https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1` | Primary mirror |
-| `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` | unset | Optional fallback (omit for primary-only) |
-The native installer:
-- Tries `urls[0]` first. On HTTP 4xx/5xx or transport error, logs a warning and tries `urls[1]`.
-- Only the LAST URL's failure surfaces to the UI as a download error — transient mirror outages are silent.
-- Cancellation aborts immediately at any URL boundary.
-- Verifies post-extract that every required file exists with non-zero size; rejects malformed zips even if the HTTP layer succeeded.
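The ordered-URL behaviour in the list above can be sketched roughly as follows. This is a minimal illustration, not the native installer's code: `fetchWithFallback`, `mirrorUrls`, and the injected `attempt` callback are hypothetical names, and cancellation and post-extract verification are left out.

```typescript
// Sketch of the mirror-and-fallback loop. `Attempt` is an injected download
// function (hypothetical), so the loop can be exercised without a network.
type Attempt<T> = (url: string) => Promise<T>;

async function fetchWithFallback<T>(
  urls: readonly string[],
  attempt: Attempt<T>,
  onWarn: (url: string, err: unknown) => void = () => {},
): Promise<T> {
  if (urls.length === 0) throw new Error("no mirror URLs configured");
  let lastError: unknown;
  let remaining = urls.length;
  for (const url of urls) {
    remaining -= 1;
    try {
      return await attempt(url);
    } catch (err) {
      lastError = err;
      // Failures of earlier mirrors are only reported as warnings.
      if (remaining > 0) onWarn(url, err);
    }
  }
  // Every URL failed: surface the last mirror's error to the caller.
  throw lastError;
}

// Builds the ordered list from the primary base and an optional fallback base.
function mirrorUrls(filename: string, primary: string, fallback?: string): string[] {
  const bases = fallback ? [primary, fallback] : [primary];
  return bases.map((base) => `${base.replace(/\/+$/, "")}/${filename}`);
}
```

Passing the download step in as a callback keeps the retry policy separate from the HTTP layer, which also makes the "only the last failure is surfaced" rule easy to check in isolation.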
 ---
-## Publishing workflow
-1. **Produce the zips** from the local sherpa-onnx staging tree:
-   ```sh
-   pnpm --filter mobile produce:neural-zips
-   ```
-   Writes to `packages/mobile/dist/neural-assets/`. The script reads the locally-staged sherpa-onnx upstream artifacts (Piper voice ONNX files + Kokoro int8 v0.19 bundle + shared espeak-ng-data) and emits the seven zips with the correct internal layouts.
-2. **Upload to Hugging Face** — the user-facing primary mirror lives at:
-   <https://huggingface.co/Pixel-Labs/threadcast-neural-models/tree/main/mobile-android/v1>
-   Drop all seven zips into that tree. HF preserves filenames as-is. No build steps, no metadata munging — they're just static assets behind a CDN.
-3. **(Optional) Mirror to a fallback host.** Same seven filenames at any HTTPS endpoint. Common picks:
-   - Sibling HF repo (`Pixel-Labs/threadcast-neural-models-mirror/v1/...`)
-   - GitHub Release with `gh release upload v1 dist/neural-assets/*.zip`
-   - Private CDN (R2, S3, etc.)
-   Then ship the next app build with `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` set to the mirror's base URL.
 ---
-## Per-engine download cost (user-facing)
-| User intent | Files pulled | Network |
-|---|---|---|
-| Install **first** Piper voice (e.g. Amy) | `threadcast-piper-shared-v1.zip` + `threadcast-piper-en_US-amy-medium-v1.zip` | ~74 MB |
-| Install **another** Piper voice (e.g. Lessac) | `threadcast-piper-en_US-lessac-medium-v1.zip` | ~63 MB |
-| Install Local AI Studio (Kokoro) | `threadcast-kokoro-int8-en-v1.zip` | ~145 MB |
-| Install both engines, all 5 Piper voices + Kokoro | every zip in `v1/` | ~470 MB |
-The whole-bundle worst case is comparable to Spotify's "download an album for offline" workflow. Most users will probably pick one Piper voice OR Kokoro and stop there.
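The totals in the table follow from the approximate zip sizes in the layout listing. A small sketch of that arithmetic, assuming a hypothetical `downloadMb` helper and treating the listed sizes as exact:

```typescript
// Approximate zip sizes in MB, copied from the layout listing.
const SHARED_MB = 11;
const PIPER_VOICE_MB = 63;
const KOKORO_MB = 145;

// Hypothetical helper: total download for a set of newly selected assets.
// The shared espeak zip is only needed once, with the first Piper voice;
// the Kokoro zip carries its own espeak copy and needs nothing else.
function downloadMb(newPiperVoices: number, sharedInstalled: boolean, kokoro: boolean): number {
  const shared = newPiperVoices > 0 && !sharedInstalled ? SHARED_MB : 0;
  return shared + newPiperVoices * PIPER_VOICE_MB + (kokoro ? KOKORO_MB : 0);
}

const firstVoice = downloadMb(1, false, false); // 11 + 63 = 74
const secondVoice = downloadMb(1, true, false); // 63
const kokoroOnly = downloadMb(0, false, true); // 145
const everything = downloadMb(5, false, true); // 11 + 5 * 63 + 145 = 471
```

The full-bundle figure comes out at 471 MB under these rounded sizes, which matches the table's ~470 MB.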
 ---
-## License
-Per-project licenses retained from upstream — see the [parent README](../README.md#license) for the consolidated summary.

 ---
+license: mit
+language: en
+library_name: onnx
+tags:
+  - text-to-speech
+  - tts
+  - kokoro
+  - piper
+  - kittentts
+  - vits
+  - styletts2
+  - onnx
+  - sherpa-onnx
+  - on-device
+  - threadcast
+pipeline_tag: text-to-speech
 ---
+<p align="center">
+  <a href="https://threadcast.app">
+    <img src="assets/logo.png" alt="ThreadCast" width="160" height="160" />
+  </a>
+</p>
+<h1 align="center">ThreadCast — Neural Models Mirror</h1>
+<p align="center">
+  <em>Threads, now a podcast.</em><br/>
+  <a href="https://threadcast.app">threadcast.app</a> · <a href="https://pixellabs.ventures">pixellabs.ventures</a>
+</p>
+---
+Self-hosted mirror of the on-device neural TTS models used by **[ThreadCast](https://threadcast.app)** across both shipping platforms — the Chrome extension and the Android app. Three engine families on Android (Piper VITS, KittenTTS-nano VITS, Kokoro StyleTTS2), two on the extension, one source of truth.
+This repository exists so each platform can ship a stable, version-pinned set of model weights without depending on the availability or rate-limits of upstream Hugging Face repos at runtime.
+> **Note:** if you're a ThreadCast user, you don't need anything here — the extension and the Android app each download (or bundle) what they need automatically. This page is for transparency, contributors, and forks.
 ---
+## Repository layout
 ```
+threadcast-neural-models/
+├── extension/                          ← Chrome extension — HF transformers.js packaging
+│   ├── neural-28m/                     Piper VITS — 5 voices, raw HF format
+│   └── neural-82m/                     Kokoro StyleTTS2 — 1 model + 11 voice embeddings
+│
+└── mobile-android/                     ← Android app — production zips fetched at runtime
+    └── v1/                             8 zips: 1 shared espeak + 5 per-voice Piper
+                                        + 1 KittenTTS-nano ("Local AI Plus")
+                                        + 1 Kokoro ("Local AI Studio")
 ```
+| Subtree | Format | Consumed by | Sub-README |
+|---|---|---|---|
+| `extension/` | Raw HF (per-file `.onnx`, `.bin`, `tokenizer.json`) | Chrome extension via `@huggingface/transformers` + `@realtimex/piper-tts-web` | [extension/README.md](extension/README.md) |
+| `mobile-android/` | ZIP archives, sherpa-onnx packaging | Android app at runtime via [AssetInstaller.kt](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/mobile/modules/threadcast-neural/android/src/main/java/app/threadcast/neural/AssetInstaller.kt) — first-launch download with cancel/delete | [mobile-android/README.md](mobile-android/README.md) |
+The two subtrees mirror each other on purpose: they share the Piper VITS and Kokoro StyleTTS2 engine families and the `neural-28m` / `neural-82m` parameter-count naming, and differ only in how the models are packaged for each platform's runtime. KittenTTS-nano (`neural-15m`) ships on Android only.
 ---
+## Engines at a glance
+| Engine | Architecture | Params | Per-voice cost | Quality tier |
+|---|---|---|---|---|
+| **`neural-28m`** | Piper VITS | ~28 M | One ONNX file per voice (~63 MB) | Standard: fast and CPU-friendly, running in real time on a laptop in single-threaded WASM. Surfaced on Android as **Local AI Lite**, on the extension as **AI Neural CPU**. |
+| **`neural-15m`** | KittenTTS-nano VITS | ~15 M | Single fp16 model + 8 speaker embeddings (one ~26 MB file serves all) | Sweet spot: 8 voices with style-vector switching in one ~26 MB file, less than a single ~63 MB Piper voice. Android-only, surfaced as **Local AI Plus**. |
+| **`neural-82m`** | Kokoro StyleTTS2 | ~82 M | Single model + 256-dim style vectors per voice (one ~325 MB file serves all) | Premium: more natural prosody. GPU-accelerated on Chrome (WebGPU); CPU-only on Android (perf-gated). Surfaced on Android as **Local AI Studio**, on the extension as **AI Neural GPU**. |
 ---
+## License
+This repository **mirrors** upstream models for distribution stability. Each upstream project retains its own license:
+- **Kokoro-82M:** Apache-2.0 ([upstream model card](https://huggingface.co/hexgrad/Kokoro-82M))
+- **KittenTTS-nano (v0.1):** Apache-2.0 ([upstream model card](https://huggingface.co/KittenML/kitten-tts-nano-0.1))
+- **Piper voices:** MIT, with individual voice attributions in each `.onnx.json`
+- **transformers.js, onnxruntime-web, onnxruntime-android:** Apache-2.0
+- **sherpa-onnx:** Apache-2.0
+The mirror layout, READMEs, and any custom additions in this repository are licensed under MIT by [Pixel Labs](https://pixellabs.ventures).
 ---
+## Links
+- 🌐 ThreadCast: [threadcast.app](https://threadcast.app)
+- 🧑‍💻 Pixel Labs: [pixellabs.ventures](https://pixellabs.ventures)
+- 🐦 Issues / questions: use the [ThreadCast support page](https://threadcast.app/support)