Pixel-Labs committed on
Commit 378619e · verified · 1 Parent(s): 8b412bd

Upload README.md

Files changed (1)
  1. mobile-android/README.md +64 -107
mobile-android/README.md CHANGED
@@ -1,137 +1,94 @@
- # ThreadCast: Android Production Zips
-
- Distributed mirror for the Android app's neural TTS assets. The seven zips here are downloaded by the app at runtime: first install of a neural engine pulls only what's needed (~74 MB for one Piper voice + shared data, or ~145 MB for the full Kokoro bundle), and the user can manage each model individually from inside the app.
-
- > Sibling: **`../extension/`** holds the Chrome extension's neural models in raw HF format. Both subtrees share the same engine families (Piper VITS, Kokoro StyleTTS2); only the on-disk packaging differs.
-
  ---
-
- ## Layout
-
- ```
- mobile-android/
- └── v1/
-     ├── threadcast-piper-shared-v1.zip (~11 MB) - espeak phonemizer data, downloaded once
-     ├── threadcast-piper-en_US-amy-medium-v1.zip (~63 MB) - Amy voice
-     ├── threadcast-piper-en_US-lessac-medium-v1.zip (~63 MB) - Lessac voice
-     ├── threadcast-piper-en_US-ryan-medium-v1.zip (~63 MB) - Ryan voice
-     ├── threadcast-piper-en_US-hfc_female-medium-v1.zip (~63 MB) - HFC Female voice
-     ├── threadcast-piper-en_US-hfc_male-medium-v1.zip (~63 MB) - HFC Male voice
-     └── threadcast-kokoro-int8-en-v1.zip (~145 MB) - Kokoro int8 v0.19 (all 11 voices)
- ```
-
- **Versioning:** the `v1/` segment is part of the URL the runtime requests. Bumping to `v2/` lets future format changes ship without breaking older app builds: old apps keep pulling `v1/`, new apps pull `v2/`.
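
As a rough illustration of the versioning scheme described above, the sketch below joins a base URL, a version segment, and a zip filename. The repository base is the one documented in this README; `assetUrl` and its parameters are hypothetical names chosen for this example, since the app's real URL-building code is not shown here.

```typescript
// Hypothetical helper (not taken from the app): builds the URL of one zip
// under a versioned path segment such as "v1".
const REPO_BASE =
  "https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android";

function assetUrl(version: string, fileName: string, base: string = REPO_BASE): string {
  // Trim trailing slashes so the joined URL never contains a doubled "/".
  const root = base.replace(/\/+$/, "");
  return `${root}/${version}/${fileName}`;
}

// An older build keeps requesting the v1/ path regardless of what v2/ contains.
const sharedV1Url = assetUrl("v1", "threadcast-piper-shared-v1.zip");
```

Because the version lives in the path rather than in a query parameter, each version's files can sit side by side in the same repository tree.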
-
  ---

- ## What's inside each zip
-
- The native [`AssetInstaller`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/mobile/modules/threadcast-neural/android/src/main/java/app/threadcast/neural/AssetInstaller.kt) extracts each zip directly under `filesDir/sherpa-piper/` (Piper) or `filesDir/sherpa-kokoro/` (Kokoro), so the zip's internal layout = the on-device layout. No re-rooting, no per-file rules.
-
- ### Piper: shared espeak data
-
- ```
- threadcast-piper-shared-v1.zip
- └── espeak-ng-data/
-     ├── phontab
-     ├── phonindex
-     ├── phondata
-     ├── intonations
-     ├── lang/
-     ├── voices/
-     └── … (full espeak-ng tree)
- ```

- Downloaded once on the user's first Piper voice install. Skipped on every subsequent voice install.

- ### Piper: one zip per voice
-
- ```
- threadcast-piper-en_US-amy-medium-v1.zip
- └── en_US-amy-medium/
-     ├── en_US-amy-medium.onnx (~63 MB)
-     └── tokens.txt
- ```

- Five zips total, one per voice. Users only download the voices they want; selecting Amy doesn't pull Ryan.
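
The "shared data once, then one zip per voice" rule can be sketched as a small planning function. This is only an illustration under stated assumptions: `piperZipsToFetch`, `PiperVoice`, and the `sharedInstalled` flag are hypothetical names, and the real `AssetInstaller` may track installed state differently. The filenames follow the pattern in the layout listing.

```typescript
// Hypothetical planner: which zips a single Piper voice install has to download.
const SHARED_ZIP = "threadcast-piper-shared-v1.zip";

// The five voices named in the layout listing.
type PiperVoice = "amy" | "lessac" | "ryan" | "hfc_female" | "hfc_male";

function piperZipsToFetch(voice: PiperVoice, sharedInstalled: boolean): string[] {
  const voiceZip = `threadcast-piper-en_US-${voice}-medium-v1.zip`;
  // The shared espeak data is needed only until it has been installed once.
  return sharedInstalled ? [voiceZip] : [SHARED_ZIP, voiceZip];
}

const firstInstall = piperZipsToFetch("amy", false); // shared data + Amy
const laterInstall = piperZipsToFetch("lessac", true); // Lessac only
```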

- ### Kokoro: single bundle, all voices

- ```
- threadcast-kokoro-int8-en-v1.zip
- ├── espeak-ng-data/ (separate copy from Piper's, different on-disk root)
- │   └── …
- ├── model.int8.onnx (~135 MB)
- ├── voices.bin (~5.7 MB, concatenated speaker embeddings, all 11)
- └── tokens.txt
- ```

- One download serves every Kokoro voice; switching speakers is a free style-vector lookup at synth time.

  ---

- ## How the runtime fetches these
-
- **Mirror-and-fallback**: the same pattern the extension uses for its `Pixel-Labs/threadcast-neural-models/{hf-cpu-mirror,hf-gpu-mirror}/` paths (see [`extension/src/offscreen/mirror-fetch.ts`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/extension/src/offscreen/mirror-fetch.ts)). Each download is an ordered list of URLs; the installer tries each on failure and only surfaces an error if every URL is unreachable.

  ```
- urls[0] = PRIMARY_BASE = https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1
- urls[1] = FALLBACK_BASE = (configurable: sibling HF repo, GitHub Release, private CDN, …)
  ```

- Both bases need to serve the **same filenames** (the seven listed above). The fallback host is configured at app build time:
-
- | Env var (set at build time) | Default | Purpose |
- |---|---|---|
- | `EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL` | `https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1` | Primary mirror |
- | `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` | unset | Optional fallback (omit for primary-only) |

- The native installer:
- - Tries `urls[0]` first. On HTTP 4xx/5xx or transport error, logs a warning and tries `urls[1]`.
- - Only the LAST URL's failure surfaces to the UI as a download error; transient mirror outages are silent.
- - Cancellation aborts immediately at any URL boundary.
- - Verifies post-extract that every required file exists with non-zero size; rejects malformed zips even if the HTTP layer succeeded.
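
The ordered-URL behaviour in the list above can be sketched as follows. This is a minimal illustration, not the code in `AssetInstaller.kt` or `mirror-fetch.ts`: `candidateUrls`, `fetchWithFallback`, `Fetcher`, and the `isCancelled` / `onWarn` callbacks are hypothetical names, and the fetcher is injected so the logic does not depend on any particular HTTP client.

```typescript
// Minimal response shape; a standard fetch Response also satisfies it.
interface FetchResult {
  ok: boolean;
  status: number;
}
type Fetcher = (url: string) => Promise<FetchResult>;

// Ordered candidate list: primary first, optional fallback second.
function candidateUrls(fileName: string, primaryBase: string, fallbackBase?: string): string[] {
  const bases = fallbackBase ? [primaryBase, fallbackBase] : [primaryBase];
  return bases.map((base) => `${base.replace(/\/+$/, "")}/${fileName}`);
}

// Tries each URL in order. Failures before the last URL are reported through
// onWarn only; the error from the last URL is the one that gets thrown.
async function fetchWithFallback(
  urls: string[],
  fetcher: Fetcher,
  isCancelled: () => boolean = () => false,
  onWarn: (message: string) => void = () => {},
): Promise<{ url: string; response: FetchResult }> {
  if (urls.length === 0) throw new Error("no download URLs configured");
  let lastError = new Error("no attempt made");
  for (let i = 0; i < urls.length; i++) {
    // Cancellation is checked at every URL boundary.
    if (isCancelled()) throw new Error("download cancelled");
    const url = urls[i];
    try {
      const response = await fetcher(url);
      if (response.ok) return { url, response };
      lastError = new Error(`HTTP ${response.status} from ${url}`);
    } catch (err) {
      // Transport-level failure (DNS, TLS, connection reset, ...).
      lastError = err instanceof Error ? err : new Error(String(err));
    }
    if (i < urls.length - 1) onWarn(`${lastError.message}; trying next mirror`);
  }
  throw lastError;
}
```

Injecting the fetcher keeps the retry policy testable with a fake that fails for the primary host and succeeds for the fallback. The post-extract file check from the last bullet is a separate step and is not covered by this sketch.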

  ---

- ## Publishing workflow
-
- 1. **Produce the zips** from the local sherpa-onnx staging tree:
-
-    ```sh
-    pnpm --filter mobile produce:neural-zips
-    ```

-    Writes to `packages/mobile/dist/neural-assets/`. The script reads the locally-staged sherpa-onnx upstream artifacts (Piper voice ONNX files + Kokoro int8 v0.19 bundle + shared espeak-ng-data) and emits the seven zips with the correct internal layouts.
-
- 2. **Upload to Hugging Face.** The user-facing primary mirror lives at:
-
-    <https://huggingface.co/Pixel-Labs/threadcast-neural-models/tree/main/mobile-android/v1>
-
-    Drop all seven zips into that tree. HF preserves filenames as-is. No build steps, no metadata munging; they're just static assets behind a CDN.
-
- 3. **(Optional) Mirror to a fallback host.** Same seven filenames at any HTTPS endpoint. Common picks:
-    - Sibling HF repo (`Pixel-Labs/threadcast-neural-models-mirror/v1/...`)
-    - GitHub Release with `gh release upload v1 dist/neural-assets/*.zip`
-    - Private CDN (R2, S3, etc.)
-
-    Then ship the next app build with `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` set to the mirror's base URL.

  ---

- ## Per-engine download cost (user-facing)

- | User intent | Files pulled | Network |
- |---|---|---|
- | Install **first** Piper voice (e.g. Amy) | `threadcast-piper-shared-v1.zip` + `threadcast-piper-en_US-amy-medium-v1.zip` | ~74 MB |
- | Install **another** Piper voice (e.g. Lessac) | `threadcast-piper-en_US-lessac-medium-v1.zip` | ~63 MB |
- | Install Local AI Studio (Kokoro) | `threadcast-kokoro-int8-en-v1.zip` | ~145 MB |
- | Install both engines, all 5 Piper voices + Kokoro | every zip in `v1/` | ~470 MB |

- The whole-bundle worst case is comparable to Spotify's "download an album for offline" workflow. Most users will probably pick one Piper voice OR Kokoro and stop there.
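
The figures in the table follow directly from the approximate zip sizes in the layout listing. The sketch below reproduces the arithmetic; `SIZE_MB` and `downloadCostMb` are hypothetical names for this example, and the function assumes nothing has been installed yet.

```typescript
// Approximate zip sizes in MB, copied from the layout listing in this README.
const SIZE_MB = { piperShared: 11, piperVoice: 63, kokoro: 145 };

// Estimated download for a fresh install of the given selection.
function downloadCostMb(piperVoices: number, kokoro: boolean): number {
  // Shared espeak data is counted once, and only if at least one Piper voice is chosen.
  const piper = piperVoices > 0 ? SIZE_MB.piperShared + piperVoices * SIZE_MB.piperVoice : 0;
  return piper + (kokoro ? SIZE_MB.kokoro : 0);
}

const firstPiperVoice = downloadCostMb(1, false); // 11 + 63 = 74
const kokoroOnly = downloadCostMb(0, true); // 145
const everything = downloadCostMb(5, true); // 11 + 5 * 63 + 145 = 471, shown as ~470 in the table
```

An additional Piper voice on a device that already has the shared data costs only the voice zip itself, about 63 MB.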

  ---

- ## License

- Per-project licenses retained from upstream; see the [parent README](../README.md#license) for the consolidated summary.
  ---
+ license: mit
+ language: en
+ library_name: onnx
+ tags:
+ - text-to-speech
+ - tts
+ - kokoro
+ - piper
+ - kittentts
+ - vits
+ - styletts2
+ - onnx
+ - sherpa-onnx
+ - on-device
+ - threadcast
+ pipeline_tag: text-to-speech
  ---

+ <p align="center">
+ <a href="https://threadcast.app">
+ <img src="assets/logo.png" alt="ThreadCast" width="160" height="160" />
+ </a>
+ </p>

+ <h1 align="center">ThreadCast: Neural Models Mirror</h1>

+ <p align="center">
+ <em>Threads, now a podcast.</em><br/>
+ <a href="https://threadcast.app">threadcast.app</a> · <a href="https://pixellabs.ventures">pixellabs.ventures</a>
+ </p>

+ ---

+ Self-hosted mirror of the on-device neural TTS models used by **[ThreadCast](https://threadcast.app)** across both shipping platforms: the Chrome extension and the Android app. Three engine families on Android (Piper VITS, KittenTTS-nano VITS, Kokoro StyleTTS2), two on the extension, one source of truth.

+ This repository exists so each platform can ship a stable, version-pinned set of model weights without depending on the availability or rate-limits of upstream Hugging Face repos at runtime.

+ > **Note:** if you're a ThreadCast user, you don't need anything here; the extension and the Android app each download (or bundle) what they need automatically. This page is for transparency, contributors, and forks.

  ---

+ ## Repository layout

  ```
+ threadcast-neural-models/
+ ├── extension/              ← Chrome extension: HF transformers.js packaging
+ │   ├── neural-28m/         Piper VITS: 5 voices, raw HF format
+ │   └── neural-82m/         Kokoro StyleTTS2: 1 model + 11 voice embeddings
+ │
+ └── mobile-android/         ← Android app: production zips fetched at runtime
+     └── v1/                 8 zips: 1 shared espeak + 5 per-voice Piper
+                             + 1 KittenTTS-nano ("Local AI Plus")
+                             + 1 Kokoro ("Local AI Studio")
  ```

+ | Subtree | Format | Consumed by | Sub-README |
+ |---|---|---|---|
+ | `extension/` | Raw HF (per-file `.onnx`, `.bin`, `tokenizer.json`) | Chrome extension via `@huggingface/transformers` + `@realtimex/piper-tts-web` | [extension/README.md](extension/README.md) |
+ | `mobile-android/` | ZIP archives, sherpa-onnx packaging | Android app at runtime via [AssetInstaller.kt](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/mobile/modules/threadcast-neural/android/src/main/java/app/threadcast/neural/AssetInstaller.kt) (first-launch download with cancel/delete) | [mobile-android/README.md](mobile-android/README.md) |

+ The two subtrees parallel each other on purpose: same engine families (Piper VITS, Kokoro StyleTTS2), same `neural-28m` / `neural-82m` parameter-count naming, just packaged for each platform's runtime.

  ---

+ ## Engines at a glance

+ | Engine | Architecture | Params | Per-voice cost | Quality tier |
+ |---|---|---|---|---|
+ | **`neural-28m`** | Piper VITS | ~28 M | One ONNX file per voice (~63 MB) | Standard: fast, CPU-friendly, single-thread WASM real-time on a laptop. Surfaced on Android as **Local AI Lite**, on the extension as **AI Neural CPU**. |
+ | **`neural-15m`** | KittenTTS-nano VITS | ~15 M | Single fp16 model + 8 speaker embeddings (one ~26 MB file serves all) | Sweet spot: 8 voices with style-vector switching at a fraction of the storage cost. Android-only, surfaced as **Local AI Plus**. |
+ | **`neural-82m`** | Kokoro StyleTTS2 | ~82 M | Single model + 256-dim style vectors per voice (one ~325 MB file serves all) | Premium: more natural prosody, GPU-accelerated on Chrome (WebGPU); CPU-only on Android (perf-gated). Surfaced on Android as **Local AI Studio**, on the extension as **AI Neural GPU**. |

  ---

+ ## License
+
+ This repository **mirrors** upstream models for distribution stability. Each upstream project retains its own license:

+ - **Kokoro-82M:** Apache-2.0 ([upstream model card](https://huggingface.co/hexgrad/Kokoro-82M))
+ - **KittenTTS-nano (v0.1):** Apache-2.0 ([upstream model card](https://huggingface.co/KittenML/kitten-tts-nano-0.1))
+ - **Piper voices:** MIT, with individual voice attributions in each `.onnx.json`
+ - **transformers.js, onnxruntime-web, onnxruntime-android:** Apache-2.0
+ - **sherpa-onnx:** Apache-2.0

+ The mirror layout, READMEs, and any custom additions in this repository are licensed under MIT by [Pixel Labs](https://pixellabs.ventures).

  ---

+ ## Links

+ - 🌐 ThreadCast: [threadcast.app](https://threadcast.app)
+ - 🧑‍💻 Pixel Labs: [pixellabs.ventures](https://pixellabs.ventures)
+ - 🐦 Issues / questions: open an issue on the [ThreadCast extension repo](https://threadcast.app/support)