# ThreadCast: Android Production Zips

Distributed mirror for the Android app's neural TTS assets. The eight zips here are downloaded by the app at runtime: the first install of a neural engine pulls only what's needed (~74 MB for one Piper voice + shared data, ~26 MB for the Plus bundle, or ~145 MB for the full Kokoro bundle), and the user can manage each model individually from inside the app.

> Sibling: **`../extension/`** holds the Chrome extension's neural models in raw HF format. Both subtrees share the same engine families (Piper VITS, KittenTTS-nano, Kokoro StyleTTS2); only the on-disk packaging differs.

---

## Layout

```
mobile-android/
└── v1/
    ├── threadcast-piper-shared-v1.zip                    (~11 MB) - espeak phonemizer data, downloaded once
    ├── threadcast-piper-en_US-amy-medium-v1.zip          (~63 MB) - Amy voice
    ├── threadcast-piper-en_US-lessac-medium-v1.zip       (~63 MB) - Lessac voice
    ├── threadcast-piper-en_US-ryan-medium-v1.zip         (~63 MB) - Ryan voice
    ├── threadcast-piper-en_US-hfc_female-medium-v1.zip   (~63 MB) - HFC Female voice
    ├── threadcast-piper-en_US-hfc_male-medium-v1.zip     (~63 MB) - HFC Male voice
    ├── threadcast-kitten-nano-en-v1.zip                  (~26 MB) - KittenTTS nano v0.1 fp16 (all 8 voices; "Local AI Plus")
    └── threadcast-kokoro-int8-en-v1.zip                  (~145 MB) - Kokoro int8 v0.19 (all 11 voices; "Local AI Studio")
```

**Versioning:** the `v1/` segment is part of the URL the runtime requests. Bumping to `v2/` lets future format changes ship without breaking older app builds: old apps keep pulling `v1/`, new apps pull `v2/`.

---

## What's inside each zip

The native [`AssetInstaller`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/mobile/modules/threadcast-neural/android/src/main/java/app/threadcast/neural/AssetInstaller.kt) extracts each zip directly under `filesDir/sherpa-piper/` (Piper) or `filesDir/sherpa-kokoro/` (Kokoro), so the zip's internal layout = the on-device layout. No re-rooting, no per-file rules.

### Piper: shared espeak data

```
threadcast-piper-shared-v1.zip
└── espeak-ng-data/
    ├── phontab
    ├── phonindex
    ├── phondata
    ├── intonations
    ├── lang/
    ├── voices/
    └── … (full espeak-ng tree)
```

Downloaded once on the user's first Piper voice install. Skipped on every subsequent voice install.

### Piper: one zip per voice

```
threadcast-piper-en_US-amy-medium-v1.zip
└── en_US-amy-medium/
    ├── en_US-amy-medium.onnx        (~63 MB)
    └── tokens.txt
```

Five zips total, one per voice. Users only download the voices they want; selecting Amy doesn't pull Ryan.

### Kitten (Local AI Plus): single bundle, all voices

```
threadcast-kitten-nano-en-v1.zip
├── espeak-ng-data/                  (separate copy from Piper's / Kokoro's; different on-disk root)
│   └── …
├── model.fp16.onnx                  (~24 MB)
├── voices.bin                       (~30 KB, 8 speaker embeddings)
└── tokens.txt
```

One download serves all 8 Plus voices, using the same style-vector-lookup pattern as Kokoro. The 8 speakers are baked into `voices.bin` in the order documented in `packages/mobile/modules/threadcast-neural/index.ts::KITTEN_VOICES`. **Never write the upstream engine codename in user-facing surfaces**: the Android UI labels this engine as "Local AI Plus" everywhere.

### Kokoro (Local AI Studio): single bundle, all voices

```
threadcast-kokoro-int8-en-v1.zip
├── espeak-ng-data/                  (separate copy from Piper's; different on-disk root)
│   └── …
├── model.int8.onnx                  (~135 MB)
├── voices.bin                       (~5.7 MB, concatenated speaker embeddings, all 11)
└── tokens.txt
```

One download serves every Kokoro voice; switching speakers needs no further download, only a style-vector lookup at synth time.
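
To make the style-vector lookup concrete, here is a minimal sketch. It assumes `voices.bin` is nothing more than equal-size float32 blocks concatenated in speaker order; that layout is an assumption for illustration, and the real sherpa-onnx file format may carry headers or differ in other ways. The function name `speakerEmbedding` is hypothetical.

```typescript
// Hypothetical helper: returns a view over one speaker's embedding.
// ASSUMPTION: the buffer holds `speakerCount` equal-size float32 blocks,
// back to back, with no header. The real file format may differ.
function speakerEmbedding(
  voices: ArrayBuffer,
  speakerCount: number,
  index: number,
): Float32Array {
  if (!Number.isInteger(index) || index < 0 || index >= speakerCount) {
    throw new RangeError(`speaker index ${index} out of range`);
  }
  const blockBytes = voices.byteLength / speakerCount;
  if (!Number.isInteger(blockBytes) || blockBytes % 4 !== 0) {
    throw new Error("buffer does not split into equal float32 blocks");
  }
  // A typed-array view shares the underlying buffer, so no bytes are copied.
  return new Float32Array(voices, index * blockBytes, blockBytes / 4);
}
```

Because the result is a view into the already-loaded buffer, picking a different speaker costs an offset computation rather than a file read or a download.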

---

## How the runtime fetches these

**Mirror-and-fallback**: the same pattern the extension uses for its `Pixel-Labs/threadcast-neural-models/{hf-cpu-mirror,hf-gpu-mirror}/` paths (see [`extension/src/offscreen/mirror-fetch.ts`](https://github.com/Pixel-Labs/Reddit-Reader/blob/main/packages/extension/src/offscreen/mirror-fetch.ts)). Each download is an ordered list of URLs; the installer tries each on failure and only surfaces an error if every URL is unreachable.

```
urls[0] = PRIMARY_BASE   = https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1
urls[1] = FALLBACK_BASE  = (configurable: sibling HF repo, GitHub Release, private CDN, …)
```

Both bases need to serve the **same filenames** (the eight listed above). The fallback host is configured at app build time:

| Env var (set at build time) | Default | Purpose |
|---|---|---|
| `EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL`     | `https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1` | Primary mirror |
| `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` | unset | Optional fallback (omit for primary-only) |
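
As an illustration of how the two variables could turn into the ordered URL list, here is a small sketch. The helper name `candidateUrls` and the `env` parameter are hypothetical; the app's actual wiring is not shown in this README and may differ.

```typescript
// Default primary base, copied from the table above.
const DEFAULT_BASE =
  "https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1";

// Hypothetical helper: ordered candidate URLs for one zip.
// `env` stands in for the build-time environment.
function candidateUrls(
  filename: string,
  env: Record<string, string | undefined>,
): string[] {
  const bases = [
    env["EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL"] ?? DEFAULT_BASE,
    env["EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL"],
  ];
  return bases
    // Drop the fallback when it is unset or empty (primary-only mode).
    .filter((b): b is string => typeof b === "string" && b.length > 0)
    // Trim trailing slashes so the join never produces "//".
    .map((b) => `${b.replace(/\/+$/, "")}/${filename}`);
}
```

With no variables set, the list has one entry pointing at the Hugging Face default; setting the fallback variable appends a second entry with the same filename.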

The native installer:
- Tries `urls[0]` first. On HTTP 4xx/5xx or transport error, logs a warning and tries `urls[1]`.
- Only the LAST URL's failure surfaces to the UI as a download error; transient mirror outages stay invisible to the user.
- Cancellation aborts immediately at any URL boundary.
- Verifies post-extract that every required file exists with non-zero size; rejects malformed zips even if the HTTP layer succeeded.
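
The behaviour in the list above can be sketched as follows. This is not the Kotlin `AssetInstaller` itself: the names `fetchWithFallback`, `missingOrEmpty`, and the injected `Fetcher` type are hypothetical, and the HTTP call is passed in so the sketch does not depend on any particular client.

```typescript
// Injected transport: resolves with a status, or rejects on transport error.
type Fetcher = (url: string) => Promise<{ ok: boolean; status: number }>;

// Tries each URL in order and returns the first one that succeeds.
// Only the last URL's error is thrown; earlier failures go to `warn`.
async function fetchWithFallback(
  urls: string[],
  fetcher: Fetcher,
  warn: (msg: string) => void = () => {},
  isCancelled: () => boolean = () => false,
): Promise<string> {
  let lastError = new Error("no URLs configured");
  let remaining = urls.length;
  for (const url of urls) {
    remaining -= 1;
    // Cancellation is checked at every URL boundary.
    if (isCancelled()) throw new Error("cancelled");
    try {
      const res = await fetcher(url);
      if (res.ok) return url;
      lastError = new Error(`HTTP ${res.status} from ${url}`);
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
    }
    if (remaining > 0) warn(`${lastError.message}; trying next mirror`);
  }
  throw lastError;
}

// Post-extract check: lists required files that are absent or zero-length.
function missingOrEmpty(required: string[], sizes: Map<string, number>): string[] {
  return required.filter((path) => (sizes.get(path) ?? 0) <= 0);
}
```

An empty result from `missingOrEmpty` means the extracted tree passes; any entry in the result would be grounds to reject the zip even though the download itself succeeded.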

---

## Publishing workflow

1. **Produce the zips** from the local sherpa-onnx staging tree:

   ```sh
   pnpm --filter mobile produce:neural-zips
   ```

   Writes to `packages/mobile/dist/neural-assets/`. The script reads the locally-staged sherpa-onnx upstream artifacts (Piper voice ONNX files + KittenTTS-nano v0.1 fp16 bundle + Kokoro int8 v0.19 bundle + shared espeak-ng-data) and emits the eight zips with the correct internal layouts.

2. **Upload to Hugging Face.** The user-facing primary mirror lives at:

   <https://huggingface.co/Pixel-Labs/threadcast-neural-models/tree/main/mobile-android/v1>

   Drop all eight zips into that tree. HF preserves filenames as-is. No build steps, no metadata munging: they're just static assets behind a CDN.

3. **(Optional) Mirror to a fallback host.** Same eight filenames at any HTTPS endpoint. Common picks:
   - Sibling HF repo (`Pixel-Labs/threadcast-neural-models-mirror/v1/...`)
   - GitHub Release with `gh release upload v1 dist/neural-assets/*.zip`
   - Private CDN (R2, S3, etc.)

   Then ship the next app build with `EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL` set to the mirror's base URL.

---

## Per-engine download cost (user-facing)

| User intent | Files pulled | Network |
|---|---|---|
| Install **first** Local AI Lite voice (e.g. Amy) | `threadcast-piper-shared-v1.zip` + `threadcast-piper-en_US-amy-medium-v1.zip` | ~74 MB |
| Install **another** Local AI Lite voice (e.g. Lessac) | `threadcast-piper-en_US-lessac-medium-v1.zip` | ~63 MB |
| Install Local AI Plus | `threadcast-kitten-nano-en-v1.zip` | ~26 MB |
| Install Local AI Studio | `threadcast-kokoro-int8-en-v1.zip` | ~145 MB |
| Install every engine, all 5 Lite voices + Plus + Studio | every zip in `v1/` | ~496 MB |

The whole-bundle worst case (~496 MB) is comparable to Spotify's "download an album for offline" workflow. Most users will pick one tier and stop there; Plus is the sweet spot at 26 MB for an 8-voice multi-speaker model.
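
The arithmetic behind the table can be written down as a short sketch. The per-zip sizes are the rounded figures from the Layout section; the `Selection` type and `estimateDownloadMb` function are hypothetical names used only for illustration.

```typescript
// Approximate zip sizes in MB, rounded as in the Layout section.
const SHARED_MB = 11;
const PIPER_VOICE_MB = 63;
const KITTEN_MB = 26;
const KOKORO_MB = 145;

interface Selection {
  piperVoices: number;      // Lite voices being installed now
  sharedInstalled: boolean; // true if the espeak data is already on the device
  plus: boolean;            // Local AI Plus bundle
  studio: boolean;          // Local AI Studio bundle
}

// Hypothetical estimator: the shared zip is counted once, and only when
// at least one Piper voice is requested and it is not already installed.
function estimateDownloadMb(s: Selection): number {
  const shared = s.piperVoices > 0 && !s.sharedInstalled ? SHARED_MB : 0;
  return (
    shared +
    s.piperVoices * PIPER_VOICE_MB +
    (s.plus ? KITTEN_MB : 0) +
    (s.studio ? KOKORO_MB : 0)
  );
}
```

Summing the rounded per-file sizes for every zip gives 497 MB, about 1 MB above the table's ~496 MB; the gap is presumably rounding in the individual figures.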

---

## License

Per-project licenses retained from upstream; see the [parent README](../README.md#license) for the consolidated summary.