
ThreadCast — Android Production Zips

Distributed mirror for the Android app's neural TTS assets. The eight zips here are downloaded by the app at runtime — first install of a neural engine pulls only what's needed (~74 MB for one Piper voice + shared data, ~26 MB for the Plus bundle, or ~145 MB for the full Kokoro bundle), and the user can manage each model individually from inside the app.

Sibling: ../extension/ holds the Chrome extension's neural models in raw HF format. Both subtrees share the same engine families (Piper VITS, KittenTTS-nano, Kokoro StyleTTS2) — only the on-disk packaging differs.


Layout

mobile-android/
└── v1/
    ├── threadcast-piper-shared-v1.zip                    (~11 MB) — espeak phonemizer data, downloaded once
    ├── threadcast-piper-en_US-amy-medium-v1.zip          (~63 MB) — Amy voice
    ├── threadcast-piper-en_US-lessac-medium-v1.zip       (~63 MB) — Lessac voice
    ├── threadcast-piper-en_US-ryan-medium-v1.zip         (~63 MB) — Ryan voice
    ├── threadcast-piper-en_US-hfc_female-medium-v1.zip   (~63 MB) — HFC Female voice
    ├── threadcast-piper-en_US-hfc_male-medium-v1.zip     (~63 MB) — HFC Male voice
    ├── threadcast-kitten-nano-en-v1.zip                  (~26 MB) — KittenTTS nano v0.1 fp16 (all 8 voices; "Local AI Plus")
    └── threadcast-kokoro-int8-en-v1.zip                  (~145 MB) — Kokoro int8 v0.19 (all 11 voices; "Local AI Studio")

Versioning: the v1/ segment is part of the URL the runtime requests. Bumping to v2/ lets future format changes ship without breaking older app builds — old apps keep pulling v1/, new apps pull v2/.
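As a sketch of that contract, a client-side helper might assemble the request URL from the version segment. The base URL matches the primary mirror documented later in this README, but buildAssetUrl and ASSET_VERSION are illustrative names, not the app's actual code:

```typescript
// Illustrative only: how a runtime could build versioned asset URLs.
// The helper name and ASSET_VERSION constant are hypothetical.
const PRIMARY_BASE =
  "https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android";
const ASSET_VERSION = "v1"; // bumped to "v2" when the zip format changes

function buildAssetUrl(zipName: string, version: string = ASSET_VERSION): string {
  return `${PRIMARY_BASE}/${version}/${zipName}`;
}
```

Because the version lives in the path rather than the filename, an old app build pinned to "v1" keeps resolving working URLs even after "v2" assets are published alongside it.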


What's inside each zip

The native AssetInstaller extracts each zip directly under filesDir/sherpa-piper/ (Piper) or filesDir/sherpa-kokoro/ (Kokoro), so each zip's internal layout is exactly the on-device layout. No re-rooting, no per-file rules.

Piper — shared espeak data

threadcast-piper-shared-v1.zip
└── espeak-ng-data/
    ├── phontab
    ├── phonindex
    ├── phondata
    ├── intonations
    ├── lang/
    ├── voices/
    └── … (full espeak-ng tree)

Downloaded once on the user's first Piper voice install. Skipped on every subsequent voice install.

Piper — one zip per voice

threadcast-piper-en_US-amy-medium-v1.zip
└── en_US-amy-medium/
    ├── en_US-amy-medium.onnx        (~63 MB)
    └── tokens.txt

Five zips total — one per voice. Users only download the voices they want; selecting Amy doesn't pull Ryan.
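The per-voice naming convention is mechanical, so an installer can derive the zip filename from the voice id. A minimal sketch (the helper name is hypothetical; the five voice ids are the ones listed in the layout above):

```typescript
// Illustrative helper: derives a voice zip's filename from its Piper voice id.
// PIPER_VOICES mirrors the five ids shown in the v1/ layout.
const PIPER_VOICES = [
  "en_US-amy-medium",
  "en_US-lessac-medium",
  "en_US-ryan-medium",
  "en_US-hfc_female-medium",
  "en_US-hfc_male-medium",
] as const;

function piperZipName(voiceId: string): string {
  return `threadcast-piper-${voiceId}-v1.zip`;
}
```

Installing a voice would then fetch piperZipName(voiceId), plus threadcast-piper-shared-v1.zip if the espeak data isn't on disk yet.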

Kitten (Local AI Plus) — single bundle, all voices

threadcast-kitten-nano-en-v1.zip
├── espeak-ng-data/                  (separate copy from Piper's / Kokoro's — different on-disk root)
│   └── …
├── model.fp16.onnx                  (~24 MB)
├── voices.bin                       (~30 KB — 8 speaker embeddings)
└── tokens.txt

One download serves all 8 Plus voices — same style-vector-lookup pattern as Kokoro. The 8 speakers are baked into voices.bin in the order documented in packages/mobile/modules/threadcast-neural/index.ts::KITTEN_VOICES. Never write the upstream engine codename in user-facing surfaces — the Android UI labels this engine as "Local AI Plus" everywhere.
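The lookup pattern amounts to selecting an embedding by list position. A sketch, with placeholder voice names since the authoritative 8-entry list lives in KITTEN_VOICES:

```typescript
// Sketch of style-vector lookup by list position. The real list is the 8-entry
// KITTEN_VOICES array in packages/mobile/modules/threadcast-neural/index.ts;
// the names below are placeholders, not the shipped ids.
const KITTEN_VOICES_EXAMPLE: readonly string[] = ["voice-a", "voice-b", "voice-c"];

function voiceIndex(voices: readonly string[], name: string): number {
  const i = voices.indexOf(name);
  if (i < 0) throw new Error(`unknown voice: ${name}`);
  return i; // the engine selects the i-th embedding in voices.bin
}
```

This is why the documented order matters: reordering the list would silently remap every user's chosen voice to a different speaker.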

Kokoro (Local AI Studio) — single bundle, all voices

threadcast-kokoro-int8-en-v1.zip
├── espeak-ng-data/                  (separate copy from Piper's — different on-disk root)
│   └── …
├── model.int8.onnx                  (~135 MB)
├── voices.bin                       (~5.7 MB — concatenated speaker embeddings, all 11)
└── tokens.txt

One download serves every Kokoro voice — switching speakers is a free style-vector lookup at synth time.
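Assuming voices.bin is simply N equal-length embeddings concatenated back to back with no header (this README doesn't specify the binary framing beyond "concatenated"), the byte range for voice i can be computed from the file size alone:

```typescript
// Hedged sketch: assumes voices.bin holds numVoices equal-length embeddings
// concatenated with no header. Real per-engine dimensions are not documented here.
function embeddingSlice(
  totalBytes: number,
  numVoices: number,
  index: number,
): { start: number; end: number } {
  if (totalBytes % numVoices !== 0) {
    throw new Error("voices.bin size not divisible by voice count");
  }
  const stride = totalBytes / numVoices;
  return { start: index * stride, end: (index + 1) * stride };
}
```

Under that assumption, switching speakers is just reading a different slice of a file already on disk, which is why it costs no network at all.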


How the runtime fetches these

Mirror-and-fallback — same pattern the extension uses for its Pixel-Labs/threadcast-neural-models/{hf-cpu-mirror,hf-gpu-mirror}/ paths (see extension/src/offscreen/mirror-fetch.ts). Each download is an ordered list of URLs; the installer tries each in order, falling through on failure, and only surfaces an error if every URL is unreachable.

urls[0] = PRIMARY_BASE   = https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1
urls[1] = FALLBACK_BASE  = (configurable — sibling HF repo, GitHub Release, private CDN, …)

Both bases need to serve the same filenames (the eight listed above). The fallback host is configured at app build time:

| Env var (set at build time) | Default | Purpose |
| --- | --- | --- |
| EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL | https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1 | Primary mirror |
| EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL | unset | Optional fallback (omit for primary-only) |
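A hedged sketch of how those two variables could become the ordered URL list. In the app they would be read from process.env at build time; here the env is passed as a parameter so the sketch stays self-contained, and assetUrls is an illustrative name:

```typescript
// Illustrative: turns the two build-time env vars into the ordered URL list.
// The variable names match the table above; the helper itself is hypothetical.
function assetUrls(zipName: string, env: Record<string, string | undefined>): string[] {
  const primary =
    env.EXPO_PUBLIC_NEURAL_ASSETS_BASE_URL ??
    "https://huggingface.co/Pixel-Labs/threadcast-neural-models/resolve/main/mobile-android/v1";
  const fallback = env.EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL;
  const bases = fallback ? [primary, fallback] : [primary]; // omit fallback for primary-only
  return bases.map((base) => `${base}/${zipName}`);
}
```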

The native installer:

  • Tries urls[0] first. On HTTP 4xx/5xx or transport error, logs a warning and tries urls[1].
  • Only the LAST URL's failure surfaces to the UI as a download error — transient mirror outages are silent.
  • Cancellation aborts immediately at any URL boundary.
  • Verifies post-extract that every required file exists with non-zero size; rejects malformed zips even if the HTTP layer succeeded.
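The retry behavior above can be sketched as a generic fallback loop (illustrative only, not the native installer's actual code):

```typescript
// Illustrative mirror-and-fallback loop: try each URL in order, remember only
// the most recent failure, and surface it only if every mirror fails.
async function fetchFirstAvailable<T>(
  urls: string[],
  fetchOne: (url: string) => Promise<T>,
): Promise<T> {
  if (urls.length === 0) throw new Error("no mirror URLs configured");
  let lastError: unknown;
  for (const url of urls) {
    try {
      return await fetchOne(url); // success on any mirror wins
    } catch (err) {
      lastError = err; // transient outage: the real installer logs and continues
    }
  }
  throw lastError; // only the last URL's failure reaches the caller
}
```

In the real installer this loop would additionally honor cancellation at each URL boundary and run the post-extract file checks on whatever mirror succeeded.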

Publishing workflow

  1. Produce the zips from the local sherpa-onnx staging tree:

    pnpm --filter mobile produce:neural-zips
    

    Writes to packages/mobile/dist/neural-assets/. The script reads the locally-staged sherpa-onnx upstream artifacts (Piper voice ONNX files + KittenTTS-nano v0.1 fp16 bundle + Kokoro int8 v0.19 bundle + shared espeak-ng-data) and emits the eight zips with the correct internal layouts.

  2. Upload to Hugging Face — the user-facing primary mirror lives at:

    https://huggingface.co/Pixel-Labs/threadcast-neural-models/tree/main/mobile-android/v1

    Drop all eight zips into that tree. HF preserves filenames as-is. No build steps, no metadata munging — they're just static assets behind a CDN.

  3. (Optional) Mirror to a fallback host. Same eight filenames at any HTTPS endpoint. Common picks:

    • Sibling HF repo (Pixel-Labs/threadcast-neural-models-mirror/v1/...)
    • GitHub Release with gh release upload v1 dist/neural-assets/*.zip
    • Private CDN (R2, S3, etc.)

    Then ship the next app build with EXPO_PUBLIC_NEURAL_ASSETS_FALLBACK_URL set to the mirror's base URL.


Per-engine download cost (user-facing)

| User intent | Files pulled | Network |
| --- | --- | --- |
| Install first Local AI Lite voice (e.g. Amy) | threadcast-piper-shared-v1.zip + threadcast-piper-en_US-amy-medium-v1.zip | ~74 MB |
| Install another Local AI Lite voice (e.g. Lessac) | threadcast-piper-en_US-lessac-medium-v1.zip | ~63 MB |
| Install Local AI Plus | threadcast-kitten-nano-en-v1.zip | ~26 MB |
| Install Local AI Studio | threadcast-kokoro-int8-en-v1.zip | ~145 MB |
| Install every engine (all 5 Lite voices + Plus + Studio) | every zip in v1/ | ~496 MB |

The whole-bundle worst case is comparable to Spotify's "download an album for offline" workflow. Most users will pick one tier and stop there — Plus is the sweet spot at 26 MB for an 8-voice multi-speaker model.


License

Per-project licenses retained from upstream — see the parent README for the consolidated summary.