# ThreadCast – Neural Models Mirror

Threads, now a podcast.

threadcast.app · pixellabs.ventures
Self-hosted mirror of the on-device neural TTS models used by ThreadCast, the Chrome extension that turns any Reddit thread into a hands-free podcast.
This repository exists so the extension can ship a stable, version-pinned set of model weights without depending on the availability or rate-limits of upstream Hugging Face repos at runtime.
Note: if you're a ThreadCast user, you don't need anything here – the extension downloads what it needs automatically the first time you select a Neural engine. This page is for transparency, contributors, and forks.
## Repository layout
```
threadcast-neural-models/
├── hf-cpu-mirror/                 # Piper voices for the CPU engine
│   └── en/en_US/<voice>/medium/
│       ├── en_US-<voice>-medium.onnx
│       └── en_US-<voice>-medium.onnx.json
└── hf-gpu-mirror/                 # Kokoro model + voices for the GPU engine
    ├── onnx/
    │   ├── model.onnx             # fp32 – production default
    │   └── model_fp16.onnx        # fp16 – experimental, blocked by upstream bugs
    ├── tokenizer.json
    ├── tokenizer_config.json
    ├── config.json
    └── voices/                    # 11 speaker embeddings
        └── af_bella.bin … bm_daniel.bin
```
## CPU tier – Piper (VITS · 28M params · WASM)
Five English voices, ~63 MB per voice. One voice loaded at a time. Single-thread WASM inference inside an MV3 offscreen document. Real-time on a modern laptop.
| Voice ID | Speaker | Notes |
|---|---|---|
| `en_US-amy-medium` | Amy | Female · warm narrator |
| `en_US-lessac-medium` | Lessac | Female · neutral, news-anchor |
| `en_US-ryan-medium` | Ryan | Male · clear, newsreader |
| `en_US-hfc_female-medium` | HFC Female | Female · crisp, modern |
| `en_US-hfc_male-medium` | HFC Male | Male · crisp, modern |
Each voice ships as two files (`*.onnx` + `*.onnx.json`) under `hf-cpu-mirror/en/en_US/<voice>/medium/`.

Upstream: `diffusionstudio/piper-voices` – a curated subset is mirrored here.
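Because the per-voice layout is mechanical, both file URLs can be derived from the voice ID alone. A minimal sketch of that derivation – the helper name and base URL are illustrative, not the extension's actual code:

```typescript
// Derive the two Piper files for a voice from its ID, following the
// hf-cpu-mirror layout shown above. The base URL is a placeholder.
const CPU_BASE = "https://example.com/threadcast-neural-models/hf-cpu-mirror";

function piperVoiceFiles(voiceId: string): { onnx: string; json: string } {
  // "en_US-hfc_female-medium" -> locale "en_US", voice "hfc_female", quality "medium"
  const [locale, voice, quality] = voiceId.split("-");
  const lang = locale.split("_")[0]; // "en"
  const dir = `${CPU_BASE}/${lang}/${locale}/${voice}/${quality}`;
  return {
    onnx: `${dir}/${voiceId}.onnx`,
    json: `${dir}/${voiceId}.onnx.json`,
  };
}
```

Underscores stay inside a segment (`hfc_female`), so splitting on `-` is enough to recover the three parts of the ID.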
## GPU tier – Kokoro 82M (ONNX · WebGPU)
A single Kokoro model unlocks 11 distinct voices at once via 11 small speaker-embedding files. WebGPU-accelerated inference runs at roughly 10× real-time on a modern GPU.
### Model file
| File | Precision | Size | Status |
|---|---|---|---|
| `hf-gpu-mirror/onnx/model.onnx` | fp32 | ~325 MB | ✅ Production default – stable on every WebGPU runtime |
| `hf-gpu-mirror/onnx/model_fp16.onnx` | fp16 | ~165 MB | ⚠️ Reserved for future use – blocked today by upstream onnxruntime-web fp16 bugs (microsoft/onnxruntime#23403, #26732) |
The fp16 file is staged here so that, once the upstream JS stack lands fp16 + WebGPU fixes, ThreadCast can flip the default to fp16 with a single config change, halving the download and roughly doubling per-segment speed on capable GPUs.
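That "single config change" could look like the following sketch. The names here are hypothetical – the extension's real config is not part of this repo:

```typescript
// Hypothetical engine config: flipping ACTIVE_PRECISION to "fp16"
// switches the downloaded file once upstream onnxruntime-web fixes land.
type Precision = "fp32" | "fp16";

const GPU_MODEL_FILES: Record<Precision, string> = {
  fp32: "hf-gpu-mirror/onnx/model.onnx",      // ~325 MB, production default
  fp16: "hf-gpu-mirror/onnx/model_fp16.onnx", // ~165 MB, staged for later
};

const ACTIVE_PRECISION: Precision = "fp32"; // the one-line flip

function gpuModelPath(): string {
  return GPU_MODEL_FILES[ACTIVE_PRECISION];
}
```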
### Tokenizer + config
`tokenizer.json`, `tokenizer_config.json`, and `config.json` are small files read by `@huggingface/transformers` (transformers.js) when loading the model.
### Voices (`hf-gpu-mirror/voices/*.bin`, ~520 KB each)
| Voice ID | Name | Accent | Gender |
|---|---|---|---|
| `af_bella` | Bella | American | Female |
| `af_sarah` | Sarah | American | Female |
| `af_nova` | Nova | American | Female |
| `af_sky` | Sky | American | Female |
| `am_adam` | Adam | American | Male |
| `am_michael` | Michael | American | Male |
| `am_echo` | Echo | American | Male |
| `bf_emma` | Emma | British | Female |
| `bf_isabella` | Isabella | British | Female |
| `bm_george` | George | British | Male |
| `bm_daniel` | Daniel | British | Male |
Voice IDs encode accent and gender: the first letter is the accent (`a` = American, `b` = British), the second the gender (`f` = female, `m` = male).
Upstream: model from `onnx-community/Kokoro-82M-v1.0-ONNX-timestamped`; voice embeddings from `onnx-community/Kokoro-82M-v1.0-ONNX`.
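The two-letter prefix scheme is easy to decode programmatically. A hedged sketch – the helper name is ours, not the extension's:

```typescript
// Decode a Kokoro voice ID such as "bf_emma" into accent and gender,
// following the two-letter prefix convention described above.
interface VoiceInfo {
  accent: "American" | "British";
  gender: "Female" | "Male";
  name: string;
}

function decodeVoiceId(id: string): VoiceInfo {
  const [prefix, rawName] = id.split("_");
  return {
    accent: prefix[0] === "a" ? "American" : "British",
    gender: prefix[1] === "f" ? "Female" : "Male",
    name: rawName[0].toUpperCase() + rawName.slice(1),
  };
}
```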
## How the extension uses these files
The ThreadCast extension fetches model files lazily, only when the user selects a Neural engine and presses Test/Play. Files are cached in the browser's Cache API and reused across sessions, so the user pays the download cost exactly once per profile.
| Engine | Files fetched on first use |
|---|---|
| System voices | None – uses the OS / browser TTS |
| Neural · CPU | The selected voice's `.onnx` + `.onnx.json` (~63 MB total) |
| Neural · GPU | `onnx/model.onnx` + tokenizer/config JSON + the selected voice's `.bin` (~325 MB model, ~520 KB per voice) |
The WASM runtimes (ONNX Runtime, the Piper phonemizer) are bundled inside the extension package itself rather than served from this repo, to comply with Manifest V3 CSP and avoid CDN dependencies.
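The cache-first download described above follows a standard pattern. Below is a minimal sketch with the cache and the network injected, so the logic is testable outside a browser; in the extension the store would be backed by the browser's Cache API, and `fetchModelFile` is an illustrative name, not the extension's actual function:

```typescript
// Cache-first fetch: return the cached bytes if present, otherwise
// download once and store for reuse across sessions.
interface ByteStore {
  get(url: string): Promise<Uint8Array | undefined>;
  put(url: string, bytes: Uint8Array): Promise<void>;
}

async function fetchModelFile(
  url: string,
  store: ByteStore,
  download: (url: string) => Promise<Uint8Array>,
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached) return cached; // the user already paid the download cost
  const bytes = await download(url);
  await store.put(url, bytes); // cached for every later session
  return bytes;
}
```

Because the store persists across sessions, `download` runs at most once per file per profile – exactly the "pay the download cost exactly once" behaviour described above.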
## License
This repository mirrors upstream models for distribution stability. Each upstream project retains its own license:
- Kokoro-82M: Apache-2.0 (upstream model card)
- Piper voices: MIT, with individual voice attributions in each `.onnx.json`
- transformers.js, onnxruntime-web: Apache-2.0
The mirror layout, README, and any custom additions in this repository are licensed under MIT by Pixel Labs.
## Links
- ThreadCast: threadcast.app
- Pixel Labs: pixellabs.ventures
- Issues / questions: open an issue on the ThreadCast extension repo