Plapre Pico CoreML

CoreML conversion of syvai/plapre-pico for on-device Danish text-to-speech on iOS 18+.

Models

Model	Description	Size
`PlaprePico.mlpackage`	LLM with stateful KV cache (fp16, ctx 512)	~238MB
`KanadeDecoder.mlpackage`	Audio tokens + speaker → mel spectrogram	~365MB
`Vocoder.mlpackage`	Mel → waveform (F0 + source gen + HiFT + iSTFT baked in)	~87MB
`PlaprePico_int8.mlpackage`	int8 quantized LLM (comparable quality)	~120MB

Performance (iPhone 15 / A16, CPU Only)

Config	Prefill	Decode	RTF
fp32, ctx 2048, naive Swift	14 tok/s	10 tok/s	2.5x
fp16 safe RMSNorm, ctx 512, optimized	60 tok/s	50 tok/s	~0.5x

With the optimized configuration, decoding runs at roughly 2x realtime (RTF ~0.5) on the iPhone 15 CPU. See TRIALS.md for the optimization journey.
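The RTF column can be sanity-checked against the decode speeds. The sketch below assumes RTF is dominated by LLM decoding and uses the Kanade codec rate of 25 audio tokens per second given under Architecture. Prefill, Kanade decoding and vocoder time are ignored, so it approximates rather than reproduces how the table was measured. The function name `real_time_factor` is illustrative.

```python
CODEC_TOKENS_PER_SECOND = 25.0  # Kanade codec rate stated in this README


def real_time_factor(decode_tokens_per_second: float,
                     codec_rate: float = CODEC_TOKENS_PER_SECOND) -> float:
    """Seconds of compute per second of audio, counting only LLM decode.

    Assumption: one generated token corresponds to one codec frame,
    so producing one second of audio requires `codec_rate` tokens.
    """
    return codec_rate / decode_tokens_per_second


for label, tokens_per_second in [("fp32 naive", 10.0), ("fp16 optimized", 50.0)]:
    rtf = real_time_factor(tokens_per_second)
    print(f"{label}: RTF {rtf:.1f} ({1.0 / rtf:.1f}x realtime)")
```

An RTF below 1 means audio is produced faster than it plays back. Under this assumption, 50 tok/s against a 25 tok/s codec gives RTF 0.5, which is the 2x realtime figure.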

Architecture

LLM: LlamaForCausalLM, 30 layers, hidden=576, 9 query / 3 KV heads (GQA), ~127M params
Vocab: 20,802 tokens — BPE text (0-8001) + Kanade audio codes (8002-20801)
Audio: Kanade codec at 25 tok/s, 24kHz output
Speakers: 5 built-in (tor, ida, liv, ask, kaj), 128-dim embeddings
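The vocabulary split above implies a simple range check for separating text tokens from audio tokens in the LLM output. The following is a minimal sketch. The helper names are hypothetical, and the zero-based offset mapping from token ID to Kanade code index is an assumption, not something this README confirms.

```python
TEXT_TOKEN_COUNT = 8002   # BPE text IDs 0..8001
VOCAB_SIZE = 20802        # total vocabulary size
AUDIO_CODE_COUNT = VOCAB_SIZE - TEXT_TOKEN_COUNT  # IDs 8002..20801


def is_audio_token(token_id: int) -> bool:
    """True if the ID falls in the Kanade audio-code range."""
    return TEXT_TOKEN_COUNT <= token_id < VOCAB_SIZE


def to_audio_code(token_id: int) -> int:
    """Map an LLM token ID to a codec index.

    Assumption: audio codes are laid out contiguously, so the codec
    index is the token ID minus the size of the text range.
    """
    if not is_audio_token(token_id):
        raise ValueError(f"token {token_id} is not an audio token")
    return token_id - TEXT_TOKEN_COUNT


generated = [17, 8002, 9000, 20801]  # made-up IDs for illustration
audio_codes = [to_audio_code(t) for t in generated if is_audio_token(t)]
print(AUDIO_CODE_COUNT, audio_codes)
```

The ranges leave 12,800 IDs for audio codes. How many of the 8,002 text IDs are special tokens is not stated here.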

Adaptations

fp16-safe RMSNorm: pre-scale by amax before squaring to prevent fp16 overflow at layer 4+
Custom attention: explicit matmul replacing SDPA, split-half RoPE with precomputed tables
KV cache: one-hot broadcast mask writes + MIL pass that injects coreml_update_state ops (torch.jit.trace doesn't emit prim::SetAttr)
Kanade: interleaved RoPE, local windowed attention, hardcoded dimensions
Vocoder: manual STFT/iSTFT via conv1d + matmul, DSP baked into one model
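To illustrate the fp16-safe RMSNorm idea from the list above, here is a small pure-Python sketch that emulates fp16 rounding with the `struct` module's half-precision format. It is not the code used in the conversion scripts. The function names are hypothetical, the learned scale weight is omitted, and the handling of `eps` after rescaling is an assumption. The largest finite fp16 value is 65504, so squaring an activation of about 256 or more overflows to infinity.

```python
import math
import struct


def f16(x: float) -> float:
    """Round x to the nearest fp16 value; out-of-range results become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def _normalize_fp16(xs, eps):
    """xs / sqrt(mean(xs^2) + eps), rounding every intermediate to fp16."""
    acc = 0.0
    for v in xs:
        acc = f16(acc + f16(v * v))  # the square is where overflow happens
    mean_sq = f16(acc / len(xs))
    denom = f16(math.sqrt(f16(mean_sq + eps)))
    return [f16(v / denom) for v in xs]


def rmsnorm_naive_fp16(x, eps=1e-5):
    """Textbook RMSNorm evaluated entirely in (emulated) fp16."""
    return _normalize_fp16([f16(v) for v in x], eps)


def rmsnorm_safe_fp16(x, eps=1e-5):
    """Pre-scale by the absolute maximum so every square is at most 1."""
    xs = [f16(v) for v in x]
    amax = max(abs(v) for v in xs)
    if amax == 0.0:
        return xs
    scaled = [f16(v / amax) for v in xs]  # values now lie in [-1, 1]
    # RMSNorm is scale-invariant apart from eps, so the result matches
    # the unscaled formula up to that small eps term (an assumption here).
    return _normalize_fp16(scaled, eps)


activations = [300.0, -250.0, 100.0, 50.0]  # made-up values above sqrt(65504)
print("naive:", rmsnorm_naive_fp16(activations))
print("safe: ", rmsnorm_safe_fp16(activations))
```

In the naive version the sum of squares becomes infinite, so every output collapses to zero. The pre-scaled version stays close to the full-precision result.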

Usage

cd swift-cli
swift run plapre-cli "Hej, mit navn er Daniel."
# → output.wav

Conversion

pip install -r scripts/requirements.txt
python scripts/build.py                  # LLM + Kanade + Vocoder
python scripts/build.py --quantize int8  # also produce int8 LLM variant
python scripts/build.py --skip llm       # only rebuild audio models
python scripts/build.py --skip audio     # only rebuild LLM

Known Limitations

Compute units: .cpuOnly required — GPU/ANE crash with error -14
Streaming: chunked Kanade decoding works, but no real-time audio streaming yet
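This README does not describe how chunked decoding is implemented. As a rough illustration of the bookkeeping a streaming path would need, the sketch below maps chunks of audio tokens to output sample ranges. It assumes fixed-size, non-overlapping chunks and exactly 24000 / 25 = 960 samples per token. Because the Kanade decoder uses local windowed attention, a real implementation may need overlapping context between chunks. `plan_chunks` and `chunk_tokens` are hypothetical names.

```python
SAMPLE_RATE_HZ = 24_000
CODEC_TOKENS_PER_SECOND = 25
SAMPLES_PER_TOKEN = SAMPLE_RATE_HZ // CODEC_TOKENS_PER_SECOND  # 960


def plan_chunks(num_tokens: int, chunk_tokens: int = 50):
    """Return (token_start, token_end, sample_start, sample_end) per chunk.

    Assumption: chunks do not overlap and each token maps to a fixed
    number of output samples.
    """
    if chunk_tokens <= 0:
        raise ValueError("chunk_tokens must be positive")
    chunks = []
    for start in range(0, num_tokens, chunk_tokens):
        end = min(start + chunk_tokens, num_tokens)
        chunks.append((start, end,
                       start * SAMPLES_PER_TOKEN, end * SAMPLES_PER_TOKEN))
    return chunks


for chunk in plan_chunks(120, chunk_tokens=50):
    print(chunk)
```

With 50-token chunks, each full chunk covers two seconds of audio, so playback could in principle begin once the first chunk has been decoded.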

License

CC-BY-4.0, following the source model license.

Model tree for 42futures/plapre-pico-coreml

Base model: HuggingFaceTB/SmolLM2-135M
Quantized: syvai/plapre-pico
Quantized: 42futures/plapre-pico-coreml (this model)