Parakeet TDT-CTC 0.6B Japanese: Core ML (Streaming ASR)
Core ML conversion of nvidia/parakeet-tdt_ctc-0.6b-ja for SlidingWindow streaming ASR on iOS/macOS (TranslateBlue / FluidAudio).
Streaming ASR contract
This repo is not a batch-only file transcription bundle. It targets live microphone ASR:
| Aspect | Specification |
|---|---|
| Architecture | SlidingWindow pseudo-streaming (stateless 15s encoder + overlapping windows) |
| Runtime config | SlidingWindowAsrConfig.streaming: 11s chunk, 2s left/right context |
| Decoder | Decoderv2 with U=1 LSTM state I/O (state carried across windows) |
| Jointer | Jointerv2: TDT duration bins (max 4 frames for ja) |
| Encoder window | Fixed 15s mel input ([1, 80, 1501]). Short windows (e.g. 5s) are unsupported. |
| Vocab | 3072 BPE tokens, blank id 3072 |
| CTC | CtcDecoder.mlpackage is tier-2 failover only (raw logits, no log_softmax) |
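
Because the CTC head emits raw logits, any consumer that needs log-probabilities has to normalize them itself. The following is a minimal sketch of that step plus greedy CTC collapsing, assuming one logit vector per encoder frame with the blank at index 3072 as listed in the table. The function names are hypothetical and this is not FluidAudio's failover code.

```python
import math

BLANK_ID = 3072  # from the table: 3072 BPE tokens, blank id 3072


def log_softmax(logits):
    """Numerically stable log-softmax over a single frame of raw logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def ctc_greedy(frames, blank_id=BLANK_ID):
    """Greedy CTC decode: per-frame argmax, collapse repeats, drop blanks.

    `frames` is a sequence of per-frame logit vectors (hypothetical layout).
    """
    tokens = []
    prev = None
    for logits in frames:
        best = max(range(len(logits)), key=logits.__getitem__)
        if best != prev and best != blank_id:
            tokens.append(best)
        prev = best
    return tokens
```

Greedy decoding only needs the argmax, which log-softmax does not change, so normalization matters mainly when scores are compared or combined (for example in a beam search or a confidence estimate).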
Not included: cache-aware true streaming (Parakeet EOU / Nemotron), fused FullPipeline / MelEncoder batch paths.
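
The window arithmetic in the table above can be sketched as follows: an 11s chunk with 2s of context on each side fills the fixed 15s encoder window. This is an illustration only. `plan_windows` and `mel_frames` are hypothetical helpers, the 160-sample (10 ms) hop is an assumption chosen because it reproduces the 1501-frame mel length, and how FluidAudio pads the first and last windows is not specified here.

```python
from dataclasses import dataclass

SAMPLE_RATE = 16_000   # 16 kHz mono input
CHUNK_S = 11           # chunk decoded per window
LEFT_S = 2             # left context
RIGHT_S = 2            # right context
HOP_SAMPLES = 160      # ASSUMPTION: 10 ms mel hop, not stated in this README


@dataclass(frozen=True)
class Window:
    start: int        # first sample fed to the encoder (may be < 0: pad)
    end: int          # one past the last sample (may exceed the audio: pad)
    chunk_start: int  # first sample whose tokens are kept
    chunk_end: int    # one past the last sample whose tokens are kept


def plan_windows(total_samples: int) -> list[Window]:
    """Lay out overlapping 15s windows whose 11s chunks tile the audio."""
    chunk = CHUNK_S * SAMPLE_RATE
    left = LEFT_S * SAMPLE_RATE
    right = RIGHT_S * SAMPLE_RATE
    windows = []
    chunk_start = 0
    while chunk_start < total_samples:
        windows.append(Window(chunk_start - left,
                              chunk_start + chunk + right,
                              chunk_start,
                              chunk_start + chunk))
        chunk_start += chunk
    return windows


def mel_frames(window_samples: int) -> int:
    """Frame count for a center-padded STFT (assumed framing convention)."""
    return window_samples // HOP_SAMPLES + 1
```

Under these assumptions every window spans 240,000 samples, and `mel_frames(240_000)` gives 1501, matching the `[1, 80, 1501]` encoder input shape.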
Artifacts (aoiandroid lowercase layout)
| File | FluidAudio name | Role |
|---|---|---|
| preprocessor.mlpackage | Preprocessor | 16 kHz mono → mel |
| encoder.mlpackage | Encoder | mel → encoder_output |
| decoder.mlpackage | Decoderv2 | LSTM decoder with state |
| joint.mlpackage | Jointerv2 | TDT joint step |
| vocab_ja.json | vocab.json | SentencePiece vocabulary |
| CtcDecoder.mlpackage | CtcDecoder | CTC tier-2 failover |
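
Before pointing a runtime at a model directory, it can help to confirm that the files listed above are all present. A minimal sketch follows; `missing_artifacts` is a hypothetical helper, not part of FluidAudio or mobius, and it checks only that each entry exists, not that it loads.

```python
from pathlib import Path

# File names in the lowercase layout described in the Artifacts table.
EXPECTED_ARTIFACTS = [
    "preprocessor.mlpackage",
    "encoder.mlpackage",
    "decoder.mlpackage",
    "joint.mlpackage",
    "vocab_ja.json",
    "CtcDecoder.mlpackage",
]


def missing_artifacts(model_dir: str) -> list[str]:
    """Return the expected entries that do not exist under model_dir."""
    root = Path(model_dir)
    # .mlpackage bundles are directories, so exists() covers both cases.
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).exists()]
```

For example, `missing_artifacts("./build")` returns an empty list when the directory is complete.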
Conversion
Pipeline: FluidInference/mobius
Script: models/stt/parakeet-ctc-0.6b-ja/coreml/conversion/export-tdt-ja-streaming.py
cd mobius/models/stt/parakeet-ctc-0.6b-ja/coreml
uv sync
uv run python conversion/export-tdt-ja-streaming.py --output-dir ./build
uv run python conversion/export-full-pipeline.py --output-dir ./build --no-fused
Validation
- Streaming: fluidaudiocli transcribe <wav> --streaming --model-version tdt-ja --model-dir ./build
- Accuracy (JSUT TDT CER): fluidaudiocli ja-benchmark --dataset jsut --samples 500
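
For reference, character error rate (CER) is the character-level edit distance between hypothesis and reference, divided by the reference length. The sketch below shows that computation under one assumption: whitespace is stripped before scoring. It is not the scoring code used by `fluidaudiocli ja-benchmark`, whose text normalization may differ.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, one row at a time."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; whitespace removal is an assumed normalization."""
    ref = "".join(reference.split())
    hyp = "".join(hypothesis.split())
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)
```

With the reference 今日は晴れです and the hypothesis 今日は雨です, the distance is 2 (one substitution, one deletion) over 7 reference characters, so the CER is 2/7.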
License
Converted weights follow the upstream NVIDIA model license. Model card metadata: CC-BY-4.0.