GGUF + pure-C++ runtime in CrispASR — Moonshine streaming
We've added the streaming Moonshine variants to CrispASR as the moonshine-streaming backend. It is separate from the offline moonshine backend because the encoder topology is different: a sliding-window encoder with a raw-waveform frontend.
The implementation is in src/moonshine_streaming.cpp and takes the same approach as the offline Moonshine implementation: a ggml graph for the sliding-window encoder and a KV-cached autoregressive decoder. The companion tokenizer.bin is fetched automatically.
This gives us a true low-latency streaming path in CrispASR (paired with --mic / --live and our standard VAD/diarisation post-step):
./build/bin/crispasr --backend moonshine-streaming \
-m moonshine-streaming-tiny-q4_k.gguf --mic
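For readers wondering why a sliding-window encoder suits streaming, here is a minimal Python sketch of the general idea. It is not taken from src/moonshine_streaming.cpp; the function names and the window sizes (left, right) are hypothetical and do not reflect Moonshine's actual configuration. Each frame may attend only to a bounded number of past and future frames, so within a single layer the encoder has to wait for only a fixed look-ahead before it can emit a frame.

```python
def sliding_window_mask(num_frames, left, right):
    """Return mask[i][j] == True when frame i may attend to frame j.

    left and right are hypothetical window sizes chosen for illustration,
    not the values used by any Moonshine model.
    """
    if left < 0 or right < 0:
        raise ValueError("window sizes must be non-negative")
    return [
        [i - left <= j <= i + right for j in range(num_frames)]
        for i in range(num_frames)
    ]


def frames_needed(frame_index, right):
    """Frames that must have arrived before frame_index can be encoded
    by a single layer with a look-ahead of `right` frames."""
    return frame_index + right + 1


if __name__ == "__main__":
    # Print the mask as a grid: 'x' = attention allowed, '.' = masked out.
    for row in sliding_window_mask(num_frames=6, left=2, right=1):
        print("".join("x" if allowed else "." for allowed in row))
```

With full (unmasked) attention every frame would depend on the whole utterance, which is why an offline encoder cannot simply be reused for live audio. In a stack of several such layers the effective look-ahead grows with depth, so the single-layer figure above is a lower bound on the real latency.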
Pre-quantised GGUFs (MIT): cstr/moonshine-streaming-tiny-GGUF. Sibling sizes: -small (110M), -medium (245M).
(Offline Moonshine repos: tiny, base, plus ja/ko/zh/ar/vi/uk variants.)
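The download-then-run steps above can be scripted. This is a rough sketch, assuming the GGUF in cstr/moonshine-streaming-tiny-GGUF has the same file name as in the command above, that the crispasr binary has already been built at ./build/bin/crispasr, and that the third-party huggingface_hub package is installed. The helper names (build_command, fetch_model, main) are made up for illustration; only the flags shown in the command above are used.

```python
import subprocess

REPO_ID = "cstr/moonshine-streaming-tiny-GGUF"
# Assumption: the file in the repo is named as in the command above.
GGUF_NAME = "moonshine-streaming-tiny-q4_k.gguf"


def build_command(model_path, binary="./build/bin/crispasr"):
    """Build the argument list for live microphone transcription."""
    return [
        binary,
        "--backend", "moonshine-streaming",
        "-m", str(model_path),
        "--mic",
    ]


def fetch_model(repo_id=REPO_ID, filename=GGUF_NAME):
    """Download the GGUF into the local Hugging Face cache and return its path."""
    # Imported lazily so build_command works without the package installed.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=filename)


def main():
    model_path = fetch_model()
    # check=True raises CalledProcessError if crispasr exits with an error.
    subprocess.run(build_command(model_path), check=True)
```

Calling main() downloads the model once (later runs reuse the cache) and then hands control to crispasr until it exits.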