Paraformer-zh · GGUF (FunASR llama.cpp runtime)
GGUF build of FunASR's Paraformer-zh (SAN-M encoder + CIF predictor + SAN-M decoder, non-autoregressive) for the zero-Python, CPU/edge FunASR llama.cpp runtime. It provides fast Mandarin ASR at roughly 21× real-time on CPU.
Files
| file | size | notes |
|---|---|---|
| paraformer-f16.gguf | 435 MB | recommended (f16 matmul weights) |
| paraformer.gguf | 863 MB | f32 reference |
Usage
The binary prints the transcription text directly, with no Python detokenization step. Pass --ids to get raw ids.
llama-funasr-paraformer -m paraformer-f16.gguf -a audio.wav --vad fsmn-vad.gguf
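To transcribe a whole folder, the single-file command above can be wrapped in a loop. The following is a minimal sketch, not part of the runtime: the `transcribe_dir` function, the `FUNASR_BIN`, `MODEL`, and `VAD` variables, and the one-`.txt`-per-clip layout are all hypothetical, and it assumes the transcript is written to standard output. It uses only the `-m`, `-a`, and `--vad` flags shown above.

```shell
#!/usr/bin/env bash
# Batch-transcribe every .wav in a directory (sketch; names and layout are assumptions).
set -euo pipefail

# Assumed locations; override through environment variables if yours differ.
FUNASR_BIN="${FUNASR_BIN:-llama-funasr-paraformer}"
MODEL="${MODEL:-paraformer-f16.gguf}"
VAD="${VAD:-fsmn-vad.gguf}"

transcribe_dir() {
  local dir="$1" wav
  for wav in "$dir"/*.wav; do
    # If nothing matches, the glob stays literal; skip it.
    [ -e "$wav" ] || continue
    # Same flags as the single-file command; stdout is redirected to clip.txt.
    "$FUNASR_BIN" -m "$MODEL" -a "$wav" --vad "$VAD" > "${wav%.wav}.txt"
  done
}

# Run only when a directory is passed as the first argument.
if [ "$#" -ge 1 ]; then
  transcribe_dir "$1"
fi
```

Saved under a name of your choice (for example `transcribe_dir.sh`), it would be run as `bash transcribe_dir.sh ./clips`. Because of `set -e`, the loop stops at the first clip the binary fails on, which leaves an empty `.txt` for that clip.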
On CPU (8 threads), Paraformer-zh reaches 9.85% CER (character error rate) on the 184-clip Mandarin benchmark, versus 22–31% for whisper.cpp.
Links
- 🧩 Runtime & build: FunASR · runtime/llama.cpp — ⭐ Star FunASR!
- Source model: funasr/paraformer-zh