SenseVoiceSmall · GGUF (FunASR llama.cpp runtime)
GGUF build of SenseVoiceSmall (SAN-M encoder + CTC) for the zero-Python, CPU/edge FunASR llama.cpp runtime — multilingual ASR with language / emotion / event tags, ~20× real-time on CPU.
Files
| file | size | notes |
|---|---|---|
| sensevoice-small-f16.gguf | 470 MB | recommended (f16 matmul weights) |
| sensevoice-small.gguf | 936 MB | f32 reference |
Usage
The binary prints the transcription text directly (no Python detokenization step). Pass --ids to print raw token ids, or --keep-tags to keep the language/emotion tags in the output.
# 1. get the VAD too (for long audio): huggingface-cli download FunAudioLLM/fsmn-vad-GGUF
llama-funasr-sensevoice -m sensevoice-small-f16.gguf -a audio.wav --vad fsmn-vad.gguf
On CPU (8 threads) this reaches 8.01% CER on the 184-clip Mandarin benchmark, compared with 22–31% for whisper.cpp. See the benchmark.
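For a folder of recordings, the single-file command can be wrapped in a small loop. The sketch below is illustrative only: the function name `transcribe_dir` and the `FUNASR_BIN`, `MODEL` and `VAD` environment variables are hypothetical helpers, not part of the runtime. It assumes the transcription is written to stdout and uses only the `-m`, `-a` and `--vad` flags shown above.

```shell
#!/usr/bin/env bash
# Sketch: transcribe every .wav file in a directory, writing one .txt per clip.
# FUNASR_BIN, MODEL and VAD are hypothetical overrides introduced for this
# example; the defaults are the file names used elsewhere in this README.
transcribe_dir() {
  local dir="$1"
  local bin="${FUNASR_BIN:-llama-funasr-sensevoice}"
  local model="${MODEL:-sensevoice-small-f16.gguf}"
  local vad="${VAD:-fsmn-vad.gguf}"
  local wav

  for wav in "$dir"/*.wav; do
    # Skip the unexpanded pattern when the directory has no .wav files.
    [ -e "$wav" ] || continue
    # Assumption: the transcription text goes to stdout.
    "$bin" -m "$model" -a "$wav" --vad "$vad" > "${wav%.wav}.txt"
  done
}
```

After sourcing the file, `transcribe_dir ./recordings` would leave a `clip.txt` next to each `clip.wav`. If the binary also writes log lines to stdout, the redirect would need adjusting.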
Links
- 🧩 Runtime & build: SenseVoice · runtime/llama.cpp — ⭐ Star SenseVoice!
- Source model: FunAudioLLM/SenseVoiceSmall