Benchmark: FunASR 170x realtime vs Whisper 13x — non-autoregressive alternative

#233
by langgz - opened

For those looking for faster alternatives to Whisper, FunASR offers non-autoregressive models. Reported speeds:

| Model | Speed (GPU) | Speed (CPU) | Languages |
|---|---|---|---|
| FunASR SenseVoice | 170x realtime | 17x realtime | 5 |
| FunASR Paraformer | 120x realtime | 15x realtime | 2 |
| Whisper-large-v3 | 13x realtime | (not given) | 57 |

Key advantage: by these numbers, FunASR on CPU (15x to 17x realtime) is faster than Whisper-large-v3 on GPU (13x realtime).
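
To make the multiples concrete, here is a small sketch that converts an "Nx realtime" figure into wall-clock time for one hour of audio. The speeds are the ones quoted in this post; the helper name `processing_seconds` is just for illustration, and real throughput will depend on hardware, batch size, and audio length.

```python
def processing_seconds(audio_seconds: float, realtime_multiple: float) -> float:
    """Wall-clock seconds needed to transcribe audio at a given 'Nx realtime' speed."""
    if realtime_multiple <= 0:
        raise ValueError("realtime_multiple must be positive")
    return audio_seconds / realtime_multiple


# Speeds as quoted in this post (not re-measured here).
quoted_speeds = {
    "FunASR SenseVoice (GPU)": 170.0,
    "FunASR SenseVoice (CPU)": 17.0,
    "FunASR Paraformer (GPU)": 120.0,
    "FunASR Paraformer (CPU)": 15.0,
    "Whisper-large-v3 (GPU)": 13.0,
}

one_hour = 3600.0
for name, multiple in quoted_speeds.items():
    seconds = processing_seconds(one_hour, multiple)
    print(f"{name}: {seconds:.1f} s per hour of audio")
```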

FunASR also includes built-in speaker diarization and emotion detection.

pip install funasr

from funasr import AutoModel

# Load SenseVoiceSmall from the Hugging Face hub, with a VAD model to segment the audio
model = AutoModel(model="FunAudioLLM/SenseVoiceSmall", hub="hf", vad_model="funasr/fsmn-vad", device="cuda")
result = model.generate(input="audio.wav")
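
If you want the emotion and language labels separately from the transcript, something like the following might help. This is only a sketch under assumptions: it assumes the returned text is prefixed with tags of the form `<|...|>` (language, emotion, event type), and the sample string below is made up for illustration rather than real model output. The helper `split_tags` is hypothetical; FunASR's own post-processing utilities are likely the better choice where they fit.

```python
import re

# Matches tags of the assumed form <|something|>
TAG_PATTERN = re.compile(r"<\|([^|]*)\|>")


def split_tags(text: str) -> tuple[list[str], str]:
    """Return (list of tag names, text with all tags removed)."""
    tags = TAG_PATTERN.findall(text)
    clean = TAG_PATTERN.sub("", text).strip()
    return tags, clean


# Hypothetical sample; in practice you would pass the text field of your result.
sample = "<|en|><|HAPPY|><|Speech|><|woitn|>hello world"
tags, clean = split_tags(sample)
print(tags)
print(clean)
```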

Benchmark details: https://modelscope.github.io/FunASR/benchmark.html
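
To check the numbers on your own hardware, a rough timing harness along these lines could work. It is a sketch, not the methodology behind the linked benchmark: `realtime_multiple` and `wav_duration_seconds` are illustrative names, it uses only the standard library, and it assumes an uncompressed WAV file. Pass it any zero-argument callable that runs one transcription.

```python
import time
import wave
from typing import Callable


def wav_duration_seconds(path: str) -> float:
    """Duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def realtime_multiple(
    transcribe: Callable[[], object],
    audio_seconds: float,
    warmup: int = 1,
    runs: int = 3,
) -> float:
    """Audio seconds processed per wall-clock second, averaged over `runs`."""
    for _ in range(warmup):
        transcribe()  # warm-up runs are not timed (model load, caches, GPU init)
    start = time.perf_counter()
    for _ in range(runs):
        transcribe()
    elapsed = (time.perf_counter() - start) / runs
    if elapsed <= 0:
        raise RuntimeError("elapsed time too small to measure")
    return audio_seconds / elapsed


# Example usage with the model from the snippet above (not executed here):
# speed = realtime_multiple(
#     lambda: model.generate(input="audio.wav"),
#     wav_duration_seconds("audio.wav"),
# )
# print(f"{speed:.1f}x realtime")
```

Longer audio and several runs give steadier numbers; a single short clip mostly measures startup overhead.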
