Benchmark: FunASR 170x realtime vs Whisper 13x — non-autoregressive alternative

#233

by langgz - opened May 24

For those looking for faster alternatives to Whisper, FunASR offers non-autoregressive models that are significantly faster:

Model	Speed (GPU)	Speed (CPU)	Languages
FunASR SenseVoice	170x realtime	17x realtime	5
FunASR Paraformer	120x realtime	15x realtime	2
Whisper-large-v3	13x realtime	❌	57

Key advantage: by the figures above, FunASR on CPU (15-17x realtime) is faster than Whisper-large-v3 on GPU (13x realtime).
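To put those multipliers in concrete terms, here is a small sketch that converts them into wall-clock time for one hour of audio. It assumes "Nx realtime" means audio duration divided by processing time; the function name `processing_seconds` and the one-hour duration are just for illustration, and the speeds are copied from the table above.

```python
def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Seconds needed to transcribe `audio_seconds` of audio at `realtime_factor`x realtime."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


# Speeds as quoted in the table above (not re-measured here).
speeds = {
    "FunASR SenseVoice (GPU)": 170,
    "FunASR SenseVoice (CPU)": 17,
    "FunASR Paraformer (GPU)": 120,
    "FunASR Paraformer (CPU)": 15,
    "Whisper-large-v3 (GPU)": 13,
}

one_hour = 3600.0  # illustrative audio length in seconds
for name, factor in speeds.items():
    print(f"{name}: {processing_seconds(one_hour, factor):.1f} s per hour of audio")
```

At the quoted speeds, an hour of audio works out to roughly 21 s for SenseVoice on GPU, about 212 s on CPU, and about 277 s for Whisper-large-v3 on GPU.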

FunASR also includes built-in speaker diarization and emotion detection.

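pip install funasr

from funasr import AutoModel

# Load SenseVoiceSmall from the Hugging Face hub with a VAD front end.
# Set device="cpu" to run without a GPU.
model = AutoModel(model="FunAudioLLM/SenseVoiceSmall", hub="hf", vad_model="funasr/fsmn-vad", device="cuda")
result = model.generate(input="audio.wav")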
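If you want to check the speed on your own hardware, a rough sketch like the one below times a single call and divides the audio duration by the elapsed time. The helper `measure_realtime_factor` is hypothetical (it is not part of FunASR), and you have to supply the audio length yourself.

```python
import time


def measure_realtime_factor(transcribe, audio_seconds):
    """Time one call to `transcribe()` and return (realtime factor, output).

    `transcribe` is any zero-argument callable that runs the model;
    `audio_seconds` is the length of the audio it processes.
    """
    start = time.perf_counter()
    output = transcribe()
    elapsed = time.perf_counter() - start
    return audio_seconds / elapsed, output
```

With the model loaded as above, this could be called as `measure_realtime_factor(lambda: model.generate(input="audio.wav"), audio_seconds=60.0)`, where 60.0 is a placeholder for the real clip length. A single timed run is noisy, so it is probably worth doing a warm-up call first and averaging several runs.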

Benchmark details: https://modelscope.github.io/FunASR/benchmark.html
