There is a 30-40% difference in inference time on the same Russian audio between running with `-l ru` and with automatic language detection (no `-l`)

#11
by Limtech - opened

Does anybody know why this happens, and how I can get the fastest inference for Russian without translating?
The command shown in the right screenshot:

```
CUDA_VISIBLE_DEVICES="0" ./whisper-server -m ~/whisper.models/ggml-model.bin --max-context 0 --vad --vad-model ~/whisper.models/vad-ggml-silero-v5.1.2.bin -fa -l ru
```

The left screenshot shows the same command but without `-l ru`.
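A likely explanation (an assumption, not confirmed in this thread): when `-l` is omitted, whisper.cpp runs automatic language detection on the opening audio segment before transcribing, which costs an extra encoder pass; pinning `-l ru` skips that step. A minimal timing sketch to compare the two outside the server, assuming the `whisper-cli` binary from whisper.cpp, the same model path as above, and a hypothetical local sample `audio_ru.wav`:

```shell
# Hypothetical comparison; assumes whisper.cpp's whisper-cli binary is built,
# the model file above exists, and audio_ru.wav is a local Russian sample.
MODEL=~/whisper.models/ggml-model.bin

# Language pinned to Russian: no auto-detection pass before transcription.
time ./whisper-cli -m "$MODEL" -fa -l ru -f audio_ru.wav > /dev/null

# Explicit auto-detection (the default behavior when -l is omitted).
time ./whisper-cli -m "$MODEL" -fa -l auto -f audio_ru.wav > /dev/null
```

If the gap between the two `time` results matches the 30-40% seen in the screenshots, the detection pass is the cost being measured.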
Thanks
[Screenshot: Screenshot_2025-10-30_17-38-08]
