There is a 30-40% difference in inference time on the same Russian audio between running with `-l ru` and with automatic language detection (no `-l`)
#11
by Limtech - opened
Does anybody know why this happens, and how can I get the fastest inference on Russian without translating?
This is the command from the right screenshot; the left screenshot uses the same command but without `-l ru`:

```
CUDA_VISIBLE_DEVICES="0" ./whisper-server -m ~/whisper.models/ggml-model.bin --max-context 0 --vad --vad-model ~/whisper.models/vad-ggml-silero-v5.1.2.bin -fa -l ru
```
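For reference, the timing difference can also be measured outside the server with the `whisper-cli` binary from the same build (a sketch; the model path is taken from the command above, and `samples/ru.wav` is a hypothetical placeholder for the Russian audio file):

```shell
# Explicit language: the decoder skips the language-detection pass.
time ./whisper-cli -m ~/whisper.models/ggml-model.bin -fa -l ru samples/ru.wav

# Auto-detection (no -l): whisper runs an extra pass on the first
# audio window to detect the language before transcribing.
time ./whisper-cli -m ~/whisper.models/ggml-model.bin -fa samples/ru.wav
```

Comparing the two `time` outputs on the same file should isolate the detection overhead from any server- or VAD-related effects.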
Thanks