fix vllm parameter's name
#6
by sebag90 - opened
README.md CHANGED

@@ -163,8 +163,8 @@ VLLM_DISABLE_COMPILE_CACHE=1 vllm serve mistralai/Voxtral-Mini-4B-Realtime-2602
 
 Additional flags:
 * You can set `--max-num-batched-tokens` to balance throughput and latency, higher means higher throughput but higher latency.
-* You can reduce the default `--max-model-
-  if you are certain that you won't have to transcribe for more than X hours. By default the model uses a `--max-model-
+* You can reduce the default `--max-model-len` to allocate less memory for the pre-computed RoPE frequencies,
+  if you are certain that you won't have to transcribe for more than X hours. By default the model uses a `--max-model-len` of 131072 (> 3h).
 
 #### Usage of the model
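For context, the two flags discussed in the corrected lines go on the same `vllm serve` command shown in the hunk header. A sketch of such an invocation — the numeric values here are purely illustrative, not recommendations from this change:

```shell
# Illustrative flag values only; tune for your hardware, latency target,
# and the maximum transcription length you actually need.
VLLM_DISABLE_COMPILE_CACHE=1 vllm serve mistralai/Voxtral-Mini-4B-Realtime-2602 \
  --max-num-batched-tokens 2048 \
  --max-model-len 32768
```

Lowering `--max-model-len` below the 131072 default shrinks the pre-computed RoPE frequency allocation, at the cost of capping how long a single transcription can run.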