Fix vLLM parameter name

#6
Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -163,8 +163,8 @@ VLLM_DISABLE_COMPILE_CACHE=1 vllm serve mistralai/Voxtral-Mini-4B-Realtime-2602
 
  Additional flags:
  * You can set `--max-num-batched-tokens` to balance throughput and latency, higher means higher throughput but higher latency.
- * You can reduce the default `--max-model-length` to allocate less memory for the pre-computed RoPE frequencies,
- if you are certain that you won't have to transcribe for more than X hours. By default the model uses a `--max-model-length` of 131072 (> 3h).
+ * You can reduce the default `--max-model-len` to allocate less memory for the pre-computed RoPE frequencies,
+ if you are certain that you won't have to transcribe for more than X hours. By default the model uses a `--max-model-len` of 131072 (> 3h).
 
  #### Usage of the model
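
With the corrected flag name, a launch command using both flags from the patched README might look like the sketch below. The specific values are illustrative assumptions, not recommendations from the repo: the README only states that the default `--max-model-len` is 131072 and that `--max-num-batched-tokens` trades throughput against latency.

```shell
# Sketch: serve the model with a reduced context window to save memory
# reserved for pre-computed RoPE frequencies.
# 32768 and 2048 are illustrative values, not from the README;
# the README's default --max-model-len is 131072 (> 3h of audio).
VLLM_DISABLE_COMPILE_CACHE=1 vllm serve mistralai/Voxtral-Mini-4B-Realtime-2602 \
  --max-model-len 32768 \
  --max-num-batched-tokens 2048
```

Note that vLLM rejects the old spelling `--max-model-length`, which is what this PR fixes in the docs.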