applesilicon committed
Commit 6370771 · verified · 1 Parent(s): 8d93622

docs: fix vLLM tensor parallel flag in deploy guide


The vLLM CLI uses `--tensor-parallel-size` (or `-tp`) for tensor parallelism, not `--tp`. The current example in deploy_guidance.md fails with `unrecognized arguments: --tp`. This PR updates only the vLLM command example; the SGLang example remains unchanged.

Files changed (1)
  1. docs/deploy_guidance.md +1 -1
docs/deploy_guidance.md CHANGED
@@ -15,7 +15,7 @@ uv pip install -U vllm \
 
 Here is the example to serve this model on a H200 single node with TP8 via vLLM:
 ```bash
-vllm serve $MODEL_PATH --tp 8 --trust-remote-code --tool-call-parser kimi_k2 --reasoning-parser kimi_k2
+vllm serve $MODEL_PATH -tp 8 --trust-remote-code --tool-call-parser kimi_k2 --reasoning-parser kimi_k2
 ```
 **Key notes**
 - `--tool-call-parser kimi_k2`: Required for enabling tool calling
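For reference, a minimal sketch of the two equivalent invocations after this fix, assuming a standard vLLM install where `-tp` is the short form of `--tensor-parallel-size` (requires a GPU node with `$MODEL_PATH` pointing at the model weights, so this is illustrative rather than directly runnable here):

```shell
# Long form: explicit tensor-parallel flag (8-way TP on a single H200 node)
vllm serve "$MODEL_PATH" --tensor-parallel-size 8 \
  --trust-remote-code --tool-call-parser kimi_k2 --reasoning-parser kimi_k2

# Short form used in this commit; -tp is shorthand for --tensor-parallel-size
vllm serve "$MODEL_PATH" -tp 8 \
  --trust-remote-code --tool-call-parser kimi_k2 --reasoning-parser kimi_k2
```

Either spelling parses; `--tp` alone is not a recognized option, which is exactly the error the commit message reports.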