xuebi committed on
Commit a423be8 · 1 Parent(s): 2e162cb

update: guide use of nightly vLLM
docs/vllm_deploy_guide.md CHANGED
@@ -39,7 +39,7 @@ We recommend installing vLLM in a fresh Python environment:
 ```bash
 uv venv
 source .venv/bin/activate
-uv pip install -U vllm
+uv pip install -U vllm --extra-index-url https://wheels.vllm.ai/nightly
 ```
 
 Run the following command to start the vLLM server. vLLM will automatically download and cache the MiniMax-M2.1 model from Hugging Face.
@@ -98,6 +98,10 @@ SAFETENSORS_FAST_GPU=1 vllm serve \
 --compilation-config "{\"cudagraph_mode\": \"PIECEWISE\"}"
 ```
 
+### Output is garbled
+
+If you encounter corrupted output when using vLLM to serve these models, upgrade to a nightly build (make sure it is a version after commit [cf3eacfe58fa9e745c2854782ada884a9f992cf7](https://github.com/vllm-project/vllm/commit/cf3eacfe58fa9e745c2854782ada884a9f992cf7)).
+
 ## Getting Support
 
 If you encounter any issues while deploying the MiniMax model:
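The garbled-output symptom addressed by this commit can be screened for programmatically when testing a deployment. A minimal sketch, independent of vLLM itself (`looks_garbled` is a hypothetical helper, not part of any library; it flags Unicode replacement characters and a high fraction of non-printable characters in a response string):

```python
def looks_garbled(text: str, threshold: float = 0.1) -> bool:
    """Heuristic check for corrupted model output.

    Returns True if the text contains the Unicode replacement character
    (U+FFFD, typical of broken decoding) or if more than `threshold` of
    its characters are non-printable (excluding ordinary whitespace).
    """
    if not text:
        return False
    if "\ufffd" in text:
        return True
    bad = sum(1 for ch in text if not ch.isprintable() and ch not in "\n\t\r")
    return bad / len(text) > threshold
```

Running such a check over a few sampled completions before and after upgrading to the nightly build is one quick way to confirm the fix took effect.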
docs/vllm_deploy_guide_cn.md CHANGED
@@ -39,7 +39,7 @@
 ```bash
 uv venv
 source .venv/bin/activate
-uv pip install -U vllm
+uv pip install -U vllm --extra-index-url https://wheels.vllm.ai/nightly
 ```
 
 运行如下命令启动 vLLM 服务器,vLLM 会自动从 Huggingface 下载并缓存 MiniMax-M2.1 模型。
@@ -106,6 +106,10 @@ SAFETENSORS_FAST_GPU=1 vllm serve \
 --compilation-config "{\"cudagraph_mode\": \"PIECEWISE\"}"
 ```
 
+### 模型输出乱码
+
+如果您在使用 vLLM 运行这些模型时遇到输出乱码,可以升级到最新版本(请至少确保版本在提交 [cf3eacfe58fa9e745c2854782ada884a9f992cf7](https://github.com/vllm-project/vllm/commit/cf3eacfe58fa9e745c2854782ada884a9f992cf7) 之后)。
+
 ## 获取支持
 
 如果在部署 MiniMax 模型过程中遇到任何问题: