xuebi committed · a423be8 · Parent(s): 2e162cb

update: guide use nightly vllm
docs/vllm_deploy_guide.md
CHANGED

````diff
@@ -39,7 +39,7 @@ We recommend installing vLLM in a fresh Python environment:
 ```bash
 uv venv
 source .venv/bin/activate
-uv pip install -U vllm
+uv pip install -U vllm --extra-index-url https://wheels.vllm.ai/nightly
 ```
 
 Run the following command to start the vLLM server. vLLM will automatically download and cache the MiniMax-M2.1 model from Hugging Face.
@@ -98,6 +98,10 @@ SAFETENSORS_FAST_GPU=1 vllm serve \
   --compilation-config "{\"cudagraph_mode\": \"PIECEWISE\"}"
 ```
 
+### Output is garbled
+
+If you encounter corrupted output when using vLLM to serve these models, you can upgrade to the nightly version (ensure it is a version after commit [cf3eacfe58fa9e745c2854782ada884a9f992cf7](https://github.com/vllm-project/vllm/commit/cf3eacfe58fa9e745c2854782ada884a9f992cf7)).
+
 ## Getting Support
 
 If you encounter any issues while deploying the MiniMax model:
````
docs/vllm_deploy_guide_cn.md
CHANGED

````diff
@@ -39,7 +39,7 @@
 ```bash
 uv venv
 source .venv/bin/activate
-uv pip install -U vllm
+uv pip install -U vllm --extra-index-url https://wheels.vllm.ai/nightly
 ```
 
 Run the following command to start the vLLM server; vLLM will automatically download and cache the MiniMax-M2.1 model from Hugging Face.
@@ -106,6 +106,10 @@ SAFETENSORS_FAST_GPU=1 vllm serve \
   --compilation-config "{\"cudagraph_mode\": \"PIECEWISE\"}"
 ```
 
+### Garbled model output
+
+If you encounter garbled output when running these models with vLLM, you can upgrade to the latest version (make sure your version is at least after commit [cf3eacfe58fa9e745c2854782ada884a9f992cf7](https://github.com/vllm-project/vllm/commit/cf3eacfe58fa9e745c2854782ada884a9f992cf7)).
+
 ## Getting Support
 
 If you encounter any issues while deploying the MiniMax model:
````