Update README.md
README.md
@@ -134,7 +134,7 @@ See [its documentation](https://docs.vllm.ai/en/stable/getting_started/installat
 
 The following command creates an API endpoint at `http://localhost:8000/v1` with a maximum context length of 256K tokens, using tensor parallelism across 2 GPUs.
 ```shell
-vllm serve Qwen/Qwen3-Coder-Next
+vllm serve Qwen/Qwen3-Coder-Next --port 8000 --tensor-parallel-size 2 --enable-auto-tool-choice --tool-call-parser qwen3_coder
 ```
 
 > [!Note]
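Once the server started by the `vllm serve` command above is running, the endpoint speaks the OpenAI-compatible chat completions API. A minimal request might look like the following sketch; the prompt and `max_tokens` value are illustrative, not part of the original change:

```shell
# Query the OpenAI-compatible endpoint exposed by `vllm serve` above.
# Assumes the server is already up on localhost:8000.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-Coder-Next",
    "messages": [
      {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
    "max_tokens": 256
  }'
```

Because `--enable-auto-tool-choice` and `--tool-call-parser qwen3_coder` are passed at startup, requests that include a `tools` array in the same payload can also trigger parsed tool calls in the response.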