docs: update README.md
#1 by quocbao747 (opened)

README.md CHANGED
@@ -136,7 +136,7 @@ pip install 'vllm>=0.15.0'
 ```
 See [its documentation](https://docs.vllm.ai/en/stable/getting_started/installation/index.html) for more details.

-The following command can be used to create an API endpoint at `http://localhost:8000/v1` with maximum context length 256K tokens using tensor parallel on
+The following command can be used to create an API endpoint at `http://localhost:8000/v1` with maximum context length 256K tokens using tensor parallel on 2 GPUs.
 ```shell
 vllm serve Qwen/Qwen3-Coder-Next-FP8 --port 8000 --tensor-parallel-size 2 --enable-auto-tool-choice --tool-call-parser qwen3_coder
 ```
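Once the server is running, the endpoint at `http://localhost:8000/v1` speaks the OpenAI-compatible chat-completions protocol. A minimal sketch of a request body for it — the model name is taken from the `vllm serve` command above; the message content and `max_tokens` value are illustrative:

```python
import json

# Hypothetical chat-completions request for the endpoint created above.
# Only the model name comes from the serve command; other fields are examples.
payload = {
    "model": "Qwen/Qwen3-Coder-Next-FP8",
    "messages": [
        {"role": "user", "content": "Write a hello-world in Python."},
    ],
    "max_tokens": 256,
}

# Serialize to the JSON body you would POST to /v1/chat/completions.
body = json.dumps(payload)
print(body)
```

The resulting JSON could be sent with any HTTP client, e.g. `curl -X POST http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d "$BODY"`.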