Update README.md (#7) by SushantGautam - opened

README.md CHANGED
````diff
@@ -111,7 +111,7 @@ If you try the vLLM examples below and get an error about `quantization` being u
 - When using vLLM as a server, pass the `--quantization awq` parameter, for example:
 
 ```shell
-python3
+python3 -m vllm.entrypoints.api_server --model TheBloke/Mistral-7B-OpenOrca-AWQ --quantization awq --dtype half
 ```
 
 When using vLLM from Python code, pass the `quantization=awq` parameter, for example:
````
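The Python-side usage mentioned in the hunk's last context line is not shown in this diff. A minimal sketch of what it would look like, assuming vLLM's `LLM` class: constructing the model downloads the weights and requires a CUDA GPU, so the sketch only assembles the keyword arguments and leaves the actual call commented out.

```python
# Sketch of the Python-side equivalent of the server command above.
# Assumes vLLM's `LLM` class; instantiating it needs a GPU and the model
# weights, so only the keyword arguments are assembled here.
llm_kwargs = {
    "model": "TheBloke/Mistral-7B-OpenOrca-AWQ",
    "quantization": "awq",  # same setting as `--quantization awq` on the CLI
    "dtype": "half",        # same setting as `--dtype half` on the CLI
}
print(llm_kwargs)

# On a machine with a CUDA GPU, the actual usage would be roughly:
# from vllm import LLM
# llm = LLM(**llm_kwargs)
# outputs = llm.generate(["Tell me about AI"])
```

The keyword names mirror the CLI flags of `vllm.entrypoints.api_server`, which is why the diff above and this sketch pass the same three values.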