Update README.md (#7) by SushantGautam - opened

README.md CHANGED
````diff
@@ -111,7 +111,7 @@ If you try the vLLM examples below and get an error about `quantization` being u
 - When using vLLM as a server, pass the `--quantization awq` parameter, for example:
 
 ```shell
-python3
+python3 -m vllm.entrypoints.api_server --model TheBloke/Mistral-7B-OpenOrca-AWQ --quantization awq --dtype half
 ```
 
 When using vLLM from Python code, pass the `quantization=awq` parameter, for example:
````
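The Python-side usage mentioned in the hunk's last context line is not shown in this diff. A minimal sketch of what it would look like, assuming vLLM's `LLM` class: constructing the model downloads the weights and requires a CUDA GPU, so the sketch only assembles the keyword arguments and leaves the actual call commented out.

```python
# Sketch of the Python-side equivalent of the server command above.
# Assumes vLLM's `LLM` class; instantiating it needs a GPU and the model
# weights, so only the keyword arguments are assembled here.
llm_kwargs = {
    "model": "TheBloke/Mistral-7B-OpenOrca-AWQ",
    "quantization": "awq",  # same setting as `--quantization awq` on the CLI
    "dtype": "half",        # same setting as `--dtype half` on the CLI
}
print(llm_kwargs)

# On a machine with a CUDA GPU, the actual usage would be roughly:
# from vllm import LLM
# llm = LLM(**llm_kwargs)
# outputs = llm.generate(["Tell me about AI"])
```

The keyword names mirror the CLI flags of `vllm.entrypoints.api_server`, which is why the diff above and this sketch pass the same three values.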