Update README.md
README.md CHANGED
@@ -23,7 +23,7 @@ The accuracy of the model is surprisingly high, and has a decently fast inferenc
 We have tested (and thus recommend) running this model on vLLM, using the vLLM OpenAI-compatible API server with the following command:
 
 ```bash
-python -m vllm.entrypoints.openai.api_server --model lightblue/Mixtral-8x22B-v0.1
+python -m vllm.entrypoints.openai.api_server --model lightblue/Karasu-Mixtral-8x22B-v0.1 --tensor-parallel-size 4 --gpu-memory-utilization 0.95 --max-model-len 1024
 ```
 which is how we ran it on a 4 x A100 (80GB) machine.
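Once launched, the server exposes OpenAI-compatible endpoints (by default on `http://localhost:8000`). A minimal sketch of building a completions request for it — the prompt and sampling parameters here are illustrative assumptions, not part of the original README:

```python
import json

# The vLLM OpenAI-compatible server listens on http://localhost:8000 by default;
# this URL is an assumption and should match however you deployed it.
url = "http://localhost:8000/v1/completions"

payload = {
    # Must match the value passed to --model when starting the server.
    "model": "lightblue/Karasu-Mixtral-8x22B-v0.1",
    "prompt": "Write a haiku about inference speed.",  # illustrative prompt
    "max_tokens": 128,   # prompt + completion must fit within --max-model-len 1024
    "temperature": 0.7,  # illustrative sampling choice
}

# Serialize the request body; POST this with Content-Type: application/json.
body = json.dumps(payload)
print(body)
```

The same request can be sent with any HTTP client, e.g. `curl -X POST http://localhost:8000/v1/completions -H "Content-Type: application/json" -d "$body"`.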