Update README.md
You can run this model on 8x H100 80GB using vLLM with
`vllm serve adamo1139/DeepSeek-R1-0528-AWQ --tensor-parallel 8`
If this doesn't work for you, you may need to manually specify the quantization method and data type with `--quantization awq_marlin` and `--dtype float16`, respectively.
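Putting the fallback flags together, the full serve command might look like the sketch below (a combination of the flags above; adjust the tensor-parallel degree to your GPU count):

```shell
# Serve the AWQ-quantized model across 8 GPUs, explicitly forcing
# the awq_marlin kernel and float16 compute dtype.
vllm serve adamo1139/DeepSeek-R1-0528-AWQ \
  --tensor-parallel 8 \
  --quantization awq_marlin \
  --dtype float16
```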
The script used for creating it is:
```