Update README.md
You can run this model on 8x H100 80GB using vLLM with
`vllm serve adamo1139/DeepSeek-R1-0528-AWQ --tensor-parallel 8`
If this doesn't work for you, you may need to manually specify the quantization method and data type with `--quantization awq_marlin` and `--dtype float16`, respectively.
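Putting the fallback flags together, the full serve command might look like the sketch below (a combination of the flags above; adjust the tensor-parallel degree to your GPU count):

```shell
# Serve the AWQ-quantized model across 8 GPUs, explicitly forcing
# the awq_marlin kernel and float16 compute dtype.
vllm serve adamo1139/DeepSeek-R1-0528-AWQ \
  --tensor-parallel 8 \
  --quantization awq_marlin \
  --dtype float16
```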
The script used for creating it is:
```