To generate this weight, run the provided script in the `./inference` directory:

```shell
python3 bf16_cast_block_int8.py --input-bf16-hf-path /path/to/bf16-weights/ --output-int8-hf-path /path/to/save-int8-weight/
```

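The blockwise scheme named by the script can be sketched as below. This is a hypothetical helper for illustration, not the actual implementation in `bf16_cast_block_int8.py`; it uses float32 inputs for simplicity rather than bf16, and quantizes each 128x128 block of a weight matrix with its own scale:

```python
# Sketch of blockwise INT8 quantization: every 128x128 block gets one
# fp32 scale, and block values are rounded to int8 in [-127, 127].
import numpy as np

BLOCK = 128

def quantize_blockwise_int8(w: np.ndarray, block: int = BLOCK):
    """Return (int8 weights, per-block fp32 scales) for a 2-D float matrix."""
    rows, cols = w.shape
    n_br, n_bc = -(-rows // block), -(-cols // block)  # ceil division
    q = np.empty_like(w, dtype=np.int8)
    scales = np.empty((n_br, n_bc), dtype=np.float32)
    for i in range(n_br):
        for j in range(n_bc):
            blk = w[i * block:(i + 1) * block, j * block:(j + 1) * block]
            # Scale so the block's largest magnitude maps to 127.
            scale = max(float(np.abs(blk).max()) / 127.0, 1e-12)
            scales[i, j] = scale
            q[i * block:(i + 1) * block, j * block:(j + 1) * block] = \
                np.clip(np.round(blk / scale), -127, 127).astype(np.int8)
    return q, scales

def dequantize_blockwise_int8(q: np.ndarray, scales: np.ndarray,
                              block: int = BLOCK) -> np.ndarray:
    """Inverse mapping, useful for checking the round-trip error."""
    w = q.astype(np.float32)
    for i in range(scales.shape[0]):
        for j in range(scales.shape[1]):
            w[i * block:(i + 1) * block, j * block:(j + 1) * block] *= scales[i, j]
    return w
```

Because each block carries its own scale, one outlier weight only coarsens quantization inside its 128x128 block instead of the whole tensor; the round-trip error per element is bounded by half a quantization step of that block's scale.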
## 3. Troubleshooting

Before running inference, confirm that the `quantization_config` field of `config.json` in `/path/to/save-int8-weight/` is:

```json
"quantization_config": {
  "activation_scheme": "dynamic",
  "quant_method": "blockwise_int8",
  "weight_block_size": [
    128,
    128
  ]
}
```

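A quick sanity check can automate this. The following script is a hypothetical helper, not part of the repo; it loads the saved `config.json` and compares the fields above:

```python
# Hypothetical check that the converted checkpoint's config.json
# carries the expected blockwise-int8 quantization settings.
import json

EXPECTED = {
    "activation_scheme": "dynamic",
    "quant_method": "blockwise_int8",
    "weight_block_size": [128, 128],
}

def check_quantization_config(config_path: str) -> None:
    """Raise ValueError if config.json lacks the expected quantization_config."""
    with open(config_path) as f:
        config = json.load(f)
    qc = config.get("quantization_config")
    if qc is None:
        raise ValueError("config.json has no 'quantization_config' entry")
    for key, expected in EXPECTED.items():
        if qc.get(key) != expected:
            raise ValueError(f"{key}: expected {expected!r}, got {qc.get(key)!r}")
```

Run it as `check_quantization_config("/path/to/save-int8-weight/config.json")` before launching the inference server.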
---
# DeepSeek-R1