To generate this weight, run the provided script in the `./inference` directory:

```shell
python3 bf16_cast_block_int8.py --input-bf16-hf-path /path/to/bf16-weights/ --output-int8-hf-path /path/to/save-int8-weight/
```

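The blockwise scheme named by the script can be sketched as below. This is a hypothetical helper for illustration, not the actual implementation in `bf16_cast_block_int8.py`; it uses float32 inputs for simplicity rather than bf16, and quantizes each 128x128 block of a weight matrix with its own scale:

```python
# Sketch of blockwise INT8 quantization: every 128x128 block gets one
# fp32 scale, and block values are rounded to int8 in [-127, 127].
import numpy as np

BLOCK = 128

def quantize_blockwise_int8(w: np.ndarray, block: int = BLOCK):
    """Return (int8 weights, per-block fp32 scales) for a 2-D float matrix."""
    rows, cols = w.shape
    n_br, n_bc = -(-rows // block), -(-cols // block)  # ceil division
    q = np.empty_like(w, dtype=np.int8)
    scales = np.empty((n_br, n_bc), dtype=np.float32)
    for i in range(n_br):
        for j in range(n_bc):
            blk = w[i * block:(i + 1) * block, j * block:(j + 1) * block]
            # Scale so the block's largest magnitude maps to 127.
            scale = max(float(np.abs(blk).max()) / 127.0, 1e-12)
            scales[i, j] = scale
            q[i * block:(i + 1) * block, j * block:(j + 1) * block] = \
                np.clip(np.round(blk / scale), -127, 127).astype(np.int8)
    return q, scales

def dequantize_blockwise_int8(q: np.ndarray, scales: np.ndarray,
                              block: int = BLOCK) -> np.ndarray:
    """Inverse mapping, useful for checking the round-trip error."""
    w = q.astype(np.float32)
    for i in range(scales.shape[0]):
        for j in range(scales.shape[1]):
            w[i * block:(i + 1) * block, j * block:(j + 1) * block] *= scales[i, j]
    return w
```

Because each block carries its own scale, one outlier weight only coarsens quantization inside its 128x128 block instead of the whole tensor; the round-trip error per element is bounded by half a quantization step of that block's scale.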
## 3. Troubleshooting

Before running inference, confirm that the `quantization_config` field of `config.json` in `/path/to/save-int8-weight/` is:

```json
"quantization_config": {
  "activation_scheme": "dynamic",
  "quant_method": "blockwise_int8",
  "weight_block_size": [
    128,
    128
  ]
}
```

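A quick sanity check can automate this. The following script is a hypothetical helper, not part of the repo; it loads the saved `config.json` and compares the fields above:

```python
# Hypothetical check that the converted checkpoint's config.json
# carries the expected blockwise-int8 quantization settings.
import json

EXPECTED = {
    "activation_scheme": "dynamic",
    "quant_method": "blockwise_int8",
    "weight_block_size": [128, 128],
}

def check_quantization_config(config_path: str) -> None:
    """Raise ValueError if config.json lacks the expected quantization_config."""
    with open(config_path) as f:
        config = json.load(f)
    qc = config.get("quantization_config")
    if qc is None:
        raise ValueError("config.json has no 'quantization_config' entry")
    for key, expected in EXPECTED.items():
        if qc.get(key) != expected:
            raise ValueError(f"{key}: expected {expected!r}, got {qc.get(key)!r}")
```

Run it as `check_quantization_config("/path/to/save-int8-weight/config.json")` before launching the inference server.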
---
# DeepSeek-R1