linzhao-amd committed · commit 985c484 · verified · 1 parent: 855b652

Update README.md

Files changed (1): README.md (+2 −1)
````diff
@@ -33,11 +33,12 @@ You can either perform the dequantization manually using this [conversion script
 **Quantization scripts:**
 ```
 cd Quark/examples/torch/language_modeling/llm_ptq/
+
 python3 quantize_quark.py --model_dir $MODEL_DIR \
 --quant_scheme w_mxfp4_a_mxfp4 \
 --group_size 32 \
 --num_calib_data 128 \
---exclude_layers "*mlp.gate.*" "*lm_head" \
+--exclude_layers "*self_attn*" "*mlp.gate.*" "*lm_head" \
 --multi_gpu \
 --quant_algo autosmoothquant \
 --model_export hf_format \
````
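
The `--exclude_layers` arguments look like shell-style glob patterns matched against layer names, so the commit's addition of `"*self_attn*"` keeps all self-attention projections unquantized alongside the MoE gates and the output head. A minimal sketch of that matching, assuming `fnmatch`-style semantics and hypothetical layer names (Quark's exact matching rules may differ):

```python
from fnmatch import fnmatch

# Patterns from the updated command; layers matching any pattern are
# left in their original precision instead of being quantized.
EXCLUDE_PATTERNS = ["*self_attn*", "*mlp.gate.*", "*lm_head"]

def is_excluded(layer_name: str) -> bool:
    """Return True if the layer should be skipped during quantization."""
    return any(fnmatch(layer_name, pat) for pat in EXCLUDE_PATTERNS)

# Hypothetical layer names, for illustration only.
print(is_excluded("model.layers.0.self_attn.q_proj"))  # True
print(is_excluded("model.layers.0.mlp.down_proj"))     # False
print(is_excluded("lm_head"))                          # True
```

With the old pattern list, the first example would have returned `False`, which is exactly what this commit changes.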