Update README.md
Browse files
README.md
CHANGED
|
@@ -40,13 +40,13 @@ You can either perform the dequantization manually using this [conversion script
|
|
| 40 |
```
|
| 41 |
cd Quark/examples/torch/language_modeling/llm_ptq/
|
| 42 |
export exclude_layers="*mlp.gate.* *lm_head model.layers.61.eh_proj model.layers.61.shared_head.head model.layers.61.embed_tokens"
|
| 43 |
-
python3 quantize_quark.py --model_dir /shareddata/amd/DeepSeek-R1-0528-BF16 \
|
| 44 |
--quant_scheme mxfp4 \
|
| 45 |
--layer_quant_scheme '*self_attn*' ptpc_fp8 \
|
| 46 |
--exclude_layers $exclude_layers \
|
| 47 |
--skip_evaluation \
|
| 48 |
--model_export hf_format \
|
| 49 |
-
--output_dir
|
| 50 |
--multi_gpu
|
| 51 |
```
|
| 52 |
|
|
@@ -58,7 +58,7 @@ python3 quantize_quark.py --model_dir /shareddata/amd/DeepSeek-R1-0528-BF16 \
|
|
| 58 |
</td>
|
| 59 |
<td><strong>DeepSeek-R1-0528</strong>
|
| 60 |
</td>
|
| 61 |
-
<td><strong>DeepSeek-R1-0528-</strong>
|
| 62 |
</td>
|
| 63 |
</tr>
|
| 64 |
<tr>
|
|
|
|
| 40 |
```
|
| 41 |
cd Quark/examples/torch/language_modeling/llm_ptq/
|
| 42 |
export exclude_layers="*mlp.gate.* *lm_head model.layers.61.eh_proj model.layers.61.shared_head.head model.layers.61.embed_tokens"
|
| 43 |
+
python3 quantize_quark.py --model_dir /amd/DeepSeek-R1-0528-BF16 \
|
| 44 |
--quant_scheme mxfp4 \
|
| 45 |
--layer_quant_scheme '*self_attn*' ptpc_fp8 \
|
| 46 |
--exclude_layers $exclude_layers \
|
| 47 |
--skip_evaluation \
|
| 48 |
--model_export hf_format \
|
| 49 |
+
--output_dir amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4 \
|
| 50 |
--multi_gpu
|
| 51 |
```
|
| 52 |
|
|
|
|
| 58 |
</td>
|
| 59 |
<td><strong>DeepSeek-R1-0528</strong>
|
| 60 |
</td>
|
| 61 |
+
<td><strong>DeepSeek-R1-0528-MXFP4-MTP-MoEFP4(this model)</strong>
|
| 62 |
</td>
|
| 63 |
</tr>
|
| 64 |
<tr>
|