amd/DeepSeek-R1-0528-MXFP4-ASQ

8-bit precision

linzhao-amd committed on Nov 6, 2025

Commit 1dbc9c0 · verified · 1 Parent(s): 4e94995

Update README.md

Files changed (1)

README.md +3 -2

@@ -12,9 +12,11 @@ base_model:
   - **Output:** Text
 - **Supported Hardware Microarchitecture:** AMD MI350/MI355
 - **ROCm**: 7.0
+- **PyTorch**: 2.8.0
+- **Transformers**: 4.53.0
 - **Operating System(s):** Linux
 - **Inference Engine:** [SGLang](https://docs.sglang.ai/)/[vLLM](https://docs.vllm.ai/en/latest/)
-- **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html)
+- **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html) (V0.10)
   - **Weight quantization:** OCP MXFP4, Static
   - **Activation quantization:** OCP MXFP4, Dynamic
 - **Calibration Dataset:** [Pile](https://huggingface.co/datasets/mit-han-lab/pile-val-backup)
@@ -38,7 +40,6 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
                           --quant_scheme w_mxfp4_a_mxfp4 \
                           --num_calib_data 128 \
                           --exclude_layers $exclude_layers \
-                          --skip_evaluation \
                           --multi_gpu \
                           --quant_algo autosmoothquant \
                           --model_export hf_format \
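
Since this change pins PyTorch 2.8.0 and Transformers 4.53.0 in the README, a quick environment check before running the quantization script may be useful. The following is a minimal sketch, not part of the repository: the helper names `base_version` and `check_versions` are hypothetical, and the distribution names `torch` and `transformers` are assumed to be the usual PyPI names. Stripping the local build suffix (for example `+rocm7.0`) is also an assumption about how ROCm wheels are labeled.

```python
from importlib import metadata

# Versions listed in the updated README. The keys are assumed to be the
# PyPI distribution names for PyTorch and Transformers.
EXPECTED = {"torch": "2.8.0", "transformers": "4.53.0"}


def base_version(version: str) -> str:
    """Drop a local build suffix such as '+rocm7.0' from a version string."""
    return version.split("+", 1)[0]


def check_versions(expected: dict[str, str]) -> dict[str, str]:
    """Return a status per package: 'ok', 'missing', or a mismatch note."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if base_version(found) == wanted:
            report[name] = "ok"
        else:
            report[name] = f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_versions(EXPECTED).items():
        print(f"{name}: {status}")
```

The check compares exact versions only; whether nearby patch releases also work is not stated in the README.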