amd/DeepSeek-R1-0528-MXFP4-ASQ

8-bit precision

linzhao-amd committed on Oct 30, 2025

Commit bfa18ca (verified) · 1 Parent(s): e045437

Update README.md

Files changed (1)

README.md +2 -3


@@ -13,7 +13,7 @@ base_model:
 - **Supported Hardware Microarchitecture:** AMD MI350/MI355
 - **ROCm**: 7.0
 - **Operating System(s):** Linux
-- **Inference Engine:** [SGLang](https://docs.sglang.ai/)
 - **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html)
   - **Weight quantization:** OCP MXFP4, Static
   - **Activation quantization:** OCP MXFP4, Dynamic
@@ -45,9 +45,8 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
 ```
 # Deployment
-### Use with SGLang
-This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/) backend.
 ## Evaluation

 - **Supported Hardware Microarchitecture:** AMD MI350/MI355
 - **ROCm**: 7.0
 - **Operating System(s):** Linux
+- **Inference Engine:** [SGLang](https://docs.sglang.ai/)/[vLLM](https://docs.vllm.ai/en/latest/)
 - **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html)
   - **Weight quantization:** OCP MXFP4, Static
   - **Activation quantization:** OCP MXFP4, Dynamic
 ```
 # Deployment
+This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/) and [vLLM](https://docs.vllm.ai/en/latest/) backends.
 ## Evaluation
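
The Deployment line added in this commit names SGLang and vLLM as backends but gives no launch command. As a minimal sketch, the helper below assembles a server launch command for either backend. It is not taken from the README: the helper name `build_launch_command`, the default tensor-parallel size of 8, and the assumption that no further quantization or trust flags are needed for this checkpoint are all illustrative. Only the entry points (`python3 -m sglang.launch_server`, `vllm serve`) and the flags shown are standard options of those tools.

```python
import shlex

# Repository name as shown on this page.
MODEL_ID = "amd/DeepSeek-R1-0528-MXFP4-ASQ"


def build_launch_command(backend: str, model: str = MODEL_ID, tp_size: int = 8) -> list[str]:
    """Return an argv list that launches an inference server for `model`.

    `tp_size` is the tensor-parallel degree. The default of 8 is an
    assumption (one rank per GPU on an 8-GPU node), not a documented
    requirement of this model.
    """
    if tp_size < 1:
        raise ValueError("tp_size must be a positive integer")
    if backend == "sglang":
        # SGLang server entry point with model path and tensor parallelism.
        return ["python3", "-m", "sglang.launch_server",
                "--model-path", model, "--tp", str(tp_size)]
    if backend == "vllm":
        # vLLM CLI entry point; the model is a positional argument.
        return ["vllm", "serve", model,
                "--tensor-parallel-size", str(tp_size)]
    raise ValueError(f"unknown backend: {backend!r}")


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed first.
    for name in ("sglang", "vllm"):
        print(shlex.join(build_launch_command(name)))
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without ROCm or the backends installed; pass the returned list to `subprocess.run` once the flags have been checked against the installed SGLang or vLLM version.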