Update README.md

README.md CHANGED:

@@ -13,7 +13,7 @@ base_model:
 - **Supported Hardware Microarchitecture:** AMD MI350/MI355
 - **ROCm**: 7.0
 - **Operating System(s):** Linux
-- **Inference Engine:** [SGLang](https://docs.sglang.ai/)
+- **Inference Engine:** [SGLang](https://docs.sglang.ai/)/[vLLM](https://docs.vllm.ai/en/latest/)
 - **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html) (V0.9)
 - **Weight quantization:** OCP MXFP4, Static
 - **Activation quantization:** OCP MXFP4, Dynamic

@@ -45,9 +45,9 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
 ```
 
 # Deployment
-### Use with SGLang
 
-This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/)
+This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/) and [vLLM](https://docs.vllm.ai/en/latest/) backends.
+
 ## Evaluation
 
 The model was evaluated on AIME24, GPQA Diamond, and MATH-500 benchmarks using the [lighteval](https://github.com/huggingface/lighteval/tree/v0.10.0) framework. Each benchmark was run 10 times with different random seeds for reliable performance estimation.
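The deployment change above points at the two backends without showing an invocation. A hedged sketch of typical launch commands, where the model path is a placeholder and the ports/flags are illustrative assumptions rather than values from this card:

```shell
# Serve with SGLang's server entry point (placeholder model path).
python3 -m sglang.launch_server --model-path <model-id-or-local-path> --port 30000

# Or serve with vLLM's OpenAI-compatible server.
vllm serve <model-id-or-local-path> --port 8000
```

Either command exposes an OpenAI-compatible HTTP endpoint that clients can query with standard chat-completions requests.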
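The evaluation note (each benchmark run 10 times with different seeds) amounts to reporting a mean and a spread over per-seed scores. A minimal sketch of that aggregation, using made-up scores rather than actual results from this model card:

```python
import statistics

# Hypothetical per-seed scores from 10 runs of one benchmark
# (illustrative numbers only, not real results).
scores = [78.2, 79.1, 77.5, 78.8, 78.0, 79.4, 77.9, 78.6, 78.3, 78.9]

mean = statistics.mean(scores)
# Sample standard deviation divided by sqrt(n) gives the standard error of the mean.
stderr = statistics.stdev(scores) / len(scores) ** 0.5

print(f"score: {mean:.2f} +/- {stderr:.2f}")
```

Averaging over independently seeded runs narrows the standard error by a factor of sqrt(10) relative to a single run, which is why multi-seed evaluation gives a more reliable estimate.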