linzhao-amd committed on
Commit fdfe6e3 · verified · 1 Parent(s): 07dfa3d

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -13,7 +13,7 @@ base_model:
  - **Supported Hardware Microarchitecture:** AMD MI350/MI355
  - **ROCm**: 7.0
  - **Operating System(s):** Linux
- - **Inference Engine:** [SGLang](https://docs.sglang.ai/)
  - **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html) (V0.9)
  - **Weight quantization:** OCP MXFP4, Static
  - **Activation quantization:** OCP MXFP4, Dynamic
@@ -45,9 +45,9 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
  ```
 
  # Deployment
- ### Use with SGLang
 
- This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/) backend.
  ## Evaluation
 
  The model was evaluated on AIME24, GPQA Diamond, and MATH-500 benchmarks using the [lighteval](https://github.com/huggingface/lighteval/tree/v0.10.0) framework. Each benchmark was run 10 times with different random seeds for reliable performance estimation.
 
  - **Supported Hardware Microarchitecture:** AMD MI350/MI355
  - **ROCm**: 7.0
  - **Operating System(s):** Linux
+ - **Inference Engine:** [SGLang](https://docs.sglang.ai/)/[vLLM](https://docs.vllm.ai/en/latest/)
  - **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html) (V0.9)
  - **Weight quantization:** OCP MXFP4, Static
  - **Activation quantization:** OCP MXFP4, Dynamic
 
  ```
 
  # Deployment
 
+ This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/) and [vLLM](https://docs.vllm.ai/en/latest/) backends.
+
  ## Evaluation
 
  The model was evaluated on AIME24, GPQA Diamond, and MATH-500 benchmarks using the [lighteval](https://github.com/huggingface/lighteval/tree/v0.10.0) framework. Each benchmark was run 10 times with different random seeds for reliable performance estimation.