Update README.md

README.md CHANGED:

@@ -13,7 +13,7 @@ base_model:
 - **Supported Hardware Microarchitecture:** AMD MI350/MI355
 - **ROCm**: 7.0
 - **Operating System(s):** Linux
-- **Inference Engine:** [SGLang](https://docs.sglang.ai/)
+- **Inference Engine:** [SGLang](https://docs.sglang.ai/)/[vLLM](https://docs.vllm.ai/en/latest/)
 - **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html) (V0.9)
 - **Weight quantization:** OCP MXFP4, Static
 - **Activation quantization:** OCP MXFP4, Dynamic

@@ -45,9 +45,9 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
 ```
 
 # Deployment
-### Use with SGLang
 
-This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/)
+This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/) and [vLLM](https://docs.vllm.ai/en/latest/) backends.
+
 ## Evaluation
 
 The model was evaluated on AIME24, GPQA Diamond, and MATH-500 benchmarks using the [lighteval](https://github.com/huggingface/lighteval/tree/v0.10.0) framework. Each benchmark was run 10 times with different random seeds for reliable performance estimation.
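The deployment change above points at the two backends without showing an invocation. A hedged sketch of typical launch commands, where the model path is a placeholder and the ports/flags are illustrative assumptions rather than values from this card:

```shell
# Serve with SGLang's server entry point (placeholder model path).
python3 -m sglang.launch_server --model-path <model-id-or-local-path> --port 30000

# Or serve with vLLM's OpenAI-compatible server.
vllm serve <model-id-or-local-path> --port 8000
```

Either command exposes an OpenAI-compatible HTTP endpoint that clients can query with standard chat-completions requests.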
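The evaluation note (each benchmark run 10 times with different seeds) amounts to reporting a mean and a spread over per-seed scores. A minimal sketch of that aggregation, using made-up scores rather than actual results from this model card:

```python
import statistics

# Hypothetical per-seed scores from 10 runs of one benchmark
# (illustrative numbers only, not real results).
scores = [78.2, 79.1, 77.5, 78.8, 78.0, 79.4, 77.9, 78.6, 78.3, 78.9]

mean = statistics.mean(scores)
# Sample standard deviation divided by sqrt(n) gives the standard error of the mean.
stderr = statistics.stdev(scores) / len(scores) ** 0.5

print(f"score: {mean:.2f} +/- {stderr:.2f}")
```

Averaging over independently seeded runs narrows the standard error by a factor of sqrt(10) relative to a single run, which is why multi-seed evaluation gives a more reliable estimate.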