Update README.md
README.md CHANGED
@@ -12,7 +12,7 @@ base_model:
 - **Input:** Text
 - **Output:** Text
 - **Supported Hardware Microarchitecture:** AMD MI350/MI355
-- **ROCm**
+- **ROCm:** 7.0
 - **Operating System(s):** Linux
 - **Inference Engine:** [vLLM](https://docs.vllm.ai/en/latest/)
 - **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html)
@@ -77,6 +77,12 @@ The model was evaluated on GSM8K benchmarks.
 ### Reproduction
 The GSM8K results were obtained using the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness.git) framework, based on the Docker image `rocm/vllm-dev:base`, with vLLM and lm-eval compiled and installed from source inside the container.

+#### Commit Hashes
+- **vLLM:** [cbbae38f9368b6c35d9b9295bf4ceee1e6452750](https://github.com/vllm-project/vllm/commit/cbbae38f9368b6c35d9b9295bf4ceee1e6452750)
+- **Quark:** [1742e8c40f8b90c4ecb2b086788160e919986399](https://gitenterprise.xilinx.com/AMDNeuralOpt/Quark/commit/1742e8c40f8b90c4ecb2b086788160e919986399)
+- **lm-evaluation-harness:** [4b74ec1268267ea2ea83893400d7013df30507af](https://github.com/EleutherAI/lm-evaluation-harness/commit/4b74ec1268267ea2ea83893400d7013df30507af)
+- **Tip:** Remove the `hf` and `vllm` optional dependency lines from `pyproject.toml` in the lm-evaluation-harness repository to prevent installation errors.
+
 #### Launching server
 ```
 export VLLM_ATTENTION_BACKEND="TRITON_MLA"
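To make the from-source build concrete, here is a minimal shell sketch of the container setup the reproduction paragraph describes, checking out the commits pinned in the diff; the docker flags and install invocations are assumptions, not the authors' recorded commands.

```
# Start an interactive container from the ROCm vLLM dev image; the GPU
# device flags are the usual ROCm ones and may need adjusting per host.
docker run -it --network=host --device=/dev/kfd --device=/dev/dri \
  --group-add video rocm/vllm-dev:base /bin/bash

# Inside the container: build vLLM from source at the pinned commit.
git clone https://github.com/vllm-project/vllm.git && cd vllm
git checkout cbbae38f9368b6c35d9b9295bf4ceee1e6452750
pip install -e . --no-build-isolation   # assumed install invocation
cd ..

# Build lm-evaluation-harness at its pinned commit.
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
git checkout 4b74ec1268267ea2ea83893400d7013df30507af
pip install -e .
```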
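The `pyproject.toml` tip can also be scripted; this `sed` line is one illustrative way to drop the two extras before running `pip install -e .`, and the pattern is an assumption about how the `hf` and `vllm` entries appear under `[project.optional-dependencies]`.

```
# Delete the `hf` and `vllm` extras (assumed to be single lines) so that
# `pip install -e .` does not try to resolve them.
sed -i -E '/^\s*(hf|vllm)\s*=/d' pyproject.toml
```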
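The second hunk cuts off inside the `Launching server` block, so only the attention-backend export is visible. For orientation, a typical vLLM server launch under that setting might look like the sketch below; the model ID and flags are placeholders, not the values from this README.

```
export VLLM_ATTENTION_BACKEND="TRITON_MLA"

# <model-id> is a placeholder for the quantized checkpoint this card documents.
vllm serve <model-id> --tensor-parallel-size 8 --trust-remote-code
```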