Update README.md
--- a/README.md
+++ b/README.md
@@ -12,9 +12,11 @@ base_model:
 - **Output:** Text
 - **Supported Hardware Microarchitecture:** AMD MI350/MI355
 - **ROCm**: 7.0
+- **PyTorch**: 2.8.0
+- **Transformers**: 4.53.0
 - **Operating System(s):** Linux
 - **Inference Engine:** [SGLang](https://docs.sglang.ai/)/[vLLM](https://docs.vllm.ai/en/latest/)
-- **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html)
+- **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html) (V0.10)
 - **Weight quantization:** OCP MXFP4, Static
 - **Activation quantization:** OCP MXFP4, Dynamic
 - **Calibration Dataset:** [Pile](https://huggingface.co/datasets/mit-han-lab/pile-val-backup)
@@ -38,7 +40,6 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
 --quant_scheme w_mxfp4_a_mxfp4 \
 --num_calib_data 128 \
 --exclude_layers $exclude_layers \
---skip_evaluation \
 --multi_gpu \
 --quant_algo autosmoothquant \
 --model_export hf_format \
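
The PyTorch and Transformers versions added to the README can be compared against a local environment before running the quantization command. The following is a minimal sketch, not part of this change: the helper name `find_mismatches` is hypothetical, the PyPI distribution names `torch` and `transformers` are assumed, and stripping a local build suffix such as `+rocm7.0` before comparing is an assumption about how ROCm wheels may be tagged.

```python
from importlib import metadata

# Versions taken from the README lines added in this change.
EXPECTED = {"torch": "2.8.0", "transformers": "4.53.0"}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, found)} for packages that differ.

    `found` is None when the package is not installed. `get_version` is
    injectable so the check can be exercised without the real packages.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Assumption: a local build tag ("2.8.0+rocm7.0") still counts as 2.8.0.
        base = found.split("+", 1)[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: README lists {wanted}, found {found or 'not installed'}")
```

An empty result means the installed versions match the ones the README lists. The check does not cover ROCm or AMD-Quark, whose versions are not exposed the same way.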