amd/Mistral-7B-Instruct-v0.1-awq-asym-uint4-g128-lmhead-onnx-hybrid


uday610 committed on Feb 16

Commit e89a8dc · verified · 1 Parent(s): 8c41449

Create README.md

Files changed (1)

README.md +46 -0

README.md ADDED

	@@ -0,0 +1,46 @@

+---
+base_model:
+- mistralai/Mistral-7B-Instruct-v0.1
+---
+Quark 0.6.0
+```
+python3 quantize_quark.py \
+   --model_dir "mistralai/Mistral-7B-Instruct-v0.1" \
+   --output_dir <quantized safetensor output dir> \
+   --quant_scheme w_uint4_per_group_asym \
+   --num_calib_data 128 \
+   --quant_algo awq \
+   --dataset pileval_for_awq_benchmark \
+   --seq_len 512 \
+   --model_export quark_safetensors \
+   --data_type float16 \
+   --exclude_layers [] \
+   --custom_mode awq
+```
+Model Builder v0.5.1
+```
+cd onnxruntime-genai/src/python/py/models
+python builder.py \
+   -i <quantized safetensor model dir> \
+   -o <oga model output dir> \
+   -p int4 \
+   -e dml
+```
+Hybrid Package: https://gitenterprise.xilinx.com/VitisAI/hybrid-llm/actions/runs/643176
+Performance
+-
+HP OmniBook Ultra Laptop 14-fd0xxx
+AMD Ryzen AI 9 365 w/ Radeon 880M (Performance Mode)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/65ff7616871b36bf84150cda/hkHF25WNdzsZQtC0S2HH3.png)
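
Note on the quantization scheme: the `w_uint4_per_group_asym` flag in the Quark command stores weights as unsigned 4-bit codes, with a separate scale and zero point for each group of weights. The sketch below illustrates that storage format only. It is not Quark's implementation and it omits the AWQ scaling search entirely. The group size of 128 is an assumption read from the `g128` part of the repository name, and the function names are hypothetical.

```python
def quantize_group(weights):
    """Asymmetric uint4 quantization of one group of floats.

    Returns (codes, scale, zero_point), with every code in 0..15.
    """
    lo, hi = min(weights), max(weights)
    # 4 bits give 16 levels (0..15); fall back to 1.0 for a constant group.
    scale = (hi - lo) / 15 or 1.0
    # The zero point is the integer code that represents the real value 0.0.
    zero_point = max(0, min(15, round(-lo / scale)))
    codes = [max(0, min(15, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def quantize_row(row, group_size=128):
    """Quantize a weight row group by group (group_size=128 is assumed)."""
    groups = [row[i:i + group_size] for i in range(0, len(row), group_size)]
    return [quantize_group(g) for g in groups]


def dequantize_row(quantized):
    """Rebuild approximate float weights from (codes, scale, zero_point)."""
    out = []
    for codes, scale, zero_point in quantized:
        out.extend((q - zero_point) * scale for q in codes)
    return out
```

Because each group carries its own scale and zero point, an outlier weight only coarsens the resolution of the 128 values in its own group rather than the whole row.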