uday610 committed on
Commit e89a8dc · verified · 1 Parent(s): 8c41449

Create README.md

Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ base_model:
+ - mistralai/Mistral-7B-Instruct-v0.1
+ ---
+
+
+ Quark 0.6.0
+
+ ```
+ python3 quantize_quark.py \
+ --model_dir "meta-llama/Llama-2-7b-chat-hf" \
+ --output_dir <quantized safetensor output dir> \
+ --quant_scheme w_uint4_per_group_asym \
+ --num_calib_data 128 \
+ --quant_algo awq \
+ --dataset pileval_for_awq_benchmark \
+ --seq_len 512 \
+ --model_export quark_safetensors \
+ --data_type float16 \
+ --exclude_layers [] \
+ --custom_mode awq
+ ```
+
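The `w_uint4_per_group_asym` scheme name suggests that weights are stored as unsigned 4-bit codes, with a separate scale and zero point for each group of weights. The following is a minimal sketch of that idea in plain Python, not Quark's implementation: the function names are hypothetical, the group size is an assumption (the command above does not state one), and AWQ's activation-aware rescaling is left out entirely.

```python
from typing import List, Tuple

QMIN, QMAX = 0, 15  # value range of an unsigned 4-bit code


def quantize_group(weights: List[float]) -> Tuple[List[int], float, int]:
    """Quantize one group; returns (codes, scale, zero_point)."""
    # Include 0.0 in the range so the zero point always fits in [QMIN, QMAX].
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    # Guard against an all-zero group, where the range would be zero.
    scale = max((w_max - w_min) / (QMAX - QMIN), 1e-8)
    # Asymmetric: the zero point shifts the grid so w_min lands near code 0.
    zero_point = min(max(round(-w_min / scale), QMIN), QMAX)
    codes = [min(max(round(w / scale) + zero_point, QMIN), QMAX) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Map 4-bit codes back to approximate floating-point weights."""
    return [(q - zero_point) * scale for q in codes]


def quantize_per_group(row: List[float], group_size: int):
    """Split a weight row into consecutive groups and quantize each one."""
    return [quantize_group(row[i:i + group_size])
            for i in range(0, len(row), group_size)]


# Tiny demo with an assumed group size of 4 (real schemes use larger groups).
row = [-0.8, -0.3, 0.0, 0.7, 0.02, 0.05, 0.01, 0.04]
groups = quantize_per_group(row, group_size=4)
restored = [w for g in groups for w in dequantize_group(*g)]
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(groups)
print(max_error)
```

Because each group gets its own scale, the second group of small weights is quantized on a much finer grid than the first, which is the main reason per-group schemes lose less accuracy than a single per-tensor scale.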
+ Model Builder v0.5.1
+
+ ```
+ cd onnxruntime-genai/src/python/py/models
+
+ python builder.py \
+ -i <quantized safetensor model dir> \
+ -o <oga model output dir> \
+ -p int4 \
+ -e dml
+ ```
+
+ Hybrid Package: https://gitenterprise.xilinx.com/VitisAI/hybrid-llm/actions/runs/643176
+
+ Performance
+ -
+
+ HP OmniBook Ultra Laptop 14-fd0xxx
+
+ AMD Ryzen AI 9 365 w/ Radeon 880M (Performance Mode)
+
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65ff7616871b36bf84150cda/hkHF25WNdzsZQtC0S2HH3.png)