BoruiXu committed · Commit 8c06768 (verified) · 1 Parent(s): 55a580f

Update README.md

Files changed (1)
  1. README.md +6 -1
README.md CHANGED
@@ -1,4 +1,9 @@
 run phi3-mini on AMD NPU
 
+1. Use awq quantization for the original model(get ```phi3_mini_awq_4bit_no_flash_attention.pt```).
+2. run ```python run_awq.py --task decode --target aie --w_bit 4```
 
-reference:https://github.com/amd/RyzenAI-SW/tree/main/example/transformers
+
+reference:https://github.com/amd/RyzenAI-SW/tree/main/example/transformers
+
+As the quantization of phi-3, may refer to https://github.com/mit-han-lab/llm-awq/pull/183
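
The two steps added to the README could be wrapped in a small launcher, sketched below. It checks that the checkpoint from step 1 exists before it runs the decode command from step 2. The checkpoint file name and the command-line flags are copied from the README. The helper names (`build_decode_command`, `checkpoint_ready`, `run_decode`) are hypothetical, and so is the assumption that the checkpoint sits in the same directory as `run_awq.py`; the actual script may look for it elsewhere.

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import Path

# File name taken from step 1 of the README. Its expected location is an
# assumption: this sketch looks for it next to run_awq.py.
CHECKPOINT = "phi3_mini_awq_4bit_no_flash_attention.pt"


def build_decode_command(python_exe: str = sys.executable) -> list[str]:
    """Return the argv for step 2, with the flags exactly as in the README."""
    return [
        python_exe, "run_awq.py",
        "--task", "decode",
        "--target", "aie",
        "--w_bit", "4",
    ]


def checkpoint_ready(workdir: Path) -> bool:
    """True if the AWQ checkpoint from step 1 is present in workdir."""
    return (workdir / CHECKPOINT).is_file()


def run_decode(workdir: Path) -> None:
    """Run step 2 in workdir, refusing to start if step 1 has not been done."""
    if not checkpoint_ready(workdir):
        raise FileNotFoundError(
            f"{CHECKPOINT} not found in {workdir}; run the AWQ quantization first"
        )
    # check=True raises CalledProcessError if run_awq.py exits with an error.
    subprocess.run(build_decode_command(), cwd=workdir, check=True)
```

Calling `run_decode(Path("."))` from the directory that holds `run_awq.py` would then launch the same command the README lists, failing early with a clear message if the quantized checkpoint is missing.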