qqc1989 committed on
Commit
12c311a
·
verified ·
1 Parent(s): 396eb35

Update README.md

  1. README.md +33 -3
README.md CHANGED
@@ -1,3 +1,33 @@
- ---
- license: bsd-3-clause
- ---
+ ---
+ library_name: transformers
+ license: bsd-3-clause
+ ---
+
+ # DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4
+
+ This version of DeepSeek-R1-Distill-Qwen-7B has been converted to run on the Axera NPU using **w4a16** quantization.
+
+ This model has been optimized with the following LoRA:
+
+ Compatible with Pulsar2 version: 3.4 (not released yet)
+
+ ## Conversion tool links
+
+ If you are interested in model conversion, you can try exporting an axmodel from the original repo: https://huggingface.co/jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4
+
+ [Pulsar2 docs: How to Convert an LLM from Hugging Face to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html)
+
+ [AXera NPU LLM Runtime](https://github.com/AXERA-TECH/ax-llm)
+
+ ## Supported Platforms
+
+ - AX650
+   - AX650N DEMO Board
+   - [M4N-Dock (AXera-Pi Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
+   - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
+ - AX630C
+   - *in development*
+
+ |Chip|w8a16|w4a16|
+ |--|--|--|
+ |AX650|2.7 tokens/sec|5 tokens/sec|
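
To put the two AX650 figures in perspective, here is a minimal sketch that computes the relative speedup and a rough generation time. It assumes the table values are steady-state generation rates and ignores prefill and model load time. The function names and the 512-token response length are hypothetical, chosen only for illustration; they are not part of the README or of any Axera tooling.

```python
# Throughput figures copied from the README table (AX650), in tokens per second.
THROUGHPUT_TOKENS_PER_SEC = {"w8a16": 2.7, "w4a16": 5.0}


def speedup(baseline: str, candidate: str) -> float:
    """Return how many times faster `candidate` generates than `baseline`."""
    return THROUGHPUT_TOKENS_PER_SEC[candidate] / THROUGHPUT_TOKENS_PER_SEC[baseline]


def seconds_to_generate(num_tokens: int, scheme: str) -> float:
    """Estimate generation time for `num_tokens`, ignoring prefill and load time."""
    return num_tokens / THROUGHPUT_TOKENS_PER_SEC[scheme]


if __name__ == "__main__":
    print(f"w4a16 vs w8a16 speedup: {speedup('w8a16', 'w4a16'):.2f}x")
    # 512 is a hypothetical response length used only as an example.
    for scheme in THROUGHPUT_TOKENS_PER_SEC:
        print(f"{scheme}: {seconds_to_generate(512, scheme):.0f} s for 512 tokens")
```

Under these assumptions, w4a16 is a little under twice as fast as w8a16 on the AX650, which matters for a reasoning model that tends to emit long outputs.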