AXERA-TECH/DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4

qqc1989 committed on Feb 13, 2025

Commit 12c311a (verified) · 1 parent: 396eb35

Update README.md

Files changed (1)

README.md +33 -3

@@ -1,3 +1,33 @@
----
-license: bsd-3-clause
----

+---
+library_name: transformers
+license: bsd-3-clause
+---
+# DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4
+This version of DeepSeek-R1-Distill-Qwen-7B has been converted to run on the Axera NPU using **w4a16** quantization.
+This model has been optimized with the following LoRA:
+Compatible with Pulsar2 version: 3.4 (not released yet)
+## Conversion tool links
+If you are interested in model conversion, you can try exporting the axmodel from the original repo: https://huggingface.co/jakiAJK/DeepSeek-R1-Distill-Qwen-7B_GPTQ-int4
+[Pulsar2: How to convert an LLM from Hugging Face to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html)
+[AXera NPU LLM Runtime](https://github.com/AXERA-TECH/ax-llm)
+## Supported Platforms
+- AX650
+  - AX650N DEMO Board
+  - [M4N-Dock (AXera-Pi Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
+  - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
+- AX630C
+  - *in development*
+
+|Chip|w8a16|w4a16|
+|--|--|--|
+|AX650|2.7 tokens/sec|5 tokens/sec|
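
To put the AX650 throughput figures from the README table in perspective, here is a minimal Python sketch that converts them into an estimated generation time and a relative speedup. It assumes the table's numbers describe a steady generation rate; the function names and the 512-token reply length are hypothetical and are not part of the commit.

```python
# Throughput on AX650 in tokens/sec, copied from the README table.
THROUGHPUT_TOKENS_PER_SEC = {"w8a16": 2.7, "w4a16": 5.0}


def seconds_to_generate(num_tokens: int, tokens_per_sec: float) -> float:
    """Estimate the time to emit num_tokens at a constant rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


def speedup(baseline: str, candidate: str) -> float:
    """Ratio of candidate throughput to baseline throughput."""
    rates = THROUGHPUT_TOKENS_PER_SEC
    return rates[candidate] / rates[baseline]


if __name__ == "__main__":
    reply_tokens = 512  # hypothetical reply length, for illustration only
    for scheme, rate in THROUGHPUT_TOKENS_PER_SEC.items():
        eta = seconds_to_generate(reply_tokens, rate)
        print(f"{scheme}: about {eta:.1f} s for {reply_tokens} tokens")
    print(f"w4a16 vs w8a16: {speedup('w8a16', 'w4a16'):.2f}x")
```

Under these assumptions, w4a16 is roughly 1.85 times as fast as w8a16 on the AX650. Reasoning models such as this one tend to produce long outputs, so the difference adds up over a full reply.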