Qwen2.5-1.5B-Instruct-python / README.md

wli1995

Update README.md

aa999d9 verified about 2 months ago

5.61 kB

	---
	license: mit
	language:
	- zh
	- en
	base_model:
	- Qwen/Qwen2.5-1.5B-Instruct-GPTQ-INT8
	- Qwen/Qwen2.5-1.5B-Instruct-GPTQ-INT4
	pipeline_tag: text-generation
	library_name: transformers
	tags:
	- Context
	- Qwen2.5-1.5B-Instruct-GPTQ-INT8
	- Qwen2.5-1.5B-Instruct-GPTQ-INT4
	---

	# Qwen2.5-1.5B-Instruct-python

	This version of Qwen2.5-1.5B-Instruct-python has been converted to run on the Axera NPU using w8a16 and w4a16 quantization.

	This model has been optimized with the following LoRA:

	Compatible with Pulsar2 version: 4.1

	## Feature

	- Support for longer contexts, in this sample it's 2.5k
	- Support context dialogue
	- System prompt kvcache is supported

	## Convert tools links:

	For those who are interested in model conversion, you can try to export axmodel through the original repo : https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8

	[Pulsar2 Link, How to Convert LLM from Huggingface to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html)

	[AXera NPU AXEngine LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/ax-context)

	[AXera NPU AXCL LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/axcl-context)

	### Convert script

	The follow show how to convert Qwen2.5-1.5B-Instruct-GPTQ-Int8

	```
	pulsar2 llm_build --input_path Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8 \
	--output_path Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8-ctx-ax650 \
	--hidden_state_type bf16 --kv_cache_len 2047 --prefill_len 128 \
	--last_kv_cache_len 128 \
	--last_kv_cache_len 256 \
	--last_kv_cache_len 384 \
	--last_kv_cache_len 512 \
	--last_kv_cache_len 640 \
	--last_kv_cache_len 768 \
	--last_kv_cache_len 896 \
	--last_kv_cache_len 1024 \
	--chip AX650 -c 1 --parallel 8
	```

	## Support Platform

	- AX650
	- AX650N DEMO Board
	- [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
	- [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
	- AX630C
	- TBD

	## How to use

	Download all files from this repository to the device

	```
	root@ax650:/mnt/qtang/llm-test/Qwen2.5-1.5B-Instruct-python# tree -L 1
	.
	├── chat.py
	├── infer.py
	├── infer_torch.py
	├── Qwen2.5-1.5B-Instruct-GPTQ-Int8
	├── Qwen2.5-1.5B-Instruct-GPTQ-Int8_axmodel
	└── README.md

	2 directories, 4 files
	```

	在 `AXERA 650N` 开发板上使用 `python api` 进行模型推理.

	在当前目录执行以下命令:

	```sh
	$ python3 chat.py
	```

	当出现 `prompt (输入 q 退出对话) >>` 提示时输入文字, 等待模型输出, 具体示例如下:

	```bash
	$ python3 chat.py
	...
	The models have been loaded!
	2025-07-21 14:23:46.137 \| DEBUG \| __main__:<module>:143 - >>> 创建 LlamaChatSession >>>
	>>> 系统提示: 你的名字叫小智(allen), 你是一个人畜无害的 AI 助手. 深圳市今天(4月1日)阴天, 愚人节, 气温在 14°C 至 19°C
	之间, 微风.
	2025-07-21 14:23:46.137 \| INFO \| __main__:chat_loop:69 - Type 'q' to exit, Ctrl+c to stop current generation

	prompt (输入 q 退出对话) >> 定义函数y=3x^3+2x+1,求解它的导数.
	answer: >> 要找到函数 $ y = 3x^3 + 2x + 1 $ 的导数，我们需要对每个项分别求导，然后将它们相加起来。

	1. 对 $ 3x^3 $ ��导，结果是 $ 3 \cdot 3x^{3-1} = 9x^2 $。
	2. 对 $ 2x $ ��导，结果是 $ 2 \cdot 1x^{1-1} = 2 $。
	3. 对常数项 $ 1 $ ��导，结果是 $ 0 $。

	将这些结果相加，我们得到：

	\[ y' = 9x^2 + 2 \]

	所以，函数 $ y = 3x^3 + 2x + 1 $ 的导数是 $ y' = 9x^2 + 2 $。

	prompt (输入 q 退出对话) >> 这个函数中自变量和因变量分别是什么?
	answer: >> 在数学中，函数通常由两个变量组成：自变量（也称为输入变量）和因变量（也称为输出变量）。自变量是函数中的一个
	量，它的值决定了因变量的值。

	在你提供的函数 $ y = 3x^3 + 2x + 1 $ 中：

	- $ x $ 是自变量。
	- $ y $ 是因变量。

	自变量 $ x $ 的值决定了因变量 $ y $ 的值。例如，如果你给 $ x $ ��值为 2，那么 $ y $ ��等于 \( 3(2)^3 + 2(2) +
	1 = 24 + 4 + 1 = 29 \)。

	因此，这个函数描述了一个关于 $ x $ 和 $ y $ 的关系，其中 $ x $ 是自变量，而 $ y $ 是因变量。通过改变 $ x $
	值，你可以计算出相应的 $ y $ ��。

	prompt (输入 q 退出对话) >> 这个函数中最高幂次和最低幂次分别是多少?
	answer: >> 在函数 $ y = 3x^3 + 2x + 1 $ 中，最高次幂（最高幂次）是 $ x^3 $，因此最高幂次是 3。

	最低次幂（最低幂次）是 $ x^0 $，因为 $ x^0 = 1 $ 对于任何 $ x $ ��成立，所以最低幂次是 0。

	因此，这个函数的最高幂次是 3，最低幂次是 0。最高幂次和最低幂次的差值是 $ 3 - 0 = 3 $。这意味着函数的图形是一个三次多
	式，它有一个顶点（如果最高幂次是偶数）或一个拐点（如果最高幂次是奇数）。在这个例子中，由于最高幂次是奇数，函数的图形
	有一个拐点。

	```

	当上下文窗口达到上限, 可以输入 `reset` 命令重置, 例如:

	```sh
	prompt (输入 q 退出对话) >> reset
	上下文已重置
	prompt (输入 q 退出对话) >> 你是谁?今天天气如何?
	answer: >> 我是小智,一名人工智能助手。今天是阴天,愚人节,气温在14°C至19°C之间,微风。
	```