---
library_name: transformers
license: bsd-3-clause
base_model:
- OpenGVLab/InternVL3_5-2B
tags:
- InternVL3
- InternVL3_5-2B
- Int8
- VLM
pipeline_tag: image-text-to-text
language:
- en
---

# InternVL3_5-2B

This version of InternVL3_5-2B has been converted to run on the Axera NPU using **w8a16** quantization.

This model has been optimized with the following LoRA: 

Compatible with Pulsar2 version: 5.1-patch1.

Note that the model's context length is 2k tokens and the maximum prefill length is 1k tokens.
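
The two limits above can be checked before a prompt is sent to the model. The following is a minimal sketch, assuming that "2k" means 2048 tokens and "1k" means 1024 tokens; the constants and both helper functions are hypothetical and not part of this repository.

```python
# Assumed limits: "2k" context = 2048 tokens, "1k" prefill = 1024 tokens.
CONTEXT_LEN = 2048
MAX_PREFILL_LEN = 1024


def fits_budget(prompt_tokens: int, max_new_tokens: int) -> bool:
    """True if the prompt fits in prefill and prompt + reply fit in the context."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        prompt_tokens <= MAX_PREFILL_LEN
        and prompt_tokens + max_new_tokens <= CONTEXT_LEN
    )


def max_new_tokens_for(prompt_tokens: int) -> int:
    """Largest reply length that still fits, or 0 if the prompt is too long."""
    if prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens > MAX_PREFILL_LEN:
        return 0
    return CONTEXT_LEN - prompt_tokens
```

For example, a 300-token prompt leaves room for up to 1748 generated tokens under these assumptions. Image inputs also consume prompt tokens, so the token count should be taken after the image placeholders have been expanded.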

## Conversion tool links

If you are interested in model conversion, you can export the axmodel yourself from the original repository:

https://huggingface.co/OpenGVLab/InternVL3_5-2B

[How to Convert LLM from Huggingface to axmodel](https://github.com/AXERA-TECH/InternVL3_5-2B.axera/tree/main/model_convert)

[AXera NPU HOST LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/ax-internvl) 

[AXera NPU AXCL LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/axcl-internvl)

## Supported Platforms

- AX650
  - AX650N DEMO Board
  - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
  - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
 
|Chip|Image encoder (448×448)|TTFT|Decode speed (w8a16)|
|--|--|--|--|
|AX650| 364.412 ms | 5844 ms | 9.52 tokens/sec|
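
The figures in the table can be combined into a rough end-to-end latency estimate. This is only a back-of-the-envelope sketch: it assumes that the TTFT figure does not already include the image encoder time, that TTFT stays constant (in practice it depends on prompt length), and that every token after the first is produced at the listed decode speed. The function name is hypothetical.

```python
# Benchmark figures copied from the AX650 row of the table.
IMAGE_ENCODER_MS = 364.412
TTFT_MS = 5844.0
DECODE_TOKENS_PER_SEC = 9.52


def estimate_latency_s(new_tokens: int, with_image: bool = True) -> float:
    """Rough time in seconds to produce `new_tokens` tokens.

    Assumption: image encoding and TTFT add up, and tokens after the
    first arrive at the steady decode rate.
    """
    if new_tokens < 1:
        raise ValueError("new_tokens must be at least 1")
    startup_ms = TTFT_MS + (IMAGE_ENCODER_MS if with_image else 0.0)
    decode_s = (new_tokens - 1) / DECODE_TOKENS_PER_SEC
    return startup_ms / 1000.0 + decode_s


if __name__ == "__main__":
    for n in (1, 64, 256):
        print(f"{n:4d} tokens: ~{estimate_latency_s(n):.1f} s")
```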


## How to use

Download all files from this repository to the device. The top-level layout should look like this:

```
$ tree -L 1
.
├── assets
├── config.json
├── examples
├── gradio_demo.py
├── infer_axmodel.py
├── infer_torch.py
├── internvl3-5_axmodel
├── internvl3-5_tokenizer
├── README.md
├── utils
└── vit-models

6 directories, 5 files
```
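
Before running the demos, it can help to confirm that the download is complete. The sketch below checks only the top-level entries shown in the `tree` output above; the `missing_entries` helper is hypothetical and does not validate the contents of each directory.

```python
from pathlib import Path

# Top-level entries taken from the `tree -L 1` listing.
EXPECTED_DIRS = [
    "assets",
    "examples",
    "internvl3-5_axmodel",
    "internvl3-5_tokenizer",
    "utils",
    "vit-models",
]
EXPECTED_FILES = [
    "config.json",
    "gradio_demo.py",
    "infer_axmodel.py",
    "infer_torch.py",
    "README.md",
]


def missing_entries(root: str = ".") -> list[str]:
    """Return the expected directories and files that are absent under `root`."""
    base = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (base / f).is_file()]
    return missing


if __name__ == "__main__":
    absent = missing_entries()
    print("Layout looks complete." if not absent else f"Missing: {absent}")
```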

#### Install transformers

```
pip install transformers==4.57.1
```
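
To confirm that the pinned version is the one actually installed, a small check such as the following can be run. It uses only the standard library; the `transformers_status` helper is hypothetical, and whether other versions also work is not stated in this README.

```python
from importlib.metadata import PackageNotFoundError, version

PINNED_VERSION = "4.57.1"  # version pinned by the install command above


def transformers_status(pinned: str = PINNED_VERSION) -> str:
    """Return 'ok', 'missing', or 'mismatch (<installed version>)'."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return "missing"
    return "ok" if installed == pinned else f"mismatch ({installed})"


if __name__ == "__main__":
    print(f"transformers: {transformers_status()}")
```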

#### Inference on an AX650 host, such as the M4N-Dock(爱芯派Pro) or the AX650 DEMO Board

Start an interactive conversation with the Gradio demo:

```bash
$ python3 gradio_demo.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --vit_model vit-models/internvl_vit_model_1x3x448x448.axmodel
```

Plain text dialogue:

![demo_1](assets/demo_1.png)

Image understanding:

![demo_2](assets/demo_2.png)

---

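Run the following command on the Axera board to start a chat conversation. The question asks the model to compute the derivative of the function y=2x^2+2 and to give its reasoning in markdown format: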

```sh
$ python3 infer_axmodel.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --question "请计算函数[y=2x^2+2]的导数, 并提供 markdown 格式的推理过程"
```

Output:

```bash
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.1-dirty 0fdbfe15-dirty
Model loaded successfully!
slice_indices: [0]
Slice prefill done: 0
answer >> 函数 \( y = 2x^2 + 2 \) 的导数可以通过求导法则来计算。首先,我们对函数中的每一项分别求导:

1. 对于 \( 2x^2 \),使用幂法则求导:
   \[
   \frac{d}{dx}(2x^2) = 2 \cdot 2x = 4x
   \]

2. 对于常数项 \( 2 \),其导数为 0,因为常数的导数为 0。

将这两部分的结果相加,得到函数 \( y \) 的导数:
\[
y' = 4x
\]

因此,函数 \( y = 2x^2 + 2 \) 的导数为 \( y' = 4x \)。
```

Run the following command to perform a single-image understanding task. The question asks the model to describe the image:

```sh
$ python3 infer_axmodel.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --question "请描述这幅图" -i examples/image_0.jpg --vit_model vit-models/internvl_vit_model_1x3x448x448.axmodel
```

![image_0.jpg](examples/image_0.jpg)

Output:

```bash
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.1-dirty 0fdbfe15-dirty
Model loaded successfully!
slice_indices: [0, 1, 2]
Slice prefill done: 0
Slice prefill done: 1
Slice prefill done: 2
answer >> 这是一张红熊猫的照片。红熊猫是一种红棕色的哺乳动物,通常生活在亚洲的森林中。它们以捕食昆虫和小型无脊椎动物为生。图片中,红熊猫正坐在一个木制的平台上,背景是绿色的树木和植被,显得非常自然和生动。红熊猫的表情看起来很友好,似乎在观察或等待什么。
```