Commit c7b9cf2 by qqc1989 · verified · 1 Parent(s): 56b096c

Update README.md

Files changed (1)
  1. README.md +74 -140
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
  - InternVL3-2B
  ---

- # InternVL3-2B-Int8

  This version of InternVL3-2B has been converted to run on the Axera NPU using **w8a16** quantization.

@@ -21,184 +21,118 @@ Compatible with Pulsar2 version: 3.4

  ## Convert tools links:

- For those who are interested in model conversion, you can try to export axmodel through the original repo:
- https://huggingface.co/deepseek-ai/InternVL3-2B

- - [Github for InternVL3-2B.axera](https://github.com/AXERA-TECH/InternVL3-2B.axera)
- - [Pulsar2 Link, How to Convert LLM from Huggingface to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html)

  ## Support Platform
  - AX650
  - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)

  |chips|Image num|image encoder 448 | ttft | w8a16 |
  |--|--|--|--|--|
- |AX650N | 0 | 0 ms | 221 ms (128 tokens) | 11.50 tokens/sec |
- |AX650N | 1 | 364 ms | 862 ms (384 tokens) | 11.50 tokens/sec |
- |AX650N | 4 | 1456 ms | 4589 ms (1152 tokens) | 11.50 tokens/sec |
- |AX650N | 8 | 2912 ms | 13904 ms (2176 tokens) | 11.50 tokens/sec |

  ## How to use

  Download all files from this repository to the device.

- **Using AX650 Board**
-
  ```bash
- root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # tree -L 1
  .
  ├── config.json
  ├── examples
  ├── infer.py
  ├── infer_video.py
  ├── internvl3_2b_axmodel
  ├── internvl3_2b_tokenizer
  ├── README.md
  └── vit_axmodel

- 4 directories, 4 files
- ```
-
- #### Inference with AX650 Host, such as M4N-Dock(爱芯派Pro) or AX650N DEMO Board
-
- **Text Generation**
-
- input text:

  ```
- Please calculate the derivative of the function [y=2x^ 2-2] and provide the reasoning process in markdown format.
- ```
-
- log information:

- ```bash
- root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # python3 infer.py --hf_model internvl3_2b_tokenizer/ --axmodel_path internvl3_2b_axmodel/ --question "Please calculate the derivative of the function [y=2x^ 2-2] and provide the reasoning process in markdown format"
- Init InferenceSession: 100%|██████████████████████████████████████████████████████████| 28/28 [00:16<00:00, 1.74it/s]
- model load done!
- prefill token_len: 85
- slice_indexs is [0]
- slice prefill done 0
- Decode: 9%|██████▎ | 232/2559 [00:19<05:14, 7.39it/s]
- Decode: 17%|████████████ | 440/2559 [00:48<04:51, 7.26it/s]hit eos!
- Decode: 17%|████████████ | 440/2559 [00:48<03:53, 9.06it/s]
- Certainly! Let's calculate the derivative of the function \( y = 2x^2 - 2 \ \ using the rules of differentiation.
-
- ### Step-by-Step Reasoning:
-
- 1. **Identify the Function:**
- The given function is \( y = 2x^2 - 2 \\).
-
- 2. **Differentiate Term by Term:**
- We will differentiate each term of the function separately.

- - **First Term: \( 2x^2 \ \**
- - The derivative of \( x^n \ \ (where n is a constant) is \( nx^{n-1} \ \).
- - Here, \( n = 2 \ \.
- - Therefore, the derivative of \( 2x^2 \ \ is \( 2 \ \ times \( 2x^{2-1} \ \ which simplifies to \( 4x \ \.

- - **Second Term: \( -2 \ \**
- - The derivative of a constant (a term without \( x \\ is 0 \.
- - Therefore, the derivative of \( -2 \ \ is \( 0 \.

- 3. **Combine the Derivatives:**
- - The derivative of the entire function is the sum of the derivatives of each term.
- - So, the derivative of \( y = 2x^2 - 2 \\ is \( 4x + 0 \\ which simplifies to \( 4x \.
-
- ### Final Answer:
- The derivative of the function \( y = 2x^2 - 2 \ is \( 4x \.
-
- ### Summary:
- The derivative of \( y = 2x^2 - 2 \ is \( 4x \.
  ```

- **Multimodal Understanding**
-
- input image
-
- ![](examples/image_1.jpg)
-
- input text:

  ```
- "Please describe this picture in detail."
  ```

- log information:

- ```bash
- root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # python3 infer.py --hf_model internvl3_2b_tokenizer/ --axmodel_path internvl3_2b_axmodel/ --question "Please describe this picture in detail" -i examples/image_1.jpg --vit_model vit_axmodel/internvl3_2b_vit_slim.axmodel
- [INFO] Available providers: ['AxEngineExecutionProvider']
- Init InferenceSession: 0%| | 0/24 [00:00<?, ?it/s][INFO] Chip type: ChipType.MC50
  [INFO] VNPU type: VNPUType.DISABLED
- [INFO] Engine version: 2.11.0a
- Init InferenceSession: 100%|██████████████████████████████████████████████████████████| 28/28 [00:14<00:00, 1.92it/s]
  model load done!
- prefill token_len: 325
- slice_indexs is [0, 1, 2]
- slice prefill done 0
- slice prefill done 1
- slice prefill done 2
- Decode: 13%|████████▋ | 326/2559 [00:00<00:01, 1829.15it/s]
- Decode: 19%|█████████████▍ | 489/2559 [00:22<02:26, 14.17it/s]hit eos!
- Decode: 20%|██████████████▍ | 517/2559 [00:26<01:43, 19.71it/s]
- **Image Description:**
-
- The image depicts a giant panda in a naturalistic enclosure, likely within a zoo or wildlife sanctuary. The panda is prominently positioned in the foreground, surrounded by lush green bamboo plants. Its distinctive black and white fur is clearly visible,
-
- with the panda's face, ears, and limbs being black, while its body and the rest of its face are white. The panda appears to be eating bamboo, with its front paws holding a piece of bamboo close to its mouth. The panda's expression is calm and curious, with its eyes looking directly at the camera.
-
- In the background, there is another panda partially obscured by the foliage and a wooden structure, possibly part of the enclosure's design. The ground is covered with a layer of mulch or wood chips, providing a naturalistic habitat for the pandas. The overall setting is serene and well-maintained,
-
- designed to mimic the panda's natural habitat while ensuring the animals' well-being.
- ```
-
- input video
-
- https://github.com/user-attachments/assets/2beffc73-d078-4c54-8282-7b7d845f39c9
-
- input text:
-
- ```
- "Please describe this video in detail."
  ```

- log information:

- ```bash
- root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # python3 infer_video.py --hf_model internvl3_2b_tokenizer/ --axmodel_path internvl3_2b_axmodel/ --vit_model vit_axmodel/internvl3_2b_vit_slim.axmodel -i examples/red-panda.mp4 -q "Please describe this video in detail."
- [INFO] Available providers: ['AxEngineExecutionProvider']
- Input frame count: 8
- preprocess image done!
- [INFO] Chip type: ChipType.MC50
- [INFO] VNPU type: VNPUType.DISABLED
- [INFO] Engine version: 2.11.0a
- vit_output.shape is (1, 256, 1536), vit feature extract done!
- Init InferenceSession: 100%|██████████████████████████████████████████████████████████| 28/28 [00:30<00:00, 1.07s/it]
- model load done!
- prefill token_len: 2159
- slice_indexs is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
- slice prefill done 0
- slice prefill done 1
- slice prefill done 2
- slice prefill done 3
- slice prefill done 4
- slice prefill done 5
- slice prefill done 6
- slice prefill done 7
- slice prefill done 8
- slice prefill done 9
- slice prefill done 10
- slice prefill done 11
- slice prefill done 12
- slice prefill done 13
- slice prefill done 14
- slice prefill done 15
- slice prefill done 16
- Decode: 88%|███████████████████████████████████████████████████████████▌ | 2240/2559 [00:11<00:02, 133.83it/s]^@hit eos!
- Decode: 90%|█████████████████████████████████████████████████████████████▍ | 2303/2559 [00:21<00:02, 108.19it/s]
- The video features two red pandas in an outdoor enclosure with green grass and a wooden structure. One panda is perched on a branch, while the other stands on the ground.
- The standing panda is holding a bamboo stick with its paws, attempting to eat it. The environment appears to be a zoo or wildlife sanctuary.
- The lighting is natural daylight. The pandas have distinctive reddish-brown fur with black faces and white markings around their eyes.
- The bamboo sticks are brown and appear to be part of the enclosure's enrichment. The panda on the ground seems to be trying to reach the bamboo,
- while the one on the branch seems to be observing or waiting for its turn. There's no visible human interaction in these frames.
- ```
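
The three removed logs above report `prefill token_len` values of 85, 325 and 2159 alongside `slice_indexs` lists of length 1, 3 and 17. That pattern is what a prefill split into 128-token slices would produce. The sketch below reproduces it; the slice size of 128 is inferred from the logs rather than stated in the README, and `prefill_slices` is a hypothetical helper, not a function from `infer.py`.

```python
import math

# Assumption: 128 tokens per prefill slice, inferred from the logged
# (token_len, slice_indexs) pairs; the README does not state this value.
SLICE_TOKENS = 128


def prefill_slices(token_len, slice_tokens=SLICE_TOKENS):
    """Return the slice indexes needed to cover token_len prompt tokens."""
    return list(range(math.ceil(token_len / slice_tokens)))


if __name__ == "__main__":
    # Token lengths taken from the text, image and video logs.
    for token_len in (85, 325, 2159):
        print(token_len, prefill_slices(token_len))
```

If the assumption holds, longer prompts (more images or video frames) cost one extra prefill pass per additional 128 tokens, which is in line with the TTFT growth shown in the benchmark table.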
 
  - InternVL3-2B
  ---

+ # InternVL3-2B

  This version of InternVL3-2B has been converted to run on the Axera NPU using **w8a16** quantization.


  ## Convert tools links:

+ For those who are interested in model conversion, you can try to export axmodel through the original repo :
+ https://huggingface.co/OpenGVLab/InternVL3-2B

+ [How to Convert LLM from Huggingface to axmodel](https://github.com/AXERA-TECH/InternVL3-2B.axera/tree/master/model_convert)
+
+ [AXera NPU HOST LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/ax-internvl)
+
+ [AXera NPU AXCL LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/axcl-internvl)

  ## Support Platform
+
  - AX650
  - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
+ - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)

  |chips|Image num|image encoder 448 | ttft | w8a16 |
  |--|--|--|--|--|
+ |AX650N | 0 | 0 ms | 221 ms (128 tokens) | 10 tokens/sec |
+ |AX650N | 1 | 364 ms | 862 ms (384 tokens) | 10 tokens/sec |
+ |AX650N | 4 | 1456 ms | 4589 ms (1152 tokens) | 10 tokens/sec |
+ |AX650N | 8 | 2912 ms | 13904 ms (2176 tokens) | 10 tokens/sec |
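
The four rows of the benchmark table follow a simple linear pattern: each 448x448 image costs about 364 ms in the image encoder and adds 256 tokens to a 128-token prefill. The sketch below only restates the table's own numbers. The function names are invented for illustration, the per-image figures are inferred from the rows, and the latency estimate assumes decoding runs at the listed 10 tokens/sec (the README does not say whether TTFT already includes the image encoder time, so it is not added).

```python
# Values read off the benchmark table; the linear pattern is an inference.
ENCODER_MS_PER_IMAGE = 364
BASE_PREFILL_TOKENS = 128
TOKENS_PER_IMAGE = 256
DECODE_TOKENS_PER_SEC = 10  # the "+" rows; the "-" rows list 11.50

# Image count -> TTFT in milliseconds, copied from the table.
TTFT_MS = {0: 221, 1: 862, 4: 4589, 8: 13904}


def encoder_ms(num_images):
    """Image encoder time for num_images images."""
    return ENCODER_MS_PER_IMAGE * num_images


def prefill_tokens(num_images):
    """Prompt length the table reports next to each TTFT value."""
    return BASE_PREFILL_TOKENS + TOKENS_PER_IMAGE * num_images


def estimated_response_ms(num_images, new_tokens):
    """Rough latency: measured TTFT plus decode time at the listed rate."""
    return TTFT_MS[num_images] + 1000.0 * new_tokens / DECODE_TOKENS_PER_SEC


if __name__ == "__main__":
    for n in TTFT_MS:
        print(n, encoder_ms(n), prefill_tokens(n), estimated_response_ms(n, 100))
```

Only the four measured image counts have a TTFT entry; the sketch deliberately does not interpolate between them, because TTFT grows faster than linearly with prompt length in the table.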

  ## How to use

  Download all files from this repository to the device.

  ```bash
+ (base) axera@raspberrypi:~/qtang/huggingface/AXERA-TECH/InternVL3-2B $ tree -L 1
  .
  ├── config.json
  ├── examples
+ ├── gradio_demo_c_api.py
+ ├── gradio_demo_python_api.py
  ├── infer.py
  ├── infer_video.py
  ├── internvl3_2b_axmodel
  ├── internvl3_2b_tokenizer
+ ├── internvl3_tokenizer.py
+ ├── llm.py
+ ├── main_api_ax650
+ ├── main_api_axcl_aarch64
+ ├── main_api_axcl_x86
+ ├── main_ax650
+ ├── main_axcl_aarch64
+ ├── main_axcl_x86
+ ├── post_config.json
  ├── README.md
+ ├── requirements.txt
+ ├── run_internvl_3_2b_448_api_ax650.sh
+ ├── run_internvl_3_2b_448_api_axcl_aarch64.sh
+ ├── run_internvl_3_2b_448_api_axcl_x86.sh
+ ├── run_internvl_3_2b_448_ax650.sh
+ ├── run_internvl_3_2b_448_axcl_aarch64.sh
+ ├── run_internvl_3_2b_448_axcl_x86.sh
  └── vit_axmodel

+ 6 directories, 22 files

  ```

+ ### python env requirement

+ #### pyaxengine

+ https://github.com/AXERA-TECH/pyaxengine

+ ```
+ wget https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc1/axengine-0.1.3-py3-none-any.whl
+ pip install axengine-0.1.3-py3-none-any.whl
  ```

+ #### others

  ```
+ pip install -r requirements.txt
  ```
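
After the two install steps, a quick import check can confirm the environment before loading any model. This is a minimal sketch: it assumes the wheel installs a module importable as `axengine` (matching the wheel name) and that the Gradio demo imports `gradio`; the contents of `requirements.txt` are not shown in this diff, so no other package names are checked.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumed import names: "axengine" for the pyaxengine wheel,
    # "gradio" for the gradio_demo_*.py scripts.
    for name in ("axengine", "gradio"):
        print(name, "OK" if is_installed(name) else "missing")
```

`find_spec` only locates the module without importing it, so the check does not touch the NPU runtime.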

+ #### Inference with Raspberry Pi 5 Host using AXCL EP(such as M.2 AI Card or HAT AI Module)

+ ```
+ cd InternVL3-2B
+ python gradio_demo_python_api.py --hf_model internvl3_2b_tokenizer/ \
+ --axmodel_path internvl3_2b_axmodel/ \
+ --vit_model vit_axmodel/internvl3_2b_vit_slim.axmodel
+
+ [INFO] Available providers: ['AXCLRTExecutionProvider']
+ Init InferenceSession: 0%| | 0/28 [00:00<?, ?it/s]
+ [INFO] Using provider: AXCLRTExecutionProvider
+ [INFO] SOC Name: AX650N
+ [INFO] VNPU type: VNPUType.DISABLED
+ [INFO] Compiler version: 3.4 162fdaa8
+ Init InferenceSession: 4%|███▍ | 1/28 [00:01<00:43, 1.61s/it]
+ [INFO] Using provider: AXCLRTExecutionProvider
+ ......
+ [INFO] VNPU type: VNPUType.DISABLED
+ [INFO] Compiler version: 3.4 162fdaa8
+ Init InferenceSession: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 28/28 [00:34<00:00, 1.23s/it]
+ [INFO] Using provider: AXCLRTExecutionProvider
+ [INFO] SOC Name: AX650N
  [INFO] VNPU type: VNPUType.DISABLED
+ [INFO] Compiler version: 3.4 162fdaa8
  model load done!
+ [INFO] Using provider: AXCLRTExecutionProvider
+ [INFO] SOC Name: AX650N
+ [INFO] VNPU type: VNPUType.DISABLED
+ [INFO] Compiler version: 3.4 162fdaa8
+ chatbot = gr.Chatbot(height=650)
+ HTTP service address: http://xxx.xxx.xxx.xxx:7860
+ * Running on local URL: http://xxx.xxx.xxx.xxx:7860
+ * To create a public link, set `share=True` in `launch()`.
  ```
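
Model loading in the log above takes around half a minute, so it can be worth checking that the three paths passed to `gradio_demo_python_api.py` exist before launching. The sketch below is a hypothetical pre-flight helper, not part of the repository; the paths are copied from the command above and are assumed to be relative to the repository directory.

```python
from pathlib import Path

# Paths taken from the demo command line (--hf_model, --axmodel_path, --vit_model).
REQUIRED_PATHS = (
    "internvl3_2b_tokenizer",
    "internvl3_2b_axmodel",
    "vit_axmodel/internvl3_2b_vit_slim.axmodel",
)


def missing_paths(root, required=REQUIRED_PATHS):
    """Return the entries of required that do not exist under root."""
    root = Path(root)
    return [p for p in required if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing before launch:", ", ".join(missing))
    else:
        print("All model paths found.")
```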

+ Access `http://xxx.xxx.xxx.xxx:7860` using Chrome or another browser.

+ ![](webgui.png)