Update README.md

# InternVL3-2B

This version of InternVL3-2B has been converted to run on the Axera NPU using **w8a16** quantization.
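The w8a16 scheme stores each weight as a signed 8-bit integer plus a floating-point scale, while activations stay in 16-bit float and weights are dequantized on the fly during the matmul. A minimal pure-Python sketch of the idea (an illustration only, not the Pulsar2 implementation; the symmetric per-tensor scheme below is an assumption):

```python
def quantize_w8(weights):
    """Symmetric per-tensor int8 quantization: returns (int8 values, scale)."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def w8a16_dot(q_weights, scale, activations):
    """Dot product with int8 weights dequantized against float activations."""
    return sum((qw * scale) * a for qw, a in zip(q_weights, activations))

weights = [0.5, -1.27, 0.03, 0.9]
acts = [1.0, 2.0, 3.0, 4.0]
q, s = quantize_w8(weights)
exact = sum(w * a for w, a in zip(weights, acts))
approx = w8a16_dot(q, s, acts)
print(round(exact, 3), round(approx, 3))  # the two values agree closely
```

Keeping activations in 16-bit avoids the calibration cost of activation quantization while still shrinking the weight footprint by roughly 2x versus fp16.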

Compatible with Pulsar2 version: 3.4

## Convert tools links:

For those who are interested in model conversion, you can try to export the axmodel through the original repo:

https://huggingface.co/OpenGVLab/InternVL3-2B

[How to Convert LLM from Huggingface to axmodel](https://github.com/AXERA-TECH/InternVL3-2B.axera/tree/master/model_convert)

[AXera NPU HOST LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/ax-internvl)

[AXera NPU AXCL LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/axcl-internvl)


## Support Platform

- AX650
  - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
  - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)

|chips|Image num|image encoder 448|ttft|w8a16|
|--|--|--|--|--|
|AX650N|0|0 ms|221 ms (128 tokens)|10 tokens/sec|
|AX650N|1|364 ms|862 ms (384 tokens)|10 tokens/sec|
|AX650N|4|1456 ms|4589 ms (1152 tokens)|10 tokens/sec|
|AX650N|8|2912 ms|13904 ms (2176 tokens)|10 tokens/sec|
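The token counts in the ttft column are consistent with a fixed text prompt of 128 tokens plus 256 vision tokens per 448x448 image. A small sketch checking that arithmetic (the 128/256 split is inferred from the table, not stated in vendor documentation):

```python
TEXT_TOKENS = 128        # assumed base prompt length (inferred from the 0-image row)
TOKENS_PER_IMAGE = 256   # assumed vision tokens per 448x448 image

def prefill_tokens(num_images):
    """Estimate total prefill length for a prompt with N images."""
    return TEXT_TOKENS + TOKENS_PER_IMAGE * num_images

# Matches every row of the benchmark table above.
for n, expected in [(0, 128), (1, 384), (4, 1152), (8, 2176)]:
    assert prefill_tokens(n) == expected
print("token counts match the table")
```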

## How to use

Download all files from this repository to the device.

```bash
(base) axera@raspberrypi:~/qtang/huggingface/AXERA-TECH/InternVL3-2B $ tree -L 1
.
├── config.json
├── examples
├── gradio_demo_c_api.py
├── gradio_demo_python_api.py
├── infer.py
├── infer_video.py
├── internvl3_2b_axmodel
├── internvl3_2b_tokenizer
├── internvl3_tokenizer.py
├── llm.py
├── main_api_ax650
├── main_api_axcl_aarch64
├── main_api_axcl_x86
├── main_ax650
├── main_axcl_aarch64
├── main_axcl_x86
├── post_config.json
├── README.md
├── requirements.txt
├── run_internvl_3_2b_448_api_ax650.sh
├── run_internvl_3_2b_448_api_axcl_aarch64.sh
├── run_internvl_3_2b_448_api_axcl_x86.sh
├── run_internvl_3_2b_448_ax650.sh
├── run_internvl_3_2b_448_axcl_aarch64.sh
├── run_internvl_3_2b_448_axcl_x86.sh
└── vit_axmodel

6 directories, 22 files
```

#### Inference with AX650 Host, such as M4N-Dock(爱芯派Pro) or AX650N DEMO Board

**Text Generation**

input text:

```
Please calculate the derivative of the function [y=2x^2-2] and provide the reasoning process in markdown format.
```

log information:

```
root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # python3 infer.py --hf_model internvl3_2b_tokenizer/ --axmodel_path internvl3_2b_axmodel/ --question "Please calculate the derivative of the function [y=2x^2-2] and provide the reasoning process in markdown format"
Init InferenceSession: 100%|██████████| 28/28 [00:16<00:00,  1.74it/s]
model load done!
prefill token_len: 85
slice_indexs is [0]
slice prefill done 0
Decode:   9%|█         | 232/2559 [00:19<05:14,  7.39it/s]
Decode:  17%|█▋        | 440/2559 [00:48<04:51,  7.26it/s]hit eos!
Decode:  17%|█▋        | 440/2559 [00:48<03:53,  9.06it/s]
Certainly! Let's calculate the derivative of the function \( y = 2x^2 - 2 \) using the rules of differentiation.

### Step-by-Step Reasoning:

1. **Identify the Function:**
   The given function is \( y = 2x^2 - 2 \).

2. **Differentiate Term by Term:**
   We will differentiate each term of the function separately.

   - The derivative of \( x^n \) (where n is a constant) is \( nx^{n-1} \).
   - Here, \( n = 2 \).
   - Therefore, the derivative of \( 2x^2 \) is \( 2 \times 2x^{2-1} \), which simplifies to \( 4x \).

   - The derivative of a constant (a term without \( x \)) is 0.
   - Therefore, the derivative of \( -2 \) is \( 0 \).

### Final Answer:
The derivative of the function \( y = 2x^2 - 2 \) is \( 4x \).

### Summary:
The derivative of \( y = 2x^2 - 2 \) is \( 4x \).
```
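The model's answer can be sanity-checked numerically. A short sketch comparing a central-difference estimate of dy/dx against the claimed derivative 4x:

```python
def y(x):
    """The function from the prompt: y = 2x^2 - 2."""
    return 2 * x**2 - 2

def numeric_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# The estimate should track 4x at every test point.
for x in [-3.0, 0.0, 1.5, 10.0]:
    assert abs(numeric_derivative(y, x) - 4 * x) < 1e-4
print("dy/dx = 4x confirmed numerically")
```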

input image:



input text:

```
```

```
[INFO] VNPU type: VNPUType.DISABLED
[INFO]
Init InferenceSession: 100%|██████████| 28/28 [00:14<00:00,  1.92it/s]
model load done!

**Image Description:**

The image depicts a giant panda in a naturalistic enclosure, likely within a zoo or wildlife sanctuary. The panda is prominently positioned in the foreground, surrounded by lush green bamboo plants. Its distinctive black and white fur is clearly visible, with the panda's face, ears, and limbs being black, while its body and the rest of its face are white. The panda appears to be eating bamboo, with its front paws holding a piece of bamboo close to its mouth. The panda's expression is calm and curious, with its eyes looking directly at the camera.

In the background, there is another panda partially obscured by the foliage and a wooden structure, possibly part of the enclosure's design. The ground is covered with a layer of mulch or wood chips, providing a naturalistic habitat for the pandas. The overall setting is serene and well-maintained, designed to mimic the panda's natural habitat while ensuring the animals' well-being.
```

input video:

https://github.com/user-attachments/assets/2beffc73-d078-4c54-8282-7b7d845f39c9

input text:

```
"Please describe this video in detail."
```

log information:

```
root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # python3 infer_video.py --hf_model internvl3_2b_tokenizer/ --axmodel_path internvl3_2b_axmodel/ --vit_model vit_axmodel/internvl3_2b_vit_slim.axmodel -i examples/red-panda.mp4 -q "Please describe this video in detail."
[INFO] Available providers: ['AxEngineExecutionProvider']
输入帧数 (input frame count): 8
preprocess image done!
[INFO] Chip type: ChipType.MC50
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Engine version: 2.11.0a
vit_output.shape is (1, 256, 1536), vit feature extract done!
Init InferenceSession: 100%|██████████| 28/28 [00:30<00:00,  1.07s/it]
model load done!
prefill token_len: 2159
slice_indexs is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
slice prefill done 0
slice prefill done 1
slice prefill done 2
slice prefill done 3
slice prefill done 4
slice prefill done 5
slice prefill done 6
slice prefill done 7
slice prefill done 8
slice prefill done 9
slice prefill done 10
slice prefill done 11
slice prefill done 12
slice prefill done 13
slice prefill done 14
slice prefill done 15
slice prefill done 16
Decode:  88%|████████▊ | 2240/2559 [00:11<00:02, 133.83it/s]hit eos!
Decode:  90%|█████████ | 2303/2559 [00:21<00:02, 108.19it/s]
The video features two red pandas in an outdoor enclosure with green grass and a wooden structure. One panda is perched on a branch, while the other stands on the ground. The standing panda is holding a bamboo stick with its paws, attempting to eat it. The environment appears to be a zoo or wildlife sanctuary. The lighting is natural daylight. The pandas have distinctive reddish-brown fur with black faces and white markings around their eyes. The bamboo sticks are brown and appear to be part of the enclosure's enrichment. The panda on the ground seems to be trying to reach the bamboo, while the one on the branch seems to be observing or waiting for its turn. There's no visible human interaction in these frames.
```
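The prefill logs suggest the KV cache is filled in fixed-size slices: 85 prompt tokens produce a single slice, while the video prompt's 2159 tokens (roughly 8 frames x 256 vision tokens plus the text prompt) produce slice_indexs [0..16]. Both counts are consistent with a 128-token slice, though the slice length is inferred from the logs rather than documented:

```python
import math

SLICE_LEN = 128  # assumed prefill slice length (inferred from the logs)

def slice_indexes(prefill_token_len):
    """Indices of the fixed-size prefill slices needed for a prompt."""
    return list(range(math.ceil(prefill_token_len / SLICE_LEN)))

assert slice_indexes(85) == [0]                 # text-only example above
assert slice_indexes(2159) == list(range(17))   # 8-frame video example above
print("slice counts match the logs")
```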

### python env requirement

#### pyaxengine

https://github.com/AXERA-TECH/pyaxengine

```
wget https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc1/axengine-0.1.3-py3-none-any.whl
pip install axengine-0.1.3-py3-none-any.whl
```

#### others

```
pip install -r requirements.txt
```

#### Inference with Raspberry Pi 5 Host using AXCL EP (such as M.2 AI Card or HAT AI Module)

```
cd InternVL3-2B
python gradio_demo_python_api.py --hf_model internvl3_2b_tokenizer/ \
    --axmodel_path internvl3_2b_axmodel/ \
    --vit_model vit_axmodel/internvl3_2b_vit_slim.axmodel

[INFO] Available providers: ['AXCLRTExecutionProvider']
Init InferenceSession:   0%|          | 0/28 [00:00<?, ?it/s]
[INFO] Using provider: AXCLRTExecutionProvider
[INFO] SOC Name: AX650N
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Compiler version: 3.4 162fdaa8
Init InferenceSession:   4%|▍         | 1/28 [00:01<00:43,  1.61s/it]
[INFO] Using provider: AXCLRTExecutionProvider
......
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Compiler version: 3.4 162fdaa8
Init InferenceSession: 100%|██████████| 28/28 [00:34<00:00,  1.23s/it]
[INFO] Using provider: AXCLRTExecutionProvider
[INFO] SOC Name: AX650N
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Compiler version: 3.4 162fdaa8
model load done!
[INFO] Using provider: AXCLRTExecutionProvider
[INFO] SOC Name: AX650N
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Compiler version: 3.4 162fdaa8
chatbot = gr.Chatbot(height=650)
HTTP service address: http://xxx.xxx.xxx.xxx:7860
* Running on local URL:  http://xxx.xxx.xxx.xxx:7860
* To create a public link, set `share=True` in `launch()`.
```

Access `http://xxx.xxx.xxx.xxx:7860` using Chrome or another browser.

