AXERA-TECH/InternVL3-2B

@@ -11,7 +11,7 @@ tags:
   - InternVL3-2B
 ---
-# InternVL3-2B-Int8
 This version of InternVL3-2B has been converted to run on the Axera NPU using **w8a16** quantization.
@@ -21,184 +21,118 @@ Compatible with Pulsar2 version: 3.4
 ## Convert tools links:
-For those who are interested in model conversion, you can try to export axmodel through the original repo:
-https://huggingface.co/deepseek-ai/InternVL3-2B
-- [Github for InternVL3-2B.axera](https://github.com/AXERA-TECH/InternVL3-2B.axera)
-- [Pulsar2 Link, How to Convert LLM from Huggingface to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html)
 ## Support Platform
 - AX650
   - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
 |Chip|Image count|Image encoder latency (448)|TTFT (prefill tokens)|Decode speed (w8a16)|
 |--|--|--|--|--|
-|AX650N | 0 | 0 ms | 221 ms (128 tokens) | 11.50 tokens/sec |
-|AX650N | 1 | 364 ms | 862 ms (384 tokens) | 11.50 tokens/sec |
-|AX650N | 4 | 1456 ms | 4589 ms (1152 tokens) | 11.50 tokens/sec |
-|AX650N | 8 | 2912 ms | 13904 ms (2176 tokens) | 11.50 tokens/sec |
 ## How to use
 Download all files from this repository to the device.
-**Using AX650 Board**
 ```bash
-root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # tree -L 1
 .
 ├── config.json
 ├── examples
 ├── infer.py
 ├── infer_video.py
 ├── internvl3_2b_axmodel
 ├── internvl3_2b_tokenizer
 ├── README.md
 └── vit_axmodel
-4 directories, 4 files
-```
-#### Inference with AX650 Host, such as M4N-Dock(爱芯派Pro) or AX650N DEMO Board
-**Text Generation**
-input text:
 ```
-Please calculate the derivative of the function [y=2x^ 2-2] and provide the reasoning process in markdown format.
-```
-log information:
-```bash
-root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # python3 infer.py --hf_model internvl3_2b_tokenizer/ --axmodel_path internvl3_2b_axmodel/ --question "Please calculate the derivative of the function [y=2x^ 2-2] and provide the reasoning process in markdown format"
-Init InferenceSession: 100%|██████████████████████████████████████████████████████████| 28/28 [00:16<00:00,  1.74it/s]
-model load done!
-prefill token_len:  85
-slice_indexs is [0]
-slice prefill done 0
-Decode:   9%|██████▎                                                               | 232/2559 [00:19<05:14,  7.39it/s]
-Decode:  17%|████████████                                                          | 440/2559 [00:48<04:51,  7.26it/s]hit eos!
-Decode:  17%|████████████                                                          | 440/2559 [00:48<03:53,  9.06it/s]
-Certainly! Let's calculate the derivative of the function \( y = 2x^2 - 2 \ \ using the rules of differentiation.
-### Step-by-Step Reasoning:
-1. **Identify the Function:**
-   The given function is \( y = 2x^2 - 2 \\).
-2. **Differentiate Term by Term:**
-   We will differentiate each term of the function separately.
-   - **First Term: \( 2x^2 \ \**
-     - The derivative of \( x^n \ \ (where n is a constant) is \( nx^{n-1} \ \).
-     - Here, \( n = 2 \ \.
-     - Therefore, the derivative of \( 2x^2 \ \ is \( 2 \ \ times \( 2x^{2-1} \ \ which simplifies to \( 4x \ \.
-   - **Second Term: \( -2 \ \**
-     - The derivative of a constant (a term without \( x \\ is 0 \.
-     - Therefore, the derivative of \( -2 \ \ is \( 0 \.
-3. **Combine the Derivatives:**
-   - The derivative of the entire function is the sum of the derivatives of each term.
-   - So, the derivative of \( y = 2x^2 - 2 \\ is \( 4x + 0 \\ which simplifies to \( 4x \.
-### Final Answer:
-The derivative of the function \( y = 2x^2 - 2 \ is \( 4x \.
-### Summary:
-The derivative of \( y = 2x^2 - 2 \ is \( 4x \.
 ```
-**Multimodal Understanding**
-input image
-![](examples/image_1.jpg)
-input text:
 ```
-"Please describe this picture in detail."
 ```
-log information:
-```bash
-root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # python3 infer.py --hf_model internvl3_2b_tokenizer/ --axmodel_path internvl3_2b_axmodel/ --question "Please describe this picture in detail" -i examples/image_1.jpg --vit_model vit_axmodel/internvl3_2b_vit_slim.axmodel
-[INFO] Available providers:  ['AxEngineExecutionProvider']
-Init InferenceSession:   0%|                                                                   | 0/24 [00:00<?, ?it/s][INFO] Chip type: ChipType.MC50
 [INFO] VNPU type: VNPUType.DISABLED
-[INFO] Engine version: 2.11.0a
-Init InferenceSession: 100%|██████████████████████████████████████████████████████████| 28/28 [00:14<00:00,  1.92it/s]
 model load done!
-prefill token_len:  325
-slice_indexs is [0, 1, 2]
-slice prefill done 0
-slice prefill done 1
-slice prefill done 2
-Decode:  13%|████████▋                                                           | 326/2559 [00:00<00:01, 1829.15it/s]
-Decode:  19%|█████████████▍                                                        | 489/2559 [00:22<02:26, 14.17it/s]hit eos!
-Decode:  20%|██████████████▏                                                       | 517/2559 [00:26<01:43, 19.71it/s]
-**Image Description:**
-The image depicts a giant panda in a naturalistic enclosure, likely within a zoo or wildlife sanctuary. The panda is prominently positioned in the foreground, surrounded by lush green bamboo plants. Its distinctive black and white fur is clearly visible,
-with the panda's face, ears, and limbs being black, while its body and the rest of its face are white. The panda appears to be eating bamboo, with its front paws holding a piece of bamboo close to its mouth. The panda's expression is calm and curious, with its eyes looking directly at the camera.
-In the background, there is another panda partially obscured by the foliage and a wooden structure, possibly part of the enclosure's design. The ground is covered with a layer of mulch or wood chips, providing a naturalistic habitat for the pandas. The overall setting is serene and well-maintained,
-designed to mimic the panda's natural habitat while ensuring the animals' well-being.
-```
-input video
-https://github.com/user-attachments/assets/2beffc73-d078-4c54-8282-7b7d845f39c9
-input text:
-```
-"Please describe this video in detail."
 ```
-log information:
-```bash
-root@ax650 ~/yongqiang/push_hugging_face/InternVL3-2B # python3 infer_video.py --hf_model internvl3_2b_tokenizer/ --axmodel_path internvl3_2b_axmodel/ --vit_model vit_axmodel/internvl3_2b_vit_slim.axmodel -i examples/red-panda.mp4  -q "Please describe this video in detail."
-[INFO] Available providers:  ['AxEngineExecutionProvider']
-Input frame count: 8
-preprocess image done!
-[INFO] Chip type: ChipType.MC50
-[INFO] VNPU type: VNPUType.DISABLED
-[INFO] Engine version: 2.11.0a
-vit_output.shape is (1, 256, 1536), vit feature extract done!
-Init InferenceSession: 100%|██████████████████████████████████████████████████████████| 28/28 [00:30<00:00,  1.07s/it]
-model load done!
-prefill token_len:  2159
-slice_indexs is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
-slice prefill done 0
-slice prefill done 1
-slice prefill done 2
-slice prefill done 3
-slice prefill done 4
-slice prefill done 5
-slice prefill done 6
-slice prefill done 7
-slice prefill done 8
-slice prefill done 9
-slice prefill done 10
-slice prefill done 11
-slice prefill done 12
-slice prefill done 13
-slice prefill done 14
-slice prefill done 15
-slice prefill done 16
-Decode:  88%|███████████████████████████████████████████████████████████▌        | 2240/2559 [00:11<00:02, 133.83it/s]hit eos!
-Decode:  90%|█████████████████████████████████████████████████████████████▏      | 2303/2559 [00:21<00:02, 108.19it/s]
-The video features two red pandas in an outdoor enclosure with green grass and a wooden structure. One panda is perched on a branch, while the other stands on the ground.
-The standing panda is holding a bamboo stick with its paws, attempting to eat it. The environment appears to be a zoo or wildlife sanctuary.
-The lighting is natural daylight. The pandas have distinctive reddish-brown fur with black faces and white markings around their eyes.
-The bamboo sticks are brown and appear to be part of the enclosure's enrichment. The panda on the ground seems to be trying to reach the bamboo,
-while the one on the branch seems to be observing or waiting for its turn. There's no visible human interaction in these frames.
-```

   - InternVL3-2B
 ---
+# InternVL3-2B
 This version of InternVL3-2B has been converted to run on the Axera NPU using **w8a16** quantization.
 ## Convert tools links:
+If you are interested in model conversion, you can try exporting the axmodel from the original repo:
+https://huggingface.co/OpenGVLab/InternVL3-2B
+- [How to Convert LLM from Huggingface to axmodel](https://github.com/AXERA-TECH/InternVL3-2B.axera/tree/master/model_convert)
+- [AXera NPU HOST LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/ax-internvl)
+- [AXera NPU AXCL LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/axcl-internvl)
 ## Support Platform
 - AX650
   - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
+  - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
 |Chip|Image count|Image encoder latency (448)|TTFT (prefill tokens)|Decode speed (w8a16)|
 |--|--|--|--|--|
+|AX650N | 0 | 0 ms | 221 ms (128 tokens) | 10 tokens/sec |
+|AX650N | 1 | 364 ms | 862 ms (384 tokens) | 10 tokens/sec |
+|AX650N | 4 | 1456 ms | 4589 ms (1152 tokens) | 10 tokens/sec |
+|AX650N | 8 | 2912 ms | 13904 ms (2176 tokens) | 10 tokens/sec |
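
The AX650N rows follow a simple pattern: image encoder latency grows by 364 ms per image, and the prefill length grows by 256 tokens per image on top of a 128-token text-only baseline. The sketch below is only an illustration of that arithmetic. The constant and function names are hypothetical, the per-image values are inferred from the rows rather than documented, and the prefill throughput it prints is derived from the table, not separately measured.

```python
# Figures copied from the AX650N rows of the table.
ROWS = [
    # (images, encoder_ms, ttft_ms, prefill_tokens)
    (0, 0, 221, 128),
    (1, 364, 862, 384),
    (4, 1456, 4589, 1152),
    (8, 2912, 13904, 2176),
]

ENCODER_MS_PER_IMAGE = 364  # inferred from the 1-image row
TEXT_TOKENS = 128           # prefill length of the 0-image row
TOKENS_PER_IMAGE = 256      # inferred: 384 - 128 for one image


def encoder_ms(images: int) -> int:
    """Image encoder latency if it scales linearly with the image count."""
    return ENCODER_MS_PER_IMAGE * images


def prefill_tokens(images: int) -> int:
    """Prefill length: text baseline plus a fixed token budget per image."""
    return TEXT_TOKENS + TOKENS_PER_IMAGE * images


def prefill_tokens_per_sec(ttft_ms: int, tokens: int) -> float:
    """Average prefill throughput implied by a TTFT figure."""
    return tokens / (ttft_ms / 1000.0)


for images, enc, ttft, tokens in ROWS:
    # Both linear models reproduce every row of the table.
    assert encoder_ms(images) == enc
    assert prefill_tokens(images) == tokens
    rate = prefill_tokens_per_sec(ttft, tokens)
    print(f"{images} image(s): {tokens} tokens, ~{rate:.0f} prefill tokens/sec")
```

TTFT grows faster than the token count, so the implied prefill throughput drops as more images are added. The decode speed column stays at 10 tokens/sec in every row.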
 ## How to use
 Download all files from this repository to the device.
 ```bash
+(base) axera@raspberrypi:~/qtang/huggingface/AXERA-TECH/InternVL3-2B $ tree -L 1
 .
 ├── config.json
 ├── examples
+├── gradio_demo_c_api.py
+├── gradio_demo_python_api.py
 ├── infer.py
 ├── infer_video.py
 ├── internvl3_2b_axmodel
 ├── internvl3_2b_tokenizer
+├── internvl3_tokenizer.py
+├── llm.py
+├── main_api_ax650
+├── main_api_axcl_aarch64
+├── main_api_axcl_x86
+├── main_ax650
+├── main_axcl_aarch64
+├── main_axcl_x86
+├── post_config.json
 ├── README.md
+├── requirements.txt
+├── run_internvl_3_2b_448_api_ax650.sh
+├── run_internvl_3_2b_448_api_axcl_aarch64.sh
+├── run_internvl_3_2b_448_api_axcl_x86.sh
+├── run_internvl_3_2b_448_ax650.sh
+├── run_internvl_3_2b_448_axcl_aarch64.sh
+├── run_internvl_3_2b_448_axcl_x86.sh
 └── vit_axmodel
+6 directories, 22 files
 ```
+### Python environment requirements
+#### pyaxengine
+https://github.com/AXERA-TECH/pyaxengine
+```
+wget https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc1/axengine-0.1.3-py3-none-any.whl
+pip install axengine-0.1.3-py3-none-any.whl
 ```
+#### Other dependencies
 ```
+pip install -r requirements.txt
 ```
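
After both install steps, a quick import check can confirm the environment before any model is loaded. This is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the `REQUIRED` list is an assumption: `axengine` is the module installed by the wheel, and `gradio` is assumed to be needed by the `gradio_demo_*.py` scripts. Adjust the list to match `requirements.txt`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list; the actual contents of requirements.txt are not shown here.
REQUIRED = ["axengine", "gradio"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

`find_spec` only locates the module without importing it, so the check does not initialize the NPU runtime.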
+#### Inference on a Raspberry Pi 5 host using the AXCL EP (such as an M.2 AI Card or HAT AI Module)
+```
+cd InternVL3-2B
+python gradio_demo_python_api.py --hf_model internvl3_2b_tokenizer/ \
+                                 --axmodel_path internvl3_2b_axmodel/ \
+                                 --vit_model vit_axmodel/internvl3_2b_vit_slim.axmodel
+[INFO] Available providers:  ['AXCLRTExecutionProvider']
+Init InferenceSession:   0%|                                                                                 | 0/28 [00:00<?, ?it/s]
+[INFO] Using provider: AXCLRTExecutionProvider
+[INFO] SOC Name: AX650N
+[INFO] VNPU type: VNPUType.DISABLED
+[INFO] Compiler version: 3.4 162fdaa8
+Init InferenceSession:   4%|███▏                                                                             | 1/28 [00:01<00:43,  1.61s/it]
+[INFO] Using provider: AXCLRTExecutionProvider
+......
+[INFO] VNPU type: VNPUType.DISABLED
+[INFO] Compiler version: 3.4 162fdaa8
+Init InferenceSession: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 28/28 [00:34<00:00,  1.23s/it]
+[INFO] Using provider: AXCLRTExecutionProvider
+[INFO] SOC Name: AX650N
 [INFO] VNPU type: VNPUType.DISABLED
+[INFO] Compiler version: 3.4 162fdaa8
 model load done!
+[INFO] Using provider: AXCLRTExecutionProvider
+[INFO] SOC Name: AX650N
+[INFO] VNPU type: VNPUType.DISABLED
+[INFO] Compiler version: 3.4 162fdaa8
+  chatbot = gr.Chatbot(height=650)
+HTTP service address: http://xxx.xxx.xxx.xxx:7860
+* Running on local URL:  http://xxx.xxx.xxx.xxx:7860
+* To create a public link, set `share=True` in `launch()`.
 ```
+Access `http://xxx.xxx.xxx.xxx:7860` using Chrome or another browser.
+![](webgui.png)
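
Before starting the Gradio demo, it can help to confirm that the script and the three model paths it takes are present in the download directory. The sketch below rebuilds the launch command shown in this section and reports missing paths. The helper names `build_demo_command` and `missing_paths` are hypothetical. The flags and file names are the ones used in the command above, without the trailing slashes. It only prints the command and does not launch anything.

```python
import sys
from pathlib import Path


def build_demo_command(repo_dir="."):
    """Assemble the gradio_demo_python_api.py command for a given download directory."""
    repo = Path(repo_dir)
    return [
        sys.executable,
        str(repo / "gradio_demo_python_api.py"),
        "--hf_model", str(repo / "internvl3_2b_tokenizer"),
        "--axmodel_path", str(repo / "internvl3_2b_axmodel"),
        "--vit_model", str(repo / "vit_axmodel" / "internvl3_2b_vit_slim.axmodel"),
    ]


def missing_paths(command):
    """Return the path arguments (everything except the interpreter and flags) not on disk."""
    return [
        arg for arg in command[1:]
        if not arg.startswith("--") and not Path(arg).exists()
    ]


if __name__ == "__main__":
    cmd = build_demo_command()
    missing = missing_paths(cmd)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("Ready to run:", " ".join(cmd))
```

Run it from the repository directory, or pass the download location to `build_demo_command`.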