---
library_name: transformers
license: bsd-3-clause
base_model:
- OpenGVLab/InternVL3_5-2B
tags:
- InternVL3
- InternVL3_5-2B
- Int8
- VLM
pipeline_tag: image-text-to-text
language:
- en
---

# InternVL3_5-2B

This version of InternVL3_5-2B has been converted to run on the Axera NPU using **w8a16** quantization.

This model has been optimized with the following LoRA:

Compatible with Pulsar2 version: 5.1-patch1.

Note that the model's context length is 2k tokens and the maximum prefill length is 1k tokens.

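The two limits above can be checked before a request is sent to the board. The following is a minimal sketch, not part of this repository: it assumes that "2k" and "1k" mean 2048 and 1024 tokens, and that image tokens count toward the prefill budget. The function name `fits_budget` is hypothetical.

```python
# Assumption: "2k" and "1k" in the note above mean 2048 and 1024 tokens.
CONTEXT_LEN = 2048
MAX_PREFILL_LEN = 1024


def fits_budget(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if a request fits both the prefill and the context limit.

    prompt_tokens counts everything fed to the model before generation
    (text tokens plus any image tokens); max_new_tokens is the number of
    tokens the caller plans to generate.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        prompt_tokens <= MAX_PREFILL_LEN
        and prompt_tokens + max_new_tokens <= CONTEXT_LEN
    )


print(fits_budget(300, 512))    # True
print(fits_budget(1200, 100))   # False: prompt exceeds the prefill limit
print(fits_budget(1000, 1100))  # False: prompt + output exceeds the context
```

If the check fails, shortening the prompt or lowering the number of generated tokens are the two obvious options.
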
## Conversion tool links

If you are interested in model conversion, you can try exporting the axmodel from the original repo:

https://huggingface.co/OpenGVLab/InternVL3_5-2B

[How to Convert LLM from Huggingface to axmodel](https://github.com/AXERA-TECH/InternVL3_5-2B.axera/tree/main/model_convert)

[AXera NPU HOST LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/ax-internvl)

[AXera NPU AXCL LLM Runtime](https://github.com/AXERA-TECH/ax-llm/tree/axcl-internvl)

## Supported Platforms

- AX650
- AX650N DEMO Board
- [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
- [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)

|Chip|Image encoder (448×448 input)|TTFT|Decode speed (w8a16)|
|--|--|--|--|
|AX650| 364.412 ms | 5844 ms | 9.52 tokens/sec|

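The table figures can be combined into a rough response-time estimate. This is only a back-of-the-envelope sketch: it assumes the first token is covered by the TTFT figure and every later token arrives at the listed decode speed. It leaves the image encoder time out, because the table does not say whether TTFT already includes it. The function name `estimate_response_seconds` is hypothetical.

```python
# Figures copied from the AX650 row of the table above.
TTFT_MS = 5844.0
DECODE_TOKENS_PER_SEC = 9.52


def estimate_response_seconds(new_tokens: int) -> float:
    """Rough response time: time to first token plus steady decoding.

    Assumption: the first token is covered by TTFT and each later token
    arrives at the measured decode rate.
    """
    if new_tokens < 1:
        raise ValueError("new_tokens must be at least 1")
    return TTFT_MS / 1000.0 + (new_tokens - 1) / DECODE_TOKENS_PER_SEC


for n in (1, 64, 256):
    print(f"{n:>4} tokens: ~{estimate_response_seconds(n):.1f} s")
```

Actual timings depend on prompt length and image count, so treat the result as an order-of-magnitude guide.
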
## How to use

Download all files from this repository to the device.

```
$ tree -L 1
.
├── assets
├── config.json
├── examples
├── gradio_demo.py
├── infer_axmodel.py
├── infer_torch.py
├── internvl3-5_axmodel
├── internvl3-5_tokenizer
├── README.md
├── utils
└── vit-models

6 directories, 5 files
```

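After downloading, it can help to confirm that nothing from the listing above is missing before running inference. A minimal sketch, assuming the script is run from the repository root; the helper name `missing_entries` is hypothetical and the expected names are taken directly from the `tree -L 1` output.

```python
from pathlib import Path

# Names taken from the `tree -L 1` listing above.
EXPECTED_DIRS = [
    "assets", "examples", "internvl3-5_axmodel",
    "internvl3-5_tokenizer", "utils", "vit-models",
]
EXPECTED_FILES = [
    "config.json", "gradio_demo.py", "infer_axmodel.py",
    "infer_torch.py", "README.md",
]


def missing_entries(root: str = ".") -> list[str]:
    """Return the expected directories and files that are absent under root."""
    base = Path(root)
    missing = [d + "/" for d in EXPECTED_DIRS if not (base / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (base / f).is_file()]
    return missing


if __name__ == "__main__":
    gaps = missing_entries(".")
    print("All expected entries found." if not gaps else f"Missing: {gaps}")
```
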
#### Install transformers

```
pip install transformers==4.57.1
```

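To confirm that the pinned version is the one actually installed on the device, a small check with the standard library is enough. This is a sketch; the helper name `installed_version` is hypothetical, and `importlib.metadata` is used so that the check does not need to import `transformers` itself.

```python
from importlib import metadata

REQUIRED = "4.57.1"  # version pinned in the install command above


def installed_version(package: str = "transformers"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("transformers is not installed")
elif found != REQUIRED:
    print(f"transformers {found} found, README pins {REQUIRED}")
else:
    print(f"transformers {found} matches the pinned version")
```
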
#### Inference with AX650 Host, such as M4N-Dock(爱芯派Pro) or AX650 DEMO Board

Start an interactive conversation with the `Gradio` demo:

```bash
$ python3 gradio_demo.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --vit_model vit-models/internvl_vit_model_1x3x448x448.axmodel
```

Plain text dialogue:

![demo_1](assets/demo_1.png)

Image understanding:

![demo_2](assets/demo_2.png)

---

Run the following command on the Axera board to start a chat conversation:

```sh
$ python3 infer_axmodel.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --question "请计算函数[y=2x^2+2]的导数, 并提供 markdown 格式的推理过程"
```

In English, the question reads: "Compute the derivative of the function [y=2x^2+2] and provide the reasoning in markdown format."

Output:

```bash
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.1-dirty 0fdbfe15-dirty
Model loaded successfully!
slice_indices: [0]
Slice prefill done: 0
answer >> 函数 \( y = 2x^2 + 2 \) 的导数可以通过求导法则来计算。首先,我们对函数中的每一项分别求导:

1. 对于 \( 2x^2 \),使用幂法则求导:
\[
\frac{d}{dx}(2x^2) = 2 \cdot 2x = 4x
\]

2. 对于常数项 \( 2 \),其导数为 0,因为常数的导数为 0。

将这两部分的结果相加,得到函数 \( y \) 的导数:
\[
y' = 4x
\]

因此,函数 \( y = 2x^2 + 2 \) 的导数为 \( y' = 4x \)。
```

In English, the answer reads: the derivative of \( y = 2x^2 + 2 \) can be computed with the differentiation rules by differentiating each term separately. For \( 2x^2 \), the power rule gives \( \frac{d}{dx}(2x^2) = 2 \cdot 2x = 4x \). For the constant term \( 2 \), the derivative is 0, because the derivative of a constant is 0. Adding the two results gives \( y' = 4x \), so the derivative of \( y = 2x^2 + 2 \) is \( y' = 4x \).

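When `infer_axmodel.py` is driven from another program, the reply has to be separated from the log lines. The sketch below assumes, based only on the log shown above, that the reply starts after the `answer >> ` marker and runs to the end of the output. The helper name `extract_answer` is hypothetical, and `sample` is a shortened, made-up log in the same shape as the real one.

```python
ANSWER_MARKER = "answer >> "


def extract_answer(stdout: str):
    """Return the text after the first 'answer >> ' marker, or None."""
    _, found, answer = stdout.partition(ANSWER_MARKER)
    return answer.strip() if found else None


# Shortened, illustrative log; not real output from the board.
sample = """[INFO] Model type: 2 (triple core)
Model loaded successfully!
slice_indices: [0]
Slice prefill done: 0
answer >> y' = 4x
"""
print(extract_answer(sample))  # y' = 4x
```
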
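Run the following command to perform the single-image understanding task:

```sh
$ python3 infer_axmodel.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --question "请描述这幅图" -i examples/image_0.jpg --vit_model vit-models/internvl_vit_model_1x3x448x448.axmodel
```

In English, the question reads: "Please describe this picture."

![image_0.jpg](examples/image_0.jpg)

Output:

```bash
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.1-dirty 0fdbfe15-dirty
Model loaded successfully!
slice_indices: [0, 1, 2]
Slice prefill done: 0
Slice prefill done: 1
Slice prefill done: 2
answer >> 这是一张红熊猫的照片。红熊猫是一种红棕色的哺乳动物,通常生活在亚洲的森林中。它们以捕食昆虫和小型无脊椎动物为生。图片中,红熊猫正坐在一个木制的平台上,背景是绿色的树木和植被,显得非常自然和生动。红熊猫的表情看起来很友好,似乎在观察或等待什么。
```

In English, the answer reads: "This is a photo of a red panda. The red panda is a reddish-brown mammal that usually lives in the forests of Asia. They feed on insects and small invertebrates. In the picture, the red panda is sitting on a wooden platform, with green trees and vegetation in the background, which looks very natural and vivid. The red panda's expression looks friendly, as if it is observing or waiting for something."
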
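The two `infer_axmodel.py` invocations above differ only in the image-related flags. A small helper can assemble either form; this is a sketch that uses only the flags shown in the commands above (`--hf_model`, `--axmodel_path`, `--question`, `-i`, `--vit_model`). The function name `build_infer_command` and the English question are illustrative, not part of the repository.

```python
import shlex


def build_infer_command(
    question,
    image=None,
    hf_model="internvl3-5_tokenizer/",
    axmodel_path="internvl3-5_axmodel/",
    vit_model="vit-models/internvl_vit_model_1x3x448x448.axmodel",
):
    """Build the argv list for infer_axmodel.py using the flags shown above."""
    argv = [
        "python3", "infer_axmodel.py",
        "--hf_model", hf_model,
        "--axmodel_path", axmodel_path,
        "--question", question,
    ]
    # The image path and the ViT model are only passed for image tasks.
    if image is not None:
        argv += ["-i", image, "--vit_model", vit_model]
    return argv


cmd = build_infer_command("Describe this picture", image="examples/image_0.jpg")
print(shlex.join(cmd))
# To run it on the board: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with questions that contain spaces, brackets, or non-ASCII text.
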