---
language:
- zh
- en
base_model:
- openbmb/MiniCPM-V-4
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- MiniCPM
- MiniCPM-V-4
---

# MiniCPM-V-4

## Conversion tool links

If you are interested in model conversion, you can try exporting the axmodel from the original repository:
https://huggingface.co/openbmb/MiniCPM-V-4

[How to Convert LLM from Huggingface to axmodel](https://github.com/Jordan-5i/MiniCPM-o/blob/main/ax_convert/readme.md) 

## Supported Platforms

- AX650
  - AX650N DEMO Board
  - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
  - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)


## How to use

Download all files from this repository to the device:

```
root@ax650:~/wangjian/minicpm-v-4# tree -L 1
.
├── embed_tokens.pth
├── minicpm-v-4_axmodel
├── minicpmv4_tokenizer
├── resampler.axmodel
├── run_axmodel.py
├── show_demo.jpg
└── siglip.axmodel
```
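
Before running inference, it can help to confirm that the download is complete. The following is a minimal sketch, not part of this repository: it checks the current directory against the entries shown in the listing above. The helper name `missing_entries` is hypothetical.

```python
from pathlib import Path

# Entry names copied from the directory listing above.
EXPECTED_ENTRIES = [
    "embed_tokens.pth",
    "minicpm-v-4_axmodel",
    "minicpmv4_tokenizer",
    "resampler.axmodel",
    "run_axmodel.py",
    "show_demo.jpg",
    "siglip.axmodel",
]


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected names that do not exist under root (hypothetical helper)."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files are present.")
```
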
Install `transformers`:

```
pip install transformers==4.51.0
```
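
To confirm that the pinned version is the one actually installed, a small check along these lines may be useful. This is only a sketch using the standard library's `importlib.metadata`; the helper names `installed_version` and `check_pin` are hypothetical and do not come from this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Version pinned by the pip command above.
PINNED_VERSION = "4.51.0"


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pin(found, pinned=PINNED_VERSION):
    """Describe whether the found version matches the pinned one."""
    if found is None:
        return "transformers is not installed"
    if found != pinned:
        return f"found transformers {found}, expected {pinned}"
    return "ok"


if __name__ == "__main__":
    print(check_pin(installed_version("transformers")))
```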

## Inference with AX650 Host on AX650 DEMO Board

Run the following command:

```bash
python3 run_axmodel.py -i show_demo.jpg -q "What is the landform in the picture?"
```
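
To ask several questions in a row, the same command can be driven from Python. The sketch below only wraps the command line shown above with `subprocess`. It assumes that `run_axmodel.py` sits in the current directory and writes its answer to standard output, which this README does not state. `build_command` and `run_inference` are hypothetical helper names.

```python
import subprocess
import sys


def build_command(image, question, script="run_axmodel.py"):
    """Build the argument list for the command shown above (-i image, -q question)."""
    return [sys.executable, script, "-i", image, "-q", question]


def run_inference(image, question):
    """Run the script once and return whatever it printed.

    Assumption: the script prints its answer to standard output.
    """
    result = subprocess.run(
        build_command(image, question),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits with an error
    )
    return result.stdout


# Example usage on the device:
# print(run_inference("show_demo.jpg", "What is the landform in the picture?"))
```

Passing the arguments as a list avoids shell quoting problems when a question contains quotes or spaces.
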
Input image:
![demo.jpg](./show_demo.jpg)

MiniCPM-V-4 output:

```
question1 = "What is the landform in the picture?"

answer1 = The landform in the picture is a karst topography, characterized by its unique and dramatic appearance with steep limestone cliffs rising from the water' s surface. This type of landscape is commonly found in regions with significant geological activity, such as China's Li River.
```