
SigLIP 2

SigLIP 2 extends the pretraining objective of SigLIP with prior, independently developed techniques into a unified recipe, for improved semantic understanding, localization, and dense features. You can use the raw model for tasks like zero-shot image classification and image-text retrieval, or as a vision encoder for VLMs (and other vision tasks).
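Zero-shot scores from SigLIP-family models come from a per-pair sigmoid rather than CLIP's batch softmax, so each image-text pair gets an independent probability. A minimal sketch of that scoring rule follows; the `logit_scale` and `logit_bias` values below are illustrative placeholders, not the learned parameters of this checkpoint.

```python
import math

def siglip_pair_score(image_emb, text_emb, logit_scale=100.0, logit_bias=-12.9):
    """Sigmoid image-text matching score in the style of SigLIP / SigLIP 2.

    score = sigmoid(scale * cosine_similarity + bias)
    The scale/bias here are made-up placeholders for illustration.
    """
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm_i = math.sqrt(sum(a * a for a in image_emb))
    norm_t = math.sqrt(sum(b * b for b in text_emb))
    cos_sim = dot / (norm_i * norm_t)
    return 1.0 / (1.0 + math.exp(-(logit_scale * cos_sim + logit_bias)))

# A perfectly aligned pair vs. an orthogonal (unrelated) one:
print(siglip_pair_score([1.0, 0.0], [1.0, 0.0]))  # close to 1
print(siglip_pair_score([1.0, 0.0], [0.0, 1.0]))  # close to 0
```

Because each pair is scored independently, probabilities across prompts need not sum to 1, which is why the demo output further below can show 10.6% for one prompt and near-zero for the other.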

The original model repo is https://huggingface.co/google/siglip2-base-patch16-224.

This SigLIP 2 model has been converted to run on the Axera NPU using w8a16 quantization (8-bit weights, 16-bit activations).
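To illustrate what the w8a16 scheme means in practice, here is a hedged numpy sketch of a linear layer with symmetric per-channel int8 weights and float16 activations. It mimics the general idea only, not Pulsar2's actual quantization implementation.

```python
import numpy as np

def quantize_w8(w):
    # Symmetric per-output-channel int8 quantization: each row gets its
    # own scale so that the largest weight maps to +/-127.
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def linear_w8a16(x, q, scale):
    # Dequantize weights on the fly and compute in float16 ("a16").
    w = q.astype(np.float16) * scale.astype(np.float16)
    return x.astype(np.float16) @ w.T

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)   # toy weight matrix
x = rng.normal(size=(2, 8)).astype(np.float32)   # toy activations
q, s = quantize_w8(w)
err = np.abs(linear_w8a16(x, q, s).astype(np.float32) - x @ w.T).max()
print(f"max abs error vs fp32: {err:.4f}")
```

The point of the scheme is that the int8 weights halve memory traffic while the 16-bit activations keep the per-layer error small, as the printed error illustrates.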


Compatible with Pulsar2 version: 5.1

Conversion tool links:

For those interested in model conversion, you can try exporting the axmodel through the Pulsar2 toolchain, using the build configs under model_convert/.

Supported platforms

Model latency (AX650):

Model          Latency
Image Encoder  11.1 ms
Text Encoder   4.56 ms
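For a rough end-to-end estimate from the latency table above: scoring one image against N prompts costs one image-encoder pass plus, at most, N text-encoder passes, and text embeddings can be computed once and cached across images. A small sketch (NPU time only, ignoring pre/post-processing; the caching behavior is an assumption about how one would use the model, not something the demo script necessarily does):

```python
IMAGE_MS = 11.1  # image encoder latency from the table above
TEXT_MS = 4.56   # text encoder latency from the table above

def zero_shot_latency_ms(num_prompts, texts_cached=False):
    """Rough NPU time for scoring one image against num_prompts labels,
    assuming the encoders run sequentially."""
    text_cost = 0.0 if texts_cached else num_prompts * TEXT_MS
    return IMAGE_MS + text_cost

print(zero_shot_latency_ms(2))                     # first image: ~20.2 ms
print(zero_shot_latency_ms(2, texts_cached=True))  # later images: ~11.1 ms
```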

How to use

Download all files from this repository to the device

root@ax650:~/siglip2-base-patch16-224# tree -L 2
.
├── 000000039769.jpg
├── README.md
├── ax650
│   ├── siglip2-base-patch16-224_text.axmodel
│   └── siglip2-base-patch16-224_vision.axmodel
├── config.json
├── model_convert
│   ├── imagenet-calib.tar
│   ├── siglip2-base-patch16-224_text.json
│   └── siglip2-base-patch16-224_vision.json
├── onnx
│   ├── siglip2-base-patch16-224_text.onnx
│   └── siglip2-base-patch16-224_vision.onnx
├── python
│   ├── axmodel_infer.py
│   ├── export_onnx.py
│   ├── onnx_infer.py
│   ├── requirements.txt
│   └── test.py
└── tokenizer
    ├── config.json
    ├── preprocessor_config.json
    ├── special_tokens_map.json
    ├── tokenizer.json
    └── tokenizer_config.json

5 directories, 20 files

Python environment requirements

pyaxengine

https://github.com/AXERA-TECH/pyaxengine

wget https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3rc0/axengine-0.1.3-py3-none-any.whl
pip install axengine-0.1.3-py3-none-any.whl

Other dependencies

pip install -r python/requirements.txt

Inputs

Text: "a photo of 2 cats", "a photo of 2 dogs"
Image: 000000039769.jpg (the test image bundled in this repo)
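Before the vision encoder sees the image, it is resized to 224x224 and normalized. The sketch below uses the mean=0.5, std=0.5 normalization typical of SigLIP-family models; check the bundled tokenizer/preprocessor_config.json for the exact values, and note that the nearest-neighbor resize here is a dependency-free stand-in for the real pipeline's resampling.

```python
import numpy as np

def preprocess(image_u8):
    """HxWx3 uint8 image -> (1, 3, 224, 224) float32 tensor in ~[-1, 1]."""
    h, w, _ = image_u8.shape
    # Nearest-neighbor resize to 224x224 (illustrative; the real
    # preprocessor uses proper resampling).
    ys = np.arange(224) * h // 224
    xs = np.arange(224) * w // 224
    resized = image_u8[ys][:, xs].astype(np.float32) / 255.0
    normalized = (resized - 0.5) / 0.5            # mean=0.5, std=0.5
    return normalized.transpose(2, 0, 1)[None]    # HWC -> NCHW

dummy = np.random.default_rng(0).integers(0, 256, (480, 640, 3), dtype=np.uint8)
print(preprocess(dummy).shape)  # (1, 3, 224, 224)
```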

Inference on an AX650 host, such as the M4N-Dock (AXera-Pi Pro)

root@ax650:~/siglip2-base-patch16-224# python3 python/axmodel_infer.py
[INFO] Available providers:  ['AxEngineExecutionProvider']
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Chip type: ChipType.MC50
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Engine version: 2.12.0s
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.1-patch1 430ee3be
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.1-patch1 430ee3be
[[1.0596762e-01 1.9978019e-05]]
10.6% that image 0 is 'a photo of 2 cats'
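The printed matrix holds per-pair sigmoid probabilities (rows are images, columns are text prompts), so a row need not sum to 1. The final log line can be reproduced from the raw output like this (a sketch of the formatting step, not necessarily the exact code in axmodel_infer.py):

```python
probs = [[1.0596762e-01, 1.9978019e-05]]          # model output from the run above
texts = ["a photo of 2 cats", "a photo of 2 dogs"]

for i, row in enumerate(probs):
    best = max(range(len(row)), key=row.__getitem__)  # highest-probability prompt
    print(f"{row[best] * 100:.1f}% that image {i} is '{texts[best]}'")
```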