---
license: bsd-3-clause
language:
- en
- zh
base_model:
- HuggingFaceTB/SmolVLM2-500M-Video-Instruct
pipeline_tag: visual-question-answering
tags:
- HuggingFaceTB
- SmolVLM2-500M-Video-Instruct
---

# SmolVLM2-500M-Video-Instruct-Int8

This version of SmolVLM2-500M-Video-Instruct has been converted to run on the Axera NPU using **w8a16** quantization.

Compatible with Pulsar2 version: 4.0

## Conversion tool links

If you are interested in model conversion, you can export the axmodel yourself, starting from the original repo:

- https://huggingface.co/HuggingFaceTB/SmolVLM2-500M-Video-Instruct
- [Github for SmolVLM2-500M-Video-Instruct.axera](https://github.com/AXERA-TECH/SmolVLM2-500M-Video-Instruct.axera)
- [Pulsar2 Link, How to Convert LLM from Huggingface to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html)

## Supported Platforms

- AX650
  - [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)

## How to use

Download all files from this repository to the device.

**Using AX650 Board**

```bash
ai@ai-bj ~/yongqiang/SmolVLM2-500M-Video-Instruct $ tree -L 1
.
├── assets
├── embeds
├── infer_axmodel.py
├── README.md
├── smolvlm2_axmodel
├── smolvlm2_tokenizer
└── vit_mdoel

5 directories, 2 files
```

#### Inference with AX650 Host, such as M4N-Dock(爱芯派Pro) or AX650N DEMO Board

**Multimodal Understanding**

Input image:

![](assets/bee.jpg)

Input text:

```
Can you describe this image?
```

Log output:

```bash
ai@ai-bj ~/yongqiang/SmolVLM2-500M-Video-Instruct $ python3 infer_axmodel.py

input prompt: Can you describe this image?

answer >> The image depicts a close-up view of a pink flower with a bee on it. The bee, which appears to be a bumblebee, is perched on the flower's center, which is surrounded by a cluster of other flowers. The bee is in the process of collecting nectar from the flower, which is a common behavior for bees. The flower itself has a yellow center with a cluster of yellow stamens surrounding it.
The petals of the flower are a vibrant shade of pink, and the bee is positioned very close to the camera, making it the focal point of the image. The background of the image is slightly blurred, but it appears to be a garden or a field with other flowers and plants, contributing to the overall natural setting of the image.
```
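
Before running `infer_axmodel.py`, it can help to confirm that the download step produced the layout shown in the `tree` listing above. The following is a minimal sketch of such a check, not part of the repository: the helper name `missing_entries` is hypothetical, and the expected names are copied verbatim from the listing (including the `vit_mdoel` spelling).

```python
import sys
from pathlib import Path

# Names copied from the `tree -L 1` listing in this README.
EXPECTED_DIRS = [
    "assets",
    "embeds",
    "smolvlm2_axmodel",
    "smolvlm2_tokenizer",
    "vit_mdoel",
]
EXPECTED_FILES = ["infer_axmodel.py", "README.md"]


def missing_entries(root):
    """Return the expected directories and files that are absent under root.

    Hypothetical helper for illustration; directories are reported first,
    then files, in the order listed above.
    """
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Check the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_entries(target)
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("All expected files and directories are present.")
```

Saved under any name (for example `check_layout.py`, also hypothetical), the script can be run from the download directory on the board. It only checks that the top-level entries exist; it does not validate the model files themselves.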