This model was converted to OpenVINO from Qwen/Qwen2-VL-2B-Instruct using optimum-intel
via the export space.
Install packages:

```shell
pip install optimum[openvino] transformers pillow torch
```
Sample code to analyze a local image file:
```python
from optimum.intel import OVModelForVisualCausalLM
from transformers import AutoProcessor
from PIL import Image

MODEL_ID = "TheAverageDetective/Qwen2-VL-2B-Instruct-openvino"
image_path = "test.png"

# Load model and processor
model = OVModelForVisualCausalLM.from_pretrained(MODEL_ID, device="GPU")
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Load image
image = Image.open(image_path).convert("RGB")

# Prepare messages
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image in detail."},
        ],
    },
]

# Apply the chat template, process inputs, and generate
prompt_text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt_text], images=[image], return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=150)
result = processor.batch_decode(output_ids, skip_special_tokens=True)[0]

# Keep only the assistant's reply, dropping the chat-template scaffolding
print("\n", result.split("assistant\n")[-1].strip())
```
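The final `print` line works because the decoded output still contains the full chat-template transcript (system prompt, user turn, and assistant turn), so splitting on the last `assistant\n` marker isolates the model's reply. A minimal standalone sketch of that parsing step, using a made-up decoded string for illustration:

```python
def extract_reply(decoded: str) -> str:
    """Return only the assistant's reply from a fully decoded
    chat-template transcript: everything after the last
    'assistant\\n' marker, with surrounding whitespace stripped."""
    return decoded.split("assistant\n")[-1].strip()

# Hypothetical decoded transcript, for illustration only:
decoded = (
    "system\nYou are a helpful assistant.\n"
    "user\nDescribe this image in detail.\n"
    "assistant\nA red square on a white background."
)
print(extract_reply(decoded))  # → A red square on a white background.
```

Note that this relies on the Qwen2-VL chat template's plain-text role markers; other models may use different delimiters.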
Tested on an Intel Iris iGPU with 80 EUs and 16 GB of system RAM.