Instructions to use AIDC-AI/Ovis2-34B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use AIDC-AI/Ovis2-34B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="AIDC-AI/Ovis2-34B", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("AIDC-AI/Ovis2-34B", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use AIDC-AI/Ovis2-34B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AIDC-AI/Ovis2-34B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AIDC-AI/Ovis2-34B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
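The same request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and it assumes the server is reachable at the base URL you pass in and returns the usual OpenAI-style `choices[0].message.content` field.

```python
import json
import urllib.request


def build_chat_request(model, prompt, image_url):
    """Build the JSON body for an OpenAI-compatible chat request with one image."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(base_url, payload, timeout=120):
    """POST the payload to <base_url>/v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumes an OpenAI-style response body.
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("http://localhost:8000", build_chat_request("AIDC-AI/Ovis2-34B", "Describe this image in one sentence.", image_url))` would send the request; passing a different base URL (for example one ending in port 30000) should work the same way for any other OpenAI-compatible server.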

Use Docker

docker model run hf.co/AIDC-AI/Ovis2-34B

SGLang

How to use AIDC-AI/Ovis2-34B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AIDC-AI/Ovis2-34B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AIDC-AI/Ovis2-34B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AIDC-AI/Ovis2-34B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request as in the pip-based SGLang setup above.

Integration into ollama possible?

by hheimel - opened Mar 18, 2025

Discussion

hheimel

Mar 18, 2025

Dear Ovis developers,

I'm fairly new to integrating Hugging Face models into my open-source framework, which uses Ollama, and I can't tell whether my inexperience or a general incompatibility is the reason I haven't managed to integrate Ovis2 into Ollama yet.
I would love to try your model through Ollama, so I followed the section "Importing a model from Safetensors weights" of this Ollama guide:

https://github.com/ollama/ollama/blob/main/docs/import.md

in an attempt to import it as a new model in Ollama after downloading it from Hugging Face with:

git clone https://huggingface.co/AIDC-AI/Ovis2-34B

Unfortunately, I get the Ollama error: Error: unsupported architecture.

Was this already the wrong approach, and should I have followed the section "Importing a fine tuned adapter from Safetensors weights" instead?

After some search I found this post:

https://github.com/ollama/ollama/issues/6231

stating that "Qwen2ForCausalLM" (your base model architecture) is not yet supported by the quantization method used by Ollama, which is based on an older version of llama.cpp. However, I tried the import without any quantization options (assuming the default means no quantization), using only ollama create "name of modelfile". Is this still the reason for the architecture incompatibility?

Ollama provides several Qwen LLMs, presumably based on the same architecture, which adds to my confusion.

Is it possible to circumvent this issue by first converting (and potentially quantizing) the Safetensors weights to GGUF outside of Ollama, using a newer version of llama.cpp, as demonstrated here:

https://github.com/ggml-org/llama.cpp/discussions/7927

Or will the resulting GGUF cause issues once I try to import it following the section "Importing a GGUF based model or adapter" of the first link above, because its quantization is not understood by the older llama.cpp version that Ollama is built on?
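One thing that can be checked locally, independent of Ollama, is whether a converted file is a GGUF container at all and which container version it uses. The following is a minimal sketch, assuming the published GGUF layout (a 4-byte `GGUF` magic followed by a little-endian 32-bit version number); the function name `read_gguf_version` is hypothetical. It says nothing about whether a given runtime supports the model architecture or quantization type stored in the file.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_version(path):
    """Return the container version stored in the first 8 bytes of a GGUF file."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not start with a GGUF header")
    # Bytes 4..8 hold the version as a little-endian unsigned 32-bit integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

Calling `read_gguf_version("model.gguf")` on a converted file (the file name here is only a placeholder) returns the version number, or raises `ValueError` if the file is not GGUF.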

xxyyy123

AIDC-AI org Mar 18, 2025

Hi, thank you for your inquiry.

Ollama does not currently support the Ovis architecture. If you're interested in using the model, please refer to the README for instructions on deploying it with Transformers.
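To see which architecture a downloaded checkpoint declares, you can read the `architectures` field that Hugging Face checkpoints normally carry in `config.json`. The following is a minimal sketch; the function name `declared_architectures` and the local directory path are hypothetical, and it assumes the repository was cloned with its `config.json` intact.

```python
import json
from pathlib import Path


def declared_architectures(model_dir):
    """Return the list of architecture names declared in <model_dir>/config.json."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # The field is a list of model class names; fall back to an empty list if absent.
    return config.get("architectures", [])
```

For example, `declared_architectures("./Ovis2-34B")` on the cloned repository returns the names an import tool would have to recognize; if none of them is on the tool's supported list, an "unsupported architecture" error is the expected outcome regardless of quantization settings.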
