Integration into ollama possible?
Dear Ovis developers,
I'm fairly new to integrating models from Hugging Face into my open-source framework, which uses ollama, and I'm uncertain whether it is my inexperience or a general incompatibility that has kept me from integrating Ovis2 into ollama so far.
I would love to try out your model through ollama, so I followed the section "Importing a model from Safetensors weights" of this ollama guide:
https://github.com/ollama/ollama/blob/main/docs/import.md
in an attempt to import it as a new model in ollama after downloading it from Hugging Face with:
git clone https://huggingface.co/AIDC-AI/Ovis2-34B
Unfortunately I get the ollama error: Error: unsupported architecture.
Was this already the wrong approach, and should I have followed the section "Importing a fine tuned adapter from Safetensors weights" instead?
After some search I found this post:
https://github.com/ollama/ollama/issues/6231
stating that "Qwen2ForCausalLM" (your base model architecture) is not yet supported by the quantization method used by ollama, which is based on an older version of llama.cpp. However, I tried the import without any quantization options (assuming the default means no quantization), using only ollama create "name of modelfile". Could this still be the reason for the architecture incompatibility?
Ollama provides several Qwen LLMs, presumably based on the same architecture, which adds to my confusion.
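One way to narrow this down is to check which architecture name the cloned checkpoint actually declares, since an importer can only match on that name. The sketch below reads the `architectures` field from the repository's `config.json`. The helper name `declared_architectures` and the local directory name in the usage note are assumptions for illustration. If the field names a multimodal wrapper class rather than "Qwen2ForCausalLM", that would be consistent with the "unsupported architecture" error even though plain Qwen models work.

```python
import json
from pathlib import Path


def declared_architectures(model_dir):
    """Return the `architectures` list from a Hugging Face style config.json.

    `model_dir` is the directory created by `git clone`; an empty list is
    returned when the field is missing.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    return list(config.get("architectures", []))
```

For example, after the clone above, `print(declared_architectures("Ovis2-34B"))` would show the class name the import step has to recognize (the directory name is assumed to be the default one created by git).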
Is it possible to circumvent this issue by first converting (and potentially quantizing) the safetensors weights to GGUF outside of ollama with a newer version of llama.cpp, as demonstrated here:
https://github.com/ggml-org/llama.cpp/discussions/7927
Or will the resulting GGUF cause issues once I try to import it following the section "Importing a GGUF based model or adapter" of the first link above, because the quantization is not understood by the older llama.cpp version that ollama is built upon?
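For reference, the out-of-ollama conversion path asked about here can be sketched as two commands: llama.cpp's `convert_hf_to_gguf.py` to produce an f16 GGUF, then `llama-quantize` to quantize it. The helper below only builds the argument lists and does not run anything. The function name, the output file names, and the `build/bin` location of the `llama-quantize` binary are assumptions about a local llama.cpp checkout. The converter only handles architectures it knows about, so it may well reject this checkpoint.

```python
from pathlib import Path


def build_gguf_commands(model_dir, llama_cpp_dir, quant_type="Q4_K_M"):
    """Build (but do not run) the convert and quantize command lines.

    Paths are hypothetical: `llama_cpp_dir` is a local llama.cpp checkout
    with binaries assumed to be built under build/bin.
    """
    model_dir = Path(model_dir)
    llama_cpp_dir = Path(llama_cpp_dir)
    f16_path = model_dir / "model-f16.gguf"
    quant_path = model_dir / f"model-{quant_type}.gguf"

    # Step 1: safetensors checkpoint -> unquantized f16 GGUF.
    convert_cmd = [
        "python3",
        str(llama_cpp_dir / "convert_hf_to_gguf.py"),
        str(model_dir),
        "--outfile", str(f16_path),
        "--outtype", "f16",
    ]
    # Step 2: f16 GGUF -> quantized GGUF.
    quantize_cmd = [
        str(llama_cpp_dir / "build" / "bin" / "llama-quantize"),
        str(f16_path),
        str(quant_path),
        quant_type,
    ]
    return [convert_cmd, quantize_cmd]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`, so a failure in the conversion step stops the process before quantization is attempted.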
Hi, thank you for your inquiry.
Ollama does not currently support the Ovis architecture. If you're interested in using the model, please refer to the README for instructions on deploying it with Transformers.