Instructions for using OpenGVLab/InternVL2-26B with libraries and local apps.

Libraries

How to use OpenGVLab/InternVL2-26B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="OpenGVLab/InternVL2-26B", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("OpenGVLab/InternVL2-26B", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use OpenGVLab/InternVL2-26B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "OpenGVLab/InternVL2-26B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGVLab/InternVL2-26B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
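
The same request can also be sent from Python. The following is a minimal sketch, assuming the vLLM server started above is listening on localhost:8000 and that its response follows the OpenAI chat completions shape (the reply text under `choices[0].message.content`). The helper names `build_chat_request`, `extract_reply`, and `chat` are illustrative only and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(model, prompt, image_url):
    """Build the same JSON body as the curl example: one user turn with text and an image."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def extract_reply(response_body):
    """Return the assistant text, assuming an OpenAI-style chat completion response."""
    return response_body["choices"][0]["message"]["content"]


def chat(base_url, payload, timeout=120):
    """POST the payload to <base_url>/v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))


# Example call (requires a running server, so it is left commented out):
# payload = build_chat_request(
#     "OpenGVLab/InternVL2-26B",
#     "Describe this image in one sentence.",
#     "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
# )
# print(chat("http://localhost:8000", payload))
```

Only the standard library is used, so nothing needs installing beyond the server itself. Since the SGLang server described further down exposes the same endpoint path, passing `http://localhost:30000` as `base_url` should work there too, under the same assumptions.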

Use Docker

docker model run hf.co/OpenGVLab/InternVL2-26B

SGLang

How to use OpenGVLab/InternVL2-26B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "OpenGVLab/InternVL2-26B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGVLab/InternVL2-26B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "OpenGVLab/InternVL2-26B" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use OpenGVLab/InternVL2-26B with Docker Model Runner:
```
docker model run hf.co/OpenGVLab/InternVL2-26B
```

HF Demo broken

by catworld1212 - opened Jul 5, 2024

Discussion

catworld1212

Jul 5, 2024

Hi, the HF demo is broken; it's not working.

czczup

OpenGVLab org Jul 7, 2024

Hello, thank you for your attention. The demo is down for a special reason; we will restore it in the next few days.

catworld1212

Jul 7, 2024

@czczup I'm confused. I tried some visual question answering with the 26B model and it performed very badly. The only models that pass that question are Gemini 1.5 Pro/Flash, GPT-4o, and Claude.

Then how was InternVL2-26B evaluated such that it outperforms GPT-4V?

czczup

OpenGVLab org Jul 7, 2024 (edited Jul 7, 2024)

Thanks for your feedback. Could you give me some examples to reproduce this issue?

catworld1212

Jul 9, 2024

@czczup It was a visual question answering task. Can you tell me how to prompt this model to get the most accurate answer with human-like reasoning skills?

catworld1212

Jul 9, 2024

@czczup What does a good prompt for this model look like?

czczup

OpenGVLab org Jul 11, 2024 (edited Jul 11, 2024)

> @czczup I'm confused. I tried some visual question answering with the 26B model and it performed very badly. The only models that pass that question are Gemini 1.5 Pro/Flash, GPT-4o, and Claude.
>
> Then how was InternVL2-26B evaluated such that it outperforms GPT-4V?

Thank you for this question.

We've achieved comparable performance to GPT-4V in many single-turn QA benchmarks. However, these results don't reflect other capabilities such as multi-turn conversation and instruction-following. (Of course, I think OpenAI also hasn't put their utmost effort into improving these academic benchmarks.)

For example, on the recently released ConvBench, which evaluates multi-turn dialogue, there is still a significant gap between our model and GPT-4V. This is mainly because most of our training data is sourced from open data. Despite our best efforts in data cleaning, the quality of our data is not as high as that of commercial companies.

As a non-profit research organization, OpenGVLab does not have the same funding as OpenAI for manually annotated data. Nevertheless, we still hope to do our best to improve the performance of open-source MLLMs.

catworld1212

Jul 11, 2024

@czczup Got it. When is InternVL2-Pro coming to Hugging Face?

czczup

OpenGVLab org Jul 16, 2024

We have recently open-sourced a 76B version, available at https://huggingface.co/OpenGVLab/InternVL2-Llama3-76B. This version achieves performance similar to InternVL2-Pro on most metrics. InternVL2-Pro is still undergoing our internal lab approval process, so it might take some time before it's available.

czczup changed discussion status to closed Jul 16, 2024

catworld1212

Jul 16, 2024

@czczup The 76B version also didn't perform too well. It can't get the answer that I get with GPT-4o and Gemini.

Thoughts?
