Instructions for using OpenGVLab/InternVL2-Llama3-76B with libraries, notebooks, and local apps.

Libraries

How to use OpenGVLab/InternVL2-Llama3-76B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="OpenGVLab/InternVL2-Llama3-76B", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("OpenGVLab/InternVL2-Llama3-76B", trust_remote_code=True, dtype="auto")
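Before loading the checkpoint directly, it can help to estimate how much memory the weights alone will need. The sketch below is back-of-the-envelope arithmetic, not a measurement: it assumes a nominal 76 billion parameters (taken from the model name) and the usual bytes-per-parameter for each dtype. The helper name `weight_memory_gb` is hypothetical, and the estimate excludes activations, the KV cache, and framework overhead.

```python
# Rough weight-memory estimate; assumes a nominal parameter count of 76e9
# (from the model name) and ignores activations, KV cache, and overhead.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "float16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NOMINAL_PARAMS = 76e9  # assumption: "76B" in the model name

for dtype in ("float32", "bfloat16", "int8", "int4"):
    print(f"{dtype:>8}: ~{weight_memory_gb(NOMINAL_PARAMS, dtype):.0f} GB")
```

Under these assumptions the bfloat16 weights come to roughly 152 GB, so a single-GPU load is unlikely to fit and the weights would normally be sharded across several devices.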

Local Apps

vLLM

How to use OpenGVLab/InternVL2-Llama3-76B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "OpenGVLab/InternVL2-Llama3-76B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGVLab/InternVL2-Llama3-76B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
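The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes a server started as above is listening on `http://localhost:8000` and that it returns the usual OpenAI-style `choices[0].message.content` field. The function names `build_chat_payload` and `chat` are hypothetical helpers, not part of vLLM.

```python
import json
import urllib.request


def build_chat_payload(model: str, prompt: str, image_url: str) -> dict:
    """Build an OpenAI-style chat request with one text part and one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(base_url: str, payload: dict, timeout: float = 120.0) -> str:
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumes the OpenAI-compatible response shape.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("http://localhost:8000", build_chat_payload("OpenGVLab/InternVL2-Llama3-76B", "Describe this image in one sentence.", image_url))` should return the model's answer as a string. Pointing `base_url` at `http://localhost:30000` would target an SGLang server launched on that port instead.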

Use Docker

docker model run hf.co/OpenGVLab/InternVL2-Llama3-76B

SGLang

How to use OpenGVLab/InternVL2-Llama3-76B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "OpenGVLab/InternVL2-Llama3-76B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGVLab/InternVL2-Llama3-76B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "OpenGVLab/InternVL2-Llama3-76B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown above for the pip-installed SGLang server (port 30000).

Docker Model Runner
How to use OpenGVLab/InternVL2-Llama3-76B with Docker Model Runner:
```
docker model run hf.co/OpenGVLab/InternVL2-Llama3-76B
```

Settings used for these benchmarks

by yNilay - opened Jul 17, 2024

Discussion

yNilay

Jul 17, 2024

I'm a big fan of OpenGVLab's work, particularly InternVL2, which I'm interested in using for an AI assistant project I'm developing. Currently, I'm using GPT-4o and Gemini Pro 1.5, but their closed-source nature makes them quite expensive.

I've seen the InternVL2-pro benchmark results showing it outperforming GPT-4o and Gemini Pro 1.5, which is impressive. However, when I tested the model for my specific use case, which requires strong reasoning skills and image understanding, it didn't perform as well as I expected.

I'm curious about the settings used for these benchmarks. Could you provide more information on the testing parameters and conditions?

Thank you for your work, guys. I appreciate it!

czczup

OpenGVLab org Jul 23, 2024 (edited Jul 23, 2024)

Hi, thank you for your interest.

Our latest eval code is open-sourced in the InternVL repository. The specific test setup is consistent with InternVL 1.5, and you can refer to the tutorial here or here.

yNilay

Jul 23, 2024

@czczup I tried using InternVL2-pro and it can't answer some basic questions. I'd like to share some examples with you. What's the best way to reach you?

czczup

OpenGVLab org Jul 23, 2024

> @czczup I tried using InternVL2-pro and it can't answer some basic questions. I'd like to share some examples with you. What's the best way to reach you?

My email is wztxy89@163.com

yNilay

Jul 24, 2024

>> @czczup I tried using InternVL2-pro and it can't answer some basic questions. I'd like to share some examples with you. What's the best way to reach you?
>
> My email is wztxy89@163.com

@czczup I sent an email, please check :)

czczup changed discussion status to closed Jul 29, 2024
