wenhu committed on Oct 10, 2024

Commit bf0c5a7 (verified) · 1 Parent(s): fb30ad8

Update processing_phi3_v.py

Files changed (1)

processing_phi3_v.py +9 -11
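
Judging from the removed and added lines in this commit, the change looks like a formatting-only edit: the same parameters and calls are re-indented or joined onto one line. One way to check that kind of claim is to compare the parsed syntax trees of the old and new text. The sketch below does this for the `_convert_images_texts_to_inputs` excerpt; the `OLD` and `NEW` strings are copied from the diff and truncated after the `return`, and `same_ast` is a hypothetical helper name, not part of the repository.

```python
import ast
import textwrap

# Excerpt of the helper as it appears on the removed ("-") side of the diff.
OLD = textwrap.dedent('''
    def _convert_images_texts_to_inputs(self, images, texts, padding=False, truncation=None, max_length=None,
                                        return_tensors=None):
        if not len(images):
            model_inputs = self.tokenizer(texts, return_tensors=return_tensors, padding=padding, truncation=truncation,
                                          max_length=max_length)
            return BatchFeature(data={**model_inputs})
''')

# The same excerpt as it appears on the added ("+") side of the diff.
NEW = textwrap.dedent('''
    def _convert_images_texts_to_inputs(self, images, texts, padding=False, truncation=None, max_length=None, return_tensors=None):
        if not len(images):
            model_inputs = self.tokenizer(texts, return_tensors=return_tensors, padding=padding, truncation=truncation, max_length=max_length)
            return BatchFeature(data={**model_inputs})
''')


def same_ast(old_src: str, new_src: str) -> bool:
    """Return True when both snippets parse to the same syntax tree.

    ast.dump() omits line and column attributes by default, so pure
    whitespace and line-wrapping differences do not affect the result.
    """
    return ast.dump(ast.parse(old_src)) == ast.dump(ast.parse(new_src))


if __name__ == "__main__":
    print("formatting-only change:", same_ast(OLD, NEW))
```

Parsing does not resolve names, so `BatchFeature` and `self.tokenizer` do not need to be importable for the comparison to run.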

@@ -328,13 +328,13 @@ class Phi3VProcessor(ProcessorMixin):
         self.img_tokens = [f"<|image_{i + 1}|>" for i in range(1000000)]
     def __call__(
-            self,
-            text: Union[TextInput, List[TextInput]],
-            images: ImageInput = None,
-            padding: Union[bool, str, PaddingStrategy] = False,
-            truncation: Union[bool, str, TruncationStrategy] = None,
-            max_length=None,
-            return_tensors: Optional[Union[str, TensorType]] = TensorType.PYTORCH,
+        self,
+        text: Union[TextInput, List[TextInput]],
+        images: ImageInput = None,
+        padding: Union[bool, str, PaddingStrategy] = False,
+        truncation: Union[bool, str, TruncationStrategy] = None,
+        max_length=None,
+        return_tensors: Optional[Union[str, TensorType]] = TensorType.PYTORCH,
     ) -> BatchFeature:
         """
         Main method to prepare for the model one or several sequences(s) and image(s). This method forwards the `text`
@@ -415,11 +415,9 @@ class Phi3VProcessor(ProcessorMixin):
     def get_special_image_token_id(self):
         return self.tokenizer.convert_tokens_to_ids(self.special_image_token)
-    def _convert_images_texts_to_inputs(self, images, texts, padding=False, truncation=None, max_length=None,
-                                        return_tensors=None):
+    def _convert_images_texts_to_inputs(self, images, texts, padding=False, truncation=None, max_length=None, return_tensors=None):
         if not len(images):
-            model_inputs = self.tokenizer(texts, return_tensors=return_tensors, padding=padding, truncation=truncation,
-                                          max_length=max_length)
+            model_inputs = self.tokenizer(texts, return_tensors=return_tensors, padding=padding, truncation=truncation, max_length=max_length)
             return BatchFeature(data={**model_inputs})
         pattern = r"<\|image_\d+\|>"
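
The last context line defines the regular expression the processor uses to recognise image placeholders such as `<|image_1|>`. The diff does not show what the method does with it afterwards, so the following is only an illustrative sketch of how that pattern behaves on a prompt; `split_on_image_tokens` and the sample prompt are hypothetical and not taken from `processing_phi3_v.py`.

```python
import re

# Pattern copied from the diff: matches placeholders like <|image_1|>, <|image_27|>.
IMAGE_TOKEN_PATTERN = r"<\|image_\d+\|>"


def split_on_image_tokens(prompt: str):
    """Split a prompt into text chunks and the image placeholders between them.

    Hypothetical helper for illustration; the real processor may handle
    the matches differently.
    """
    tags = re.findall(IMAGE_TOKEN_PATTERN, prompt)
    chunks = re.split(IMAGE_TOKEN_PATTERN, prompt)
    # Recover the numeric index embedded in each placeholder.
    image_ids = [int(re.search(r"\d+", tag).group()) for tag in tags]
    return chunks, tags, image_ids


if __name__ == "__main__":
    prompt = "<|image_1|> Describe this picture and compare it with <|image_2|>."
    chunks, tags, image_ids = split_on_image_tokens(prompt)
    print(chunks)
    print(tags)
    print(image_ids)
```

A prompt with no placeholders yields a single chunk and no tags, which corresponds to the text-only case that the `if not len(images)` branch hands straight to the tokenizer.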