Which image format is preferred? Error during inference

#26

by skye0402 - opened Jun 26, 2024


skye0402

Jun 26, 2024

It seems some image formats (e.g. PNG) do not work well. Inference fails with:

  File "/code/.local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 320, in preprocess
    input_data_format = infer_channel_dimension_format(images[0])
  File "/code/.local/lib/python3.10/site-packages/transformers/image_utils.py", line 209, in infer_channel_dimension_format
    raise ValueError("Unable to infer channel dimension format")

Is there a spec somewhere that defines which image format the model prefers?
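
For readers trying to interpret the traceback: the error is raised by a check that tries to work out which axis of the image array holds the color channels. The following is a simplified, pure-Python sketch of that kind of check, not the library's actual code. The function name `infer_channel_axis` and the assumption that only 1 or 3 channels are accepted by default are illustrative.

```python
def infer_channel_axis(shape, allowed_channels=(1, 3)):
    """Simplified stand-in for a channel-dimension check (hypothetical helper).

    `shape` is the shape tuple of a 3-D image array. The default
    `allowed_channels` is an assumption made for this sketch.
    """
    if len(shape) != 3:
        raise ValueError(f"Expected a 3-D image array, got shape {shape}")
    if shape[0] in allowed_channels:
        return "channels_first"
    if shape[2] in allowed_channels:
        return "channels_last"
    raise ValueError("Unable to infer channel dimension format")


# A height x width x 3 array is recognized as channels-last.
print(infer_channel_axis((480, 640, 3)))

# A height x width x 4 array matches neither allowed channel count.
try:
    infer_channel_axis((480, 640, 4))
except ValueError as err:
    print(err)
```

Under these assumptions, any image that decodes to four channels would trigger the same message as in the traceback, regardless of the file format it was stored in.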

QiuQiuShouLing

Jun 27, 2024

Huh? Is this the inference code? With the inference code, passing in PNG files works fine for me.

haipingwu

Jun 29, 2024 (edited Jun 29, 2024)

Hi, can you check whether your image is in a three-channel color mode (BGR/RGB)?
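
One way to run this check, sketched with Pillow: the `mode` attribute and `getbands()` report how many channels an image has. The helper name `describe_mode` is hypothetical, and the test image is generated in memory rather than taken from this thread.

```python
from PIL import Image


def describe_mode(image):
    """Return the Pillow mode string and the tuple of band names."""
    return image.mode, image.getbands()


# Build a small RGBA image in memory to stand in for a PNG with transparency.
rgba_image = Image.new("RGBA", (4, 4), (255, 0, 0, 128))
mode, bands = describe_mode(rgba_image)
print(mode, bands)

# A plain RGB image for comparison.
rgb_image = Image.new("RGB", (4, 4), (255, 0, 0))
print(describe_mode(rgb_image))
```

If the mode is anything other than `RGB` (for example `RGBA`, `LA`, or `P`), the image does not have exactly three color channels.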

skye0402

Jun 29, 2024

@haipingwu can you check this attached png? It was one that didn't work for me.

kjraym

Jul 4, 2024

It doesn't like the alpha channel in the PNG. Convert it like this:

import requests
from PIL import Image

# `url` and `file` are placeholders for your own image URL and local file path.
image_from_url = Image.open(requests.get(url, stream=True).raw).convert("RGB")
image_from_file = Image.open(file).convert("RGB")
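
The conversion above can be wrapped in a small guard that is applied to every image before it reaches the processor. This is a minimal sketch, assuming Pillow is installed. The helper name `ensure_rgb` is hypothetical, and the PNG is created in memory so the example runs without network access or local files.

```python
from io import BytesIO

from PIL import Image


def ensure_rgb(image):
    """Return the image unchanged if it is already RGB, else a converted copy."""
    return image if image.mode == "RGB" else image.convert("RGB")


# Write an RGBA PNG to an in-memory buffer, then read it back like a file.
buffer = BytesIO()
Image.new("RGBA", (8, 8), (0, 128, 255, 64)).save(buffer, format="PNG")
buffer.seek(0)

loaded = Image.open(buffer)
print(loaded.mode)  # mode of the PNG as stored

rgb = ensure_rgb(loaded)
print(rgb.mode, rgb.getbands())
```

Note that `convert("RGB")` simply discards the alpha channel. If transparent regions matter, compositing onto a solid background before conversion may give better results.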
