Instructions to use florence-community/Florence-2-base with libraries, inference providers, notebooks, and local apps.

Libraries

How to use florence-community/Florence-2-base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="florence-community/Florence-2-base")

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("florence-community/Florence-2-base")
model = AutoModelForMultimodalLM.from_pretrained("florence-community/Florence-2-base")

Local Apps

vLLM

How to use florence-community/Florence-2-base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "florence-community/Florence-2-base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "florence-community/Florence-2-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
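
For scripting, the same request can be sent from Python. This is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:8000; the helper names `build_completion_request` and `complete` are made up for illustration and are not part of vLLM.

```python
import json
import urllib.request

# Assumed values mirroring the curl example; adjust host and port to your setup.
BASE_URL = "http://localhost:8000"
MODEL_ID = "florence-community/Florence-2-base"


def build_completion_request(prompt, base_url=BASE_URL, model=MODEL_ID,
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the server's JSON response as a dict."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `complete("Once upon a time,")` returns the parsed JSON body. Passing `base_url="http://localhost:30000"` should target an SGLang server started on that port instead.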

Use Docker

docker model run hf.co/florence-community/Florence-2-base

SGLang

How to use florence-community/Florence-2-base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "florence-community/Florence-2-base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "florence-community/Florence-2-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "florence-community/Florence-2-base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "florence-community/Florence-2-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use florence-community/Florence-2-base with Docker Model Runner:
```
docker model run hf.co/florence-community/Florence-2-base
```

How were the models converted?

by xzuyn - opened Oct 18, 2025

Discussion

xzuyn

Oct 18, 2025

I've got some finetuned models that used the original model format, and they don't work on the latest transformers, but these do. I tried to rename keys and such but couldn't get it working.

ducviet00

Florence-2 Community org Oct 19, 2025

Hey, here’s the script you can use to convert the model:
https://github.com/huggingface/transformers/tree/main/src/transformers/models/florence2

Example usage:

python convert_florence2_original_pytorch_to_hf.py \
    --hf_model_id microsoft/Florence-2-large \
    --pytorch_dump_folder_path florence2-large \
    --output_hub_path ducviet00/Florence-2-large-hf
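
If you want to see why manual key renaming did not line up, comparing the parameter names of the two checkpoints can help. Rough sketch only: `diff_parameter_names` is a made-up helper, and the key names below are placeholders, not real Florence-2 parameter names. In practice the names would come from something like `model.state_dict().keys()` for each checkpoint.

```python
def diff_parameter_names(original_keys, converted_keys):
    """Group parameter names by which checkpoint(s) they appear in."""
    original, converted = set(original_keys), set(converted_keys)
    return {
        "only_in_original": sorted(original - converted),
        "only_in_converted": sorted(converted - original),
        "shared": sorted(original & converted),
    }


# Placeholder key names for illustration only.
original_keys = ["old_prefix.block.weight", "old_prefix.block.bias", "head.weight"]
converted_keys = ["new_prefix.block.weight", "new_prefix.block.bias", "head.weight"]

report = diff_parameter_names(original_keys, converted_keys)
for group, names in report.items():
    print(group, names)
```

Names that show up only on one side are the ones a rename would have to cover. This does not check tensor shapes or any weight reshaping the conversion script may do.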

ducviet00 changed discussion status to closed Oct 20, 2025
