Instructions to use microsoft/Florence-2-large with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use microsoft/Florence-2-large with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="microsoft/Florence-2-large", trust_remote_code=True)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True)
model = AutoModelForMultimodalLM.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True, device_map="auto")

Notebooks
Google Colab
Kaggle
Local Apps Settings

vLLM

How to use microsoft/Florence-2-large with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "microsoft/Florence-2-large"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Florence-2-large",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/microsoft/Florence-2-large

SGLang

How to use microsoft/Florence-2-large with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "microsoft/Florence-2-large" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Florence-2-large",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "microsoft/Florence-2-large" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Florence-2-large",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use microsoft/Florence-2-large with Docker Model Runner:
```
docker model run hf.co/microsoft/Florence-2-large
```

Fine-tuning for multiple tasks strategy

#32

by gennarino80 - opened Jun 28, 2024

Discussion

gennarino80

Jun 28, 2024

I would like to fine-tune this model on a specific set of images and combining 2 different tasks (used in cascade).

The idea is that once received the input image, the model should perform the image captioning task (MORE_DETAILED_CAPTION) to describe the image, and then use the CAPTION_TO_PHRASE_GROUNDING in order to have a 'visual perspective' of what the model has described (a sort of gradcam of the text).

What should I do in this case? Fine tune the model twice, starting from the image captioning task and then use the obtained model to train the model for the second task?

anhdang000

Oct 11, 2024

Same here, I am working on Chart Question Answering and would like to fine-tune this model on multitask (Visual Question Answering and Object Detection). Of course that I don't want to fine-tune the model twice.

Have you found a way to do that?

siddiqueMLS

Oct 29, 2024

I am also looking for multi-task learning strategy. My understanding from the paper is that we need to design datasets for multi-task prediction from the single loss-based florence2.

amirhosein2c

Jan 30, 2025

Any answer for these questions? I truly appreciate it if someone could give me a hint on what is the best way to finetune Florence-2 on multiple tasks.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment