issue with 'Florence2ForConditionalGeneration' object has no attribute '_supports_sdpa'

#40
by intrsi - opened

At the beginning of the month I started working on fine-tuning Florence-2, with great success: I had two notebooks, one for fine-tuning and one for testing inference.

Sadly, when I tried today, neither of them works anymore because of this error:
'Florence2ForConditionalGeneration' object has no attribute '_supports_sdpa'

It always happens when I run this:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[5], line 8
      5 device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
      7 try:
----> 8     model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype='auto', trust_remote_code=True).eval().to(device)
      9     print("Model loaded successfully to MPS.")
     10     # Add a small dummy input to trigger generation (replace with actual input if needed)

File /usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py:593, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    591         model_class.register_for_auto_class(auto_class=cls)
    592     model_class = add_generation_mixin_to_remote_model(model_class)
--> 593     return model_class.from_pretrained(
    594         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    595     )
    596 elif type(config) in cls._model_mapping.keys():
    597     model_class = _get_model_class(config, cls._model_mapping)

File /usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:315, in restore_default_torch_dtype.<locals>._wrapper(*args, **kwargs)
    313 old_dtype = torch.get_default_dtype()
    314 try:
--> 315     return func(*args, **kwargs)
    316 finally:
    317     torch.set_default_dtype(old_dtype)

File /usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:4927, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, weights_only, *model_args, **kwargs)
   4924 config = copy.deepcopy(config)  # We do not want to modify the config inplace in from_pretrained.
   4925 with ContextManagers(model_init_context):
   4926     # Let's make sure we don't run the init function of buffer modules
-> 4927     model = cls(config, *model_args, **model_kwargs)
   4929 if _torch_distributed_available and device_mesh is not None:
   4930     model = distribute_model(model, distributed_config, device_mesh, tp_size)

File ~/.cache/huggingface/modules/transformers_modules/microsoft/Florence-2-base-ft/9803f52844ec1ae5df004e6089262e9a23e527fd/modeling_florence2.py:2534, in Florence2ForConditionalGeneration.__init__(self, config)
   2533 def __init__(self, config: Florence2Config):
-> 2534     super().__init__(config)
   2535     assert config.vision_config.model_type == 'davit', 'only DaViT is supported for now'
   2536     self.vision_tower = DaViT.from_config(config=config.vision_config)

File /usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:2190, in PreTrainedModel.__init__(self, config, *inputs, **kwargs)
   2186 self.config = config
   2188 # Check the attention implementation is supported, or set it if not yet set (on the internal attr, to avoid
   2189 # setting it recursively)
-> 2190 self.config._attn_implementation_internal = self._check_and_adjust_attn_implementation(
   2191     self.config._attn_implementation, is_init_check=True
   2192 )
   2194 # for initialization of the loss
   2195 loss_type = self.__class__.__name__

File /usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:2738, in PreTrainedModel._check_and_adjust_attn_implementation(self, attn_implementation, is_init_check)
   2735 elif applicable_attn_implementation == "sdpa":
   2736     # Sdpa is the default, so we try it and fallback to eager otherwise when not possible
   2737     try:
-> 2738         self._sdpa_can_dispatch(is_init_check)
   2739     except (ValueError, ImportError) as e:
   2740         # In this case, sdpa was requested explicitly, but we can't use it, so let's raise
   2741         if attn_implementation == "sdpa":

File /usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:2602, in PreTrainedModel._sdpa_can_dispatch(self, is_init_check)
   2591 def _sdpa_can_dispatch(self, is_init_check: bool = False) -> bool:
   2592     """
   2593     Check the availability of SDPA for a given model.
   2594 
   (...)
   2600             before instantiating the full models if we know that the model does not support the requested attention.
   2601     """
-> 2602     if not self._supports_sdpa:
   2603         raise ValueError(
   2604             f"{self.__class__.__name__} does not support an attention implementation through torch.nn.functional.scaled_dot_product_attention yet."
   2605             " Please request the support for this architecture: https://github.com/huggingface/transformers/issues/28005. If you believe"
   2606             ' this error is a bug, please open an issue in Transformers GitHub repository and load your model with the argument `attn_implementation="eager"` meanwhile. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="eager")`'
   2607         )
   2608     if not is_torch_sdpa_available():

File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1688, in Module.__getattr__(self, name)
   1686     if name in modules:
   1687         return modules[name]
-> 1688 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")

AttributeError: 'Florence2ForConditionalGeneration' object has no attribute '_supports_sdpa'

or

AttributeError                            Traceback (most recent call last)
/tmp/ipython-input-5-2911465734.py in <cell line: 0>()
      6 
      7 try:
----> 8     model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype='auto', trust_remote_code=True).eval().to(device)
      9     print("Model loaded successfully to MPS.")
     10     # Add a small dummy input to trigger generation (replace with actual input if needed)

7 frames
/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1938             if name in modules:
   1939                 return modules[name]
-> 1940         raise AttributeError(
   1941             f"'{type(self).__name__}' object has no attribute '{name}'"
   1942         )

AttributeError: 'Florence2ForConditionalGeneration' object has no attribute '_supports_sdpa'

It happens even in the official inference example notebook.
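Both tracebacks point at the same mechanism: the installed transformers release reads a `_supports_sdpa` class flag during `PreTrainedModel.__init__` (via `_sdpa_can_dispatch`), and in this combination of versions the flag is defined neither by the base class nor by the cached `modeling_florence2.py` remote code, so plain attribute lookup fails. A minimal stand-in sketch of that failure mode (the class names here are illustrative, not the real transformers API):

```python
# Stand-in sketch (assumption: class names are illustrative only).
# Newer library code reads a class flag during __init__; the remote
# model code never defines it, so attribute lookup raises
# AttributeError before the SDPA check can even run.

class PreTrainedModelStandIn:
    def __init__(self):
        # corresponds to `if not self._supports_sdpa:` in
        # transformers' _sdpa_can_dispatch (modeling_utils.py:2602)
        if not self._supports_sdpa:
            raise ValueError("sdpa unsupported")

class Florence2StandIn(PreTrainedModelStandIn):
    # like the cached modeling_florence2.py, this never sets
    # _supports_sdpa anywhere in the class hierarchy
    def __init__(self):
        super().__init__()

caught = None
try:
    Florence2StandIn()
except AttributeError as exc:
    caught = exc
    print(exc)  # → 'Florence2StandIn' object has no attribute '_supports_sdpa'
```

The real traceback goes through `torch.nn.Module.__getattr__`, but the effect is the same. The error text's own suggestion of `attn_implementation="eager"` may also sidestep the SDPA check, though the version pin discussed in this thread is the fix people here confirmed.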

@intrsi Check your installed transformers package version; it must be transformers==4.49.0.

thanks bro

intrsi changed discussion status to closed

It worked when using transformers==4.49.0

god bless you

Thanks, that solved my issue too

Sorry, how do we force transformers==4.49.0? I know how to use the update command, but the latest release of transformers is beyond 4.49 (currently 4.55, Aug 2025). Can you force an older version too?


pip install transformers==4.49.0

It will uninstall the current version and install 4.49.0.
I can also confirm this fixes the issue.
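If you want to catch the mismatch before a long model load, a small guard can check the installed version up front. This is just a sketch: `WORKING_VERSION` reflects the version reported to work in this thread, and `transformers_version_ok` is a hypothetical helper name, not a transformers API.

```python
# Hypothetical guard: fail fast if the installed transformers is not
# the version this thread reports as working with the Florence-2
# remote code.
from importlib import metadata

WORKING_VERSION = "4.49.0"  # reported working version from this thread

def transformers_version_ok(required: str = WORKING_VERSION) -> bool:
    """True only if the installed transformers exactly matches `required`."""
    try:
        return metadata.version("transformers") == required
    except metadata.PackageNotFoundError:
        # transformers is not installed at all
        return False

if not transformers_version_ok():
    print(f"Run: pip install transformers=={WORKING_VERSION}")
```

An exact pin like this is deliberately strict; the thread below notes that at least one ComfyUI author reported 4.54 working, so you could relax the comparison if you have verified a newer version yourself.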

Thanks very much. I thought I had tried that unsuccessfully (maybe I did something wrong). From another site I found that you can go into the ComfyUI Manager, go to the pip box, and type in transformers==4.49.0, but you also have to set security = weak in the ComfyUI config.ini file. That worked for me. The author of the Florence-2 ComfyUI adaptation said he got things working on 4.54 (currently 4.55, 11 Aug 2025), but after I updated everything it did not work in my case, so I am sticking with the earlier transformers release, at least for now. Still, I can use Qwen and Florence-2, so I'm happy. The trouble with a Python environment is that updates can break things that previously worked; packages are not necessarily backwards compatible, so the standard advice to update everything is not always good. I would say: if things currently work, leave them alone and don't update.

Following your advice I installed transformers==4.49.0, but I faced another error:

RuntimeError: Failed to import transformers.models.auto.processing_auto because of the following error (look up to see its traceback):
/home/fabrice/miniconda3/envs/lerobot/lib/python3.10/site-packages/cv2/python-3.10/../../../.././libtiff.so.6: undefined symbol: jpeg12_write_raw_data, version LIBJPEG_8.

Do I have to upgrade another library?

fabono, I only had to do the two things I described and it worked, so I cannot help you. In my case no other library had to be upgraded, but it might be worth mentioning that I upgraded everything ComfyUI and Python had before I downgraded transformers.

I ran into this problem too. Could anyone tell me how to fix it?
