	---
	library_name: diffusers
	---

	# Florence-2 Image Annotator

	A custom [Modular Diffusers](https://huggingface.co/docs/diffusers/modular_diffusers/overview) block that uses [Florence-2](https://huggingface.co/docs/transformers/model_doc/florence2) for image annotation tasks like segmentation, object detection, and captioning.

	## Usage

	### Basic Usage

	```python
	import torch
	from diffusers import ModularPipeline
	from diffusers.utils import load_image

	# Load the block
	image_annotator = ModularPipeline.from_pretrained(
	    "diffusers/Florence2-image-Annotator",
	    trust_remote_code=True,
	)
	image_annotator.load_components(torch_dtype=torch.bfloat16)
	image_annotator.to("cuda")

	# Load an image
	image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg")
	image = image.resize((1024, 1024))

	# Generate a segmentation mask
	output = image_annotator(
	    image=image,
	    annotation_task="<REFERRING_EXPRESSION_SEGMENTATION>",
	    annotation_prompt="the car",
	    annotation_output_type="mask_image",
	)
	output.mask_image[0].save("car-mask.png")
	```

	### Compose with Inpainting Pipeline

	```python
	import torch
	from diffusers import ModularPipeline

	# Load the annotator
	image_annotator = ModularPipeline.from_pretrained(
	    "diffusers/Florence2-image-Annotator",
	    trust_remote_code=True,
	)

	# Get an inpainting workflow and insert the annotator at index 0
	# repo_id = ..  # any pipeline that supports inpainting, e.g. SDXL, Flux, or Qwen
	inpaint_blocks = ModularPipeline.from_pretrained(repo_id).blocks.get_workflow("inpainting")
	inpaint_blocks.sub_blocks.insert("image_annotator", image_annotator.blocks, 0)

	# Initialize the combined pipeline
	pipe = inpaint_blocks.init_pipeline()
	pipe.load_components(torch_dtype=torch.float16, device="cuda")

	# Inpaint with automatic mask generation
	# `prompt` and `image` must be defined beforehand (for example, `image` as loaded in Basic Usage)
	output = pipe(
	    prompt=prompt,
	    image=image,
	    annotation_task="<REFERRING_EXPRESSION_SEGMENTATION>",
	    annotation_prompt="the car",
	    annotation_output_type="mask_image",
	    num_inference_steps=30,
	    output="images",
	)
	output[0].save("inpainted-car.png")
	```

	## Supported Tasks

	| Task | Description |
	|------|-------------|
	| `<OD>` | Object detection |
	| `<REFERRING_EXPRESSION_SEGMENTATION>` | Segment specific objects based on text |
	| `<CAPTION>` | Generate image caption |
	| `<DETAILED_CAPTION>` | Generate detailed caption |
	| `<MORE_DETAILED_CAPTION>` | Generate very detailed caption |
	| `<DENSE_REGION_CAPTION>` | Caption different regions |
	| `<CAPTION_TO_PHRASE_GROUNDING>` | Ground phrases to regions |
	| `<OPEN_VOCABULARY_DETECTION>` | Detect objects from open vocabulary |

	## Output Types

	| Type | Description |
	|------|-------------|
	| `mask_image` | Black and white mask image |
	| `mask_overlay` | Mask overlaid on original image |
	| `bounding_box` | Bounding boxes drawn on image |

	## Inputs

	| Parameter | Type | Required | Default | Description |
	|-----------|------|----------|---------|-------------|
	| `image` | `PIL.Image` | Yes | - | Image to annotate |
	| `annotation_task` | `str` | No | `<REFERRING_EXPRESSION_SEGMENTATION>` | Task to perform |
	| `annotation_prompt` | `str` | Yes | - | Text prompt for the task |
	| `annotation_output_type` | `str` | No | `mask_image` | Output format |
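
The inputs above can be checked before the block is called, so that a typo in a task token fails early instead of inside the model. The following is a minimal sketch, not part of the block itself: `build_annotation_inputs` is a hypothetical helper name, and the allowed values are copied from the Supported Tasks and Output Types tables in this README.

```python
# Hypothetical helper (not part of diffusers or of this block): validates the
# inputs listed in this README before they are passed to the annotator.

# Task tokens copied from the "Supported Tasks" table.
SUPPORTED_TASKS = frozenset({
    "<OD>",
    "<REFERRING_EXPRESSION_SEGMENTATION>",
    "<CAPTION>",
    "<DETAILED_CAPTION>",
    "<MORE_DETAILED_CAPTION>",
    "<DENSE_REGION_CAPTION>",
    "<CAPTION_TO_PHRASE_GROUNDING>",
    "<OPEN_VOCABULARY_DETECTION>",
})

# Output types copied from the "Output Types" table.
OUTPUT_TYPES = frozenset({"mask_image", "mask_overlay", "bounding_box"})


def build_annotation_inputs(
    image,
    annotation_prompt,
    annotation_task="<REFERRING_EXPRESSION_SEGMENTATION>",
    annotation_output_type="mask_image",
):
    """Return keyword arguments for the annotator, or raise ValueError."""
    if image is None:
        raise ValueError("`image` is required")
    if not isinstance(annotation_prompt, str):
        raise ValueError("`annotation_prompt` must be a string")
    if annotation_task not in SUPPORTED_TASKS:
        raise ValueError(f"Unsupported annotation_task: {annotation_task!r}")
    if annotation_output_type not in OUTPUT_TYPES:
        raise ValueError(
            f"Unsupported annotation_output_type: {annotation_output_type!r}"
        )
    return {
        "image": image,
        "annotation_task": annotation_task,
        "annotation_prompt": annotation_prompt,
        "annotation_output_type": annotation_output_type,
    }
```

Assuming `image_annotator` and `image` are set up as in Basic Usage, the result can be unpacked into the call: `image_annotator(**build_annotation_inputs(image, "the car"))`.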

	## Outputs

	| Parameter | Type | Description |
	|-----------|------|-------------|
	| `mask_image` | `PIL.Image` | Generated mask (when output type is `mask_image`) |
	| `image` | `PIL.Image` | Annotated image (when output type is `mask_overlay` or `bounding_box`) |
	| `annotations` | `dict` | Raw annotation predictions |
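
Because the populated output field depends on `annotation_output_type`, calling code may want a single place that picks the right one. The sketch below follows the mapping in the table above. `select_annotated_output` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in for a real pipeline output so the example runs without loading the model.

```python
from types import SimpleNamespace


def select_annotated_output(output, annotation_output_type="mask_image"):
    """Return the output field that matches the requested output type.

    Hypothetical helper: the mapping mirrors the Outputs table in this README.
    The field is returned as-is; Basic Usage indexes it with [0].
    """
    if annotation_output_type == "mask_image":
        return output.mask_image
    if annotation_output_type in ("mask_overlay", "bounding_box"):
        return output.image
    raise ValueError(
        f"Unsupported annotation_output_type: {annotation_output_type!r}"
    )


# Stand-in for a real pipeline output, used only to demonstrate the lookup.
fake_output = SimpleNamespace(
    mask_image=["<mask>"],
    image=["<annotated>"],
    annotations={},
)
print(select_annotated_output(fake_output, "bounding_box"))  # prints ['<annotated>']
```

The raw predictions remain available under `annotations` regardless of the output type, according to the table.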

	## Components

	This block loads the following components from [florence-community/Florence-2-base-ft](https://huggingface.co/florence-community/Florence-2-base-ft):

	- `image_annotator`: `Florence2ForConditionalGeneration`
	- `image_annotator_processor`: `AutoProcessor`

	## Learn More

	- [Building Custom Blocks Guide](https://huggingface.co/docs/diffusers/modular_diffusers/custom_blocks)
	- [Modular Diffusers Overview](https://huggingface.co/docs/diffusers/modular_diffusers/overview)
	- [Modular Diffusers Custom Blocks Collection](https://huggingface.co/collections/diffusers/modular-diffusers-custom-blocks)