Spaces: MinhDS / Florence-2-Demo

	---
	title: Florence-2 Vision Tasks Demo
	emoji: 🚀
	colorFrom: green
	colorTo: blue
	sdk: gradio
	sdk_version: 5.39.0
	app_file: app.py
	pinned: true
	short_description: This is a Gradio-based demo showcasing Florence-2
	license: mit
	---

	# Florence-2 Demo: Advancing a Unified Representation for a Variety of Vision Tasks

	This is a Gradio-based demo of Florence-2, a vision foundation model that handles a variety of computer vision tasks with a single, unified architecture.

	## Demo Preview

	![Demo Screenshot](./image-demo.png)

	## About Florence-2

	Florence-2 provides a unified representation that handles a diverse range of vision tasks, including:

	- Object detection
	- Image captioning
	- Visual question answering
	- OCR (Optical Character Recognition)
	- Region proposal
	- Segmentation
	- And many more vision tasks

	The model shows that a single architecture can be applied across multiple vision tasks, reducing the need for separate task-specific models.
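
One way to picture "a single architecture for many tasks" is that the task is selected by a prompt token rather than by swapping models. The sketch below is illustrative and is not taken from this Space's `app.py`: the token strings and the `microsoft/Florence-2-base` checkpoint name are assumed from the public Florence-2 model cards, and `TASK_PROMPTS`, `NEEDS_TEXT_INPUT`, `build_prompt`, and `run_task` are hypothetical names. Visual question answering is left out because this sketch does not assume a prompt token for it.

```python
# Task names from the list above mapped to Florence-2 task prompt tokens.
# Assumption: token strings follow the public microsoft/Florence-2 model cards;
# verify them against the checkpoint you actually load.
TASK_PROMPTS = {
    "Object detection": "<OD>",
    "Image captioning": "<CAPTION>",
    "OCR": "<OCR>",
    "Region proposal": "<REGION_PROPOSAL>",
    "Segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
}

# Tasks whose prompt token must be followed by free text
# (for example, the phrase describing the object to segment).
NEEDS_TEXT_INPUT = {"Segmentation"}


def build_prompt(task_name: str, text_input: str = "") -> str:
    """Return the prompt string for a task, appending extra text when required."""
    if task_name not in TASK_PROMPTS:
        raise KeyError(f"Unsupported task: {task_name!r}")
    token = TASK_PROMPTS[task_name]
    if task_name in NEEDS_TEXT_INPUT:
        if not text_input:
            raise ValueError(f"{task_name} needs a text input")
        return token + text_input
    return token


def run_task(image, task_name: str, text_input: str = "",
             model_id: str = "microsoft/Florence-2-base"):
    """Run one task on a PIL image. Downloads the checkpoint on first use."""
    # Imported lazily so the prompt helpers above work without these packages.
    from transformers import AutoModelForCausalLM, AutoProcessor

    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    prompt = build_prompt(task_name, text_input)
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    raw_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # Convert the generated token string into task-specific structured output.
    return processor.post_process_generation(
        raw_text,
        task=TASK_PROMPTS[task_name],
        image_size=(image.width, image.height),
    )
```

With a PIL image loaded as `image`, a call such as `run_task(image, "Object detection")` would reuse the same weights as `run_task(image, "OCR")`; only the prompt changes.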

	## Paper & Resources

	📄 CVPR 2024 Paper: [Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks](https://openaccess.thecvf.com/content/CVPR2024/papers/Xiao_Florence-2_Advancing_a_Unified_Representation_for_a_Variety_of_Vision_CVPR_2024_paper.pdf)

	🎥 CVPR Virtual Presentation: [https://cvpr.thecvf.com/virtual/2024/poster/30529](https://cvpr.thecvf.com/virtual/2024/poster/30529)

	🖼️ Research Poster: [Poster.png](./Poster.png)

	## Demo Features

	This Gradio demo allows you to:
	- Upload images and interact with Florence-2's various capabilities
	- Test different vision tasks on your own images
	- Experience the unified model's performance across multiple domains
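
The features above could be wired up in Gradio roughly as follows. This is a minimal sketch, not the Space's actual `app.py`: `describe_request` is a hypothetical placeholder handler that only reports what would be run, and `build_demo` is a hypothetical helper; a real app would call the model inside the handler.

```python
def describe_request(image, task_name: str) -> str:
    """Placeholder handler: reports which task would run on the uploaded image."""
    if image is None:
        return "Please upload an image first."
    return f"Would run '{task_name}' on a {image.width}x{image.height} image."


def build_demo(task_names):
    """Create a Gradio Blocks UI with an image input, a task selector, and a result box."""
    import gradio as gr  # imported lazily; requires `pip install gradio`

    choices = list(task_names)
    with gr.Blocks() as demo:
        with gr.Row():
            image_input = gr.Image(type="pil", label="Input image")
            task_input = gr.Dropdown(choices=choices, value=choices[0], label="Task")
        output = gr.Textbox(label="Result")
        run_button = gr.Button("Run")
        # Clicking the button passes the image and the selected task to the handler.
        run_button.click(describe_request, inputs=[image_input, task_input], outputs=output)
    return demo
```

Calling `build_demo(["Object detection", "Image captioning", "OCR"]).launch()` would start a local server with this interface.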

	## Getting Started

	1. Install the required dependencies:
	```bash
	pip install -r requirements.txt
	```

	2. Run the demo:
	```bash
	python app.py
	```

	3. Open your browser at the local URL that Gradio prints in the terminal to start using the demo.

	## References

	Hugging Face Spaces:
	- [Florence-2 Demo by gokaygokay](https://huggingface.co/spaces/gokaygokay/Florence-2)
	- [Florence-SAM Integration by SkalskiP](https://huggingface.co/spaces/SkalskiP/florence-sam)

	## Citation

	If you use this demo or find Florence-2 useful in your research, please cite:

	```bibtex
	@inproceedings{xiao2024florence,
	title={Florence-2: Advancing a unified representation for a variety of vision tasks},
	author={Xiao, Bin and Wu, Haiping and Xu, Weijian and Dai, Xiyang and Hu, Houdong and Lu, Yumao and Zeng, Michael and Liu, Ce and Yuan, Lu},
	booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
	pages={4818--4829},
	year={2024}
	}
	```