| title: Florence-2 Document & Image Analyzer | |
| emoji: ๐ | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: gradio | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: Analyze images and PDFs with Florence-2 vision model | |
| tags: | |
| - computer-vision | |
| - florence-2 | |
| - document-analysis | |
| - pdf-processing | |
| - image-analysis | |
| - object-detection | |
| # Florence-2 Document & Image Analyzer | |
| An interactive Hugging Face Space that uses Microsoft's Florence-2 vision model to analyze uploaded images and PDF documents. The application provides comprehensive visual analysis with bounding box overlays, object detection, and detailed captions. | |
| ## Features | |
| - **Multi-format Support**: Upload PNG, JPG, JPEG images or PDF documents | |
| - **PDF Processing**: Automatically converts PDF pages to images for analysis | |
| - **Florence-2 Integration**: Uses the Florence-2 model for: | |
|   - Object detection with bounding boxes | |
|   - Dense captioning | |
|   - OCR text detection | |
|   - Visual question answering | |
| - **Interactive Overlays**: View original and annotated versions side-by-side | |
| - **Batch Processing**: Handle multi-page PDFs efficiently | |
| - **User-Friendly Interface**: Clean Gradio interface with clear instructions | |
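Florence-2 selects a task through a special prompt token, so the analysis types listed above can be pictured as a small lookup table. The sketch below is illustrative only: the prompt tokens are the ones documented on the Florence-2 model card, while the UI labels, `TASK_PROMPTS`, `REGION_TASKS`, and `resolve_task` are hypothetical names and may not match what `app.py` uses. Visual question answering is left out because this README does not say which prompt the Space uses for it.

```python
# Task prompt tokens as documented on the Florence-2 model card.
# The UI labels (dictionary keys) are hypothetical names for this sketch.
TASK_PROMPTS = {
    "Object detection": "<OD>",
    "Dense region captions": "<DENSE_REGION_CAPTION>",
    "Caption": "<CAPTION>",
    "Detailed caption": "<DETAILED_CAPTION>",
    "OCR": "<OCR>",
    "OCR with regions": "<OCR_WITH_REGION>",
}

# Tasks whose parsed output includes regions that can be drawn as overlays.
REGION_TASKS = {"<OD>", "<DENSE_REGION_CAPTION>", "<OCR_WITH_REGION>"}


def resolve_task(label: str) -> tuple[str, bool]:
    """Return (task_prompt, has_regions) for a UI label; raise on unknown labels."""
    try:
        prompt = TASK_PROMPTS[label]
    except KeyError:
        raise ValueError(f"Unknown analysis type: {label!r}") from None
    return prompt, prompt in REGION_TASKS
```

Keeping the mapping in one place means the dropdown choices and the overlay logic cannot drift apart: a task either produces regions to draw or it produces text only.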
| ## How to Use | |
| 1. **Upload a file**: Choose an image (PNG/JPG/JPEG) or PDF document | |
| 2. **Select analysis type**: Choose from various Florence-2 tasks | |
| 3. **View results**: See original and annotated versions with overlays | |
| 4. **Download results**: Save processed images with annotations | |
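The upload step has to turn either an image or a PDF into a list of page images before any analysis runs. A minimal sketch of that step follows, assuming pdf2image (with Poppler installed) and Pillow are available; `upload_kind`, `load_pages`, and the default `dpi=200` are hypothetical choices for illustration, not necessarily what the Space does.

```python
from pathlib import Path

# Extensions the README lists as supported image formats.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def upload_kind(path: str) -> str:
    """Classify an upload as 'image' or 'pdf' by its file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix == ".pdf":
        return "pdf"
    raise ValueError(f"Unsupported file type: {path!r}")


def load_pages(path: str, dpi: int = 200) -> list:
    """Return a list of RGB PIL images: one per PDF page, or one for an image."""
    if upload_kind(path) == "pdf":
        # Imported lazily so image-only use does not need pdf2image/Poppler.
        from pdf2image import convert_from_path
        return [page.convert("RGB") for page in convert_from_path(path, dpi=dpi)]
    from PIL import Image
    return [Image.open(path).convert("RGB")]
```

Returning a list in both cases lets the rest of the pipeline treat a single image as a one-page document.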
| ## Model Information | |
| This Space uses Microsoft's Florence-2 model, a foundation vision model that can handle various computer vision and vision-language tasks with a single model architecture. | |
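Because one architecture serves every task, a single inference function can cover all of them by switching the prompt. The sketch below follows the usage pattern shown on the Florence-2 model card (`AutoModelForCausalLM` and `AutoProcessor` with `trust_remote_code=True`, then `post_process_generation`); the function names `pick_device`, `load_model`, and `run_task` and the generation settings are assumptions for illustration, not the Space's actual code.

```python
MODEL_ID = "microsoft/Florence-2-large"


def pick_device() -> str:
    """Prefer CUDA when PyTorch reports a GPU, otherwise fall back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_model(device: str = "cpu"):
    """Load the model and processor (downloads weights on first use)."""
    from transformers import AutoModelForCausalLM, AutoProcessor
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
    return model.to(device).eval(), processor


def run_task(image, task_prompt: str, model, processor, device: str = "cpu") -> dict:
    """Run one Florence-2 task on a PIL image and return the parsed result."""
    inputs = processor(text=task_prompt, images=image, return_tensors="pt").to(device)
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,  # assumed limit for this sketch
        num_beams=3,
    )
    # Special tokens are kept because the post-processor parses them.
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        text, task=task_prompt, image_size=(image.width, image.height)
    )
```

A call such as `run_task(image, "<OD>", model, processor, device)` returns a dictionary keyed by the task prompt; for object detection the value holds `bboxes` and `labels`.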
| ## Technical Details | |
| - **Framework**: Gradio 4.44.0 | |
| - **Model**: Microsoft Florence-2 (microsoft/Florence-2-large) | |
| - **PDF Processing**: pdf2image for page-by-page conversion | |
| - **Visualization**: PIL and OpenCV for overlay rendering | |
| - **Hardware**: Supports both CPU and GPU inference | |
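To make the overlay rendering concrete, here is a small sketch that draws detection results onto a copy of the page image with Pillow. It assumes the object-detection output format shown on the Florence-2 model card (a dictionary with `bboxes` as `[x1, y1, x2, y2]` lists and matching `labels`); `extract_boxes` and `draw_overlays` are hypothetical helper names, and the OpenCV path mentioned above is not covered here.

```python
def extract_boxes(parsed: dict, task_prompt: str, size: tuple[int, int]) -> list:
    """Return [(label, (x1, y1, x2, y2)), ...] with coordinates clamped to the image."""
    width, height = size
    result = parsed.get(task_prompt, {})
    if not isinstance(result, dict):
        return []  # caption-style tasks return plain text, so there is nothing to draw
    detections = []
    for box, label in zip(result.get("bboxes", []), result.get("labels", [])):
        x1, y1, x2, y2 = box
        clamped = (
            max(0, min(int(x1), width - 1)),
            max(0, min(int(y1), height - 1)),
            max(0, min(int(x2), width - 1)),
            max(0, min(int(y2), height - 1)),
        )
        detections.append((label, clamped))
    return detections


def draw_overlays(image, detections, color: str = "red"):
    """Return an annotated copy of a PIL image; the original is left untouched."""
    from PIL import ImageDraw  # imported lazily; requires Pillow
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    for label, (x1, y1, x2, y2) in detections:
        draw.rectangle([x1, y1, x2, y2], outline=color, width=2)
        draw.text((x1 + 3, max(0, y1 - 12)), label, fill=color)
    return annotated
```

Drawing on a copy keeps the original image available, which is what a side-by-side original/annotated view needs.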
| ## Examples | |
| Upload any document or image to see Florence-2 in action: | |
| - **Documents**: Analyze layouts, detect text regions, identify tables | |
| - **Photos**: Object detection, scene understanding, detailed captions | |
| - **Screenshots**: UI element detection, text extraction | |
| - **Technical diagrams**: Component identification and labeling | |
| # Florence-2 Document & Image Analyzer | |
| This Space uses Gradio to provide an interactive interface for Microsoft's Florence-2 vision model. | |
| ## Features | |
| - Object Detection with bounding boxes | |
| - Detailed image captioning | |
| - OCR text extraction | |
| - Interactive Gradio interface | |
| - Model caching for performance | |
| Upload an image and select an analysis type to get started! | |
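Model caching usually means loading the weights once per process and reusing them for every request. One common way to get that behavior is `functools.lru_cache`, sketched below; this is an assumption about the approach, not a description of how this Space implements its caching. The names `get_model` and `_load_florence` are hypothetical, and the `loader` parameter exists only so the cache can be exercised without downloading the model.

```python
from functools import lru_cache


def _load_florence(model_id: str):
    """Load model and processor from the Hub (slow; downloads weights on first use)."""
    from transformers import AutoModelForCausalLM, AutoProcessor
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    return model, processor


@lru_cache(maxsize=2)
def get_model(model_id: str = "microsoft/Florence-2-large", loader=_load_florence):
    """Return the loaded objects, calling the loader only once per (model_id, loader)."""
    return loader(model_id)
```

Repeated calls with the same arguments return the same objects, so only the first request pays the loading cost.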