Tonic
committed on
improve gradio blocks interface
app.py
CHANGED
@@ -12,12 +12,8 @@ import matplotlib.patches as patches
 from matplotlib.patches import Polygon
 import numpy as np
 import random
-import json
 
 
-with open("config.json", "r") as f:
-    config = json.load(f)
-
 d_model = config['text_config']['d_model']
 num_layers = config['text_config']['encoder_layers']
 attention_heads = config['text_config']['encoder_attention_heads']
@@ -32,10 +28,15 @@ temporal_embeddings = config['vision_config']['visual_temporal_embedding']['max_
 
 title = """# 🙋🏻♂️Welcome to Tonic's PLeIAs/📸📈✍🏻Florence-PDF"""
 description = """
----
-
 This application showcases the **PLeIAs/📸📈✍🏻Florence-PDF** model, a powerful AI system designed for both **text and image generation tasks**. The model is capable of handling complex tasks such as object detection, image captioning, OCR (Optical Character Recognition), and detailed region-based image analysis.
 
+### Model Usage and Flexibility
+
+- **No Repeat N-Grams**: To reduce repetition in text generation, the model is configured with a **no_repeat_ngram_size** of **{no_repeat_ngram_size}**, ensuring more diverse and meaningful outputs.
+- **Sampling Strategies**: 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF offers flexible sampling strategies, including **top-k** and **top-p (nucleus) sampling**, allowing for both creative and constrained generation based on user needs.
+
+📸📈✍🏻Florence-PDF is a robust model capable of handling various **text and image** tasks with high precision and flexibility, making it a valuable tool for both academic research and practical applications.
+
 ### **How to Use**:
 1. **Upload an Image**: Select an image for processing.
 2. **Choose a Task**: Pick a task from the dropdown menu, such as "Caption", "Object Detection", "OCR", etc.
@@ -50,8 +51,6 @@ You can reset the interface anytime by clicking the **Reset** button.
 - **📸✍🏻OCR**: Extract text from the image.
 - **📸Region Proposal**: Detect key regions in the image for detailed captioning.
 
----
-
 ### Join us :
 🌟TeamTonic🌟 is always making cool demos! Join our active builder's 🛠️community 👻 [](https://discord.gg/qdfnvSPcqP) On 🤗Huggingface:[MultiTransformer](https://huggingface.co/MultiTransformer) On 🌐Github: [Tonic-AI](https://github.com/tonic-ai) & contribute to🌟 [Build Tonic](https://git.tonic-ai.com/contribute)🤗Big thanks to Yuvi Sharma and all the folks at huggingface for the community grant 🤗
 """
@@ -77,12 +76,6 @@ In addition to text tasks, 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF also incor
 - **Patch-based Image Processing**: The vision component operates on image patches with a patch size of **{patch_size}x{patch_size}**.
 - **Temporal Embedding**: Visual tasks benefit from temporal embeddings with up to **{temporal_embeddings} steps**, making Florence-2 well-suited for video analysis.
 
-### Model Usage and Flexibility
-
-- **No Repeat N-Grams**: To reduce repetition in text generation, the model is configured with a **no_repeat_ngram_size** of **{no_repeat_ngram_size}**, ensuring more diverse and meaningful outputs.
-- **Sampling Strategies**: 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF offers flexible sampling strategies, including **top-k** and **top-p (nucleus) sampling**, allowing for both creative and constrained generation based on user needs.
-
-📸📈✍🏻Florence-PDF is a robust model capable of handling various **text and image** tasks with high precision and flexibility, making it a valuable tool for both academic research and practical applications.
 """
 
 device = "cuda" if torch.cuda.is_available() else "cpu"
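
Review note: the first hunk removes both `import json` and the `config.json` load, yet the unchanged lines that follow still index into `config`. Unless `config` is defined elsewhere in app.py (not visible in this diff), those lines would raise a `NameError` when the module is imported. Below is a minimal sketch of one way to restore the load, assuming the file keeps the `text_config` layout used in the diff. `load_model_config` and its `path` parameter are hypothetical names, not part of the original app.py.

```python
import json
from pathlib import Path


def load_model_config(path="config.json"):
    """Read the model config and return the text-encoder fields app.py uses.

    Hypothetical helper for illustration; the key names mirror the lookups
    shown in the diff (config['text_config'][...]).
    """
    # Open explicitly as UTF-8 so the result does not depend on the platform default.
    with Path(path).open("r", encoding="utf-8") as f:
        config = json.load(f)

    # A missing key raises KeyError here, at load time, rather than later in the app.
    text_config = config["text_config"]
    return {
        "d_model": text_config["d_model"],
        "num_layers": text_config["encoder_layers"],
        "attention_heads": text_config["encoder_attention_heads"],
    }
```

In app.py this could be called once near the imports, for example `fields = load_model_config()`, before any of the values are read.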