metadata
title: DeepVision Prompt Builder
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
🎯 DeepVision Prompt Builder
AI-Powered Image & Video Analysis with Automatic JSON Prompt Generation
Overview
DeepVision is a modular AI system that analyzes images and videos and generates structured JSON prompts from the results. Typical uses include:
- Automated image tagging
- Video content analysis
- AI training data preparation
- Media cataloging
- Creative prompt generation
Features
Available Plugins
- Color Analyzer (Fast): Extract dominant colors, color schemes, brightness, and saturation
- Object Detector (CLIP): Zero-shot object detection with confidence scores
- Caption Generator (BLIP-2): Natural language image descriptions
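To make the Color Analyzer's brightness and saturation fields concrete, here is a minimal sketch that uses only the standard library. It assumes brightness is the mean of the three RGB channels on a 0-255 scale and saturation is HSV saturation on a 0-1 scale. Those ranges match the example output in this README, but the definitions themselves are not confirmed by the project, and the function name `color_stats` is hypothetical.

```python
import colorsys


def color_stats(pixels):
    """Return average brightness (0-255) and HSV saturation (0-1) for RGB pixels.

    Assumption: brightness is the mean of the three channels. The real
    plugin may use a different definition (for example, luma).
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total_brightness = 0.0
    total_saturation = 0.0
    for r, g, b in pixels:
        total_brightness += (r + g + b) / 3
        # colorsys expects channel values in the range 0-1.
        _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_saturation += s
    n = len(pixels)
    return {
        "average_brightness": round(total_brightness / n, 1),
        "average_saturation": round(total_saturation / n, 2),
    }
```

In practice the pixel list would come from an image library such as Pillow or NumPy; plain tuples are used here so the sketch runs without extra dependencies.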
Supported Formats
- Images: JPG, PNG, WebP, BMP, GIF
- Videos: MP4, AVI, MOV, MKV
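One way to route an upload to the image or video pipeline is to check its file extension against the formats listed above. The sketch below is illustrative only: the set names and the `media_type` function are hypothetical, and the actual app may inspect file contents instead.

```python
from pathlib import Path

# Extensions taken from the Supported Formats list. The grouping into two
# sets and the function name are illustrative, not the project's actual code.
IMAGE_EXTENSIONS = {".jpg", ".png", ".webp", ".bmp", ".gif"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def media_type(filename):
    """Classify a file as "image" or "video" by extension, or return None."""
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```

The check is case-insensitive, so `photo.JPG` is treated as an image. Extensions that the list does not name, such as `.jpeg`, return `None` in this sketch.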
Usage
- Upload an image or video file
- Select which analysis plugins to use
- Click "Analyze" to process
- View results as a formatted summary or as raw JSON
- Download JSON output for use in other systems
Performance Notes
- Color Analyzer: ~1-2 seconds per image, lightweight
- Object Detector: The first use downloads a ~2 GB CLIP model; after that, ~5-10 seconds per image
- Caption Generator: The first use downloads a ~2-5 GB BLIP-2 model; after that, ~8-15 seconds per image
- Video Analysis: Processes N keyframes per video (N is configurable from 1 to 20)
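The README does not say how keyframes are chosen. As a minimal sketch, assuming they are sampled at even intervals across the video, the frame indices could be computed as follows. The function name `keyframe_indices` is hypothetical, and the project may use scene detection or another strategy instead.

```python
def keyframe_indices(total_frames, num_keyframes):
    """Pick evenly spaced frame indices from a video.

    Assumption: keyframes are sampled uniformly. This is one plausible
    strategy, not necessarily the one DeepVision uses.
    """
    if total_frames <= 0:
        return []
    # Clamp to the documented 1-20 range and never request more frames than exist.
    n = max(1, min(20, num_keyframes, total_frames))
    step = total_frames / n
    # Take the middle frame of each of the n equal-length segments.
    return [int(step * i + step / 2) for i in range(n)]


print(keyframe_indices(100, 4))  # [12, 37, 62, 87]
```

With OpenCV, each index could then be read by setting `cv2.CAP_PROP_POS_FRAMES` on a `cv2.VideoCapture` before calling `read()`.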
Example Output
{
  "results": {
    "color_analyzer": {
      "dominant_colors": [
        {"color": [45, 85, 125], "percentage": 35.2, "name": "blue"}
      ],
      "color_scheme": "cool",
      "average_brightness": 128.5,
      "average_saturation": 0.65
    }
  },
  "metadata": {
    "file": {
      "filename": "example.jpg",
      "size_mb": 2.4,
      "width": 1920,
      "height": 1080
    },
    "processing": {
      "duration_seconds": 1.234,
      "plugins_used": ["color_analyzer"]
    }
  }
}
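A downstream system can read this output with any JSON parser. The sketch below turns the `color_analyzer` section into a short text description. The key names come from the example above; the `describe_colors` function and the wording of its summary are hypothetical.

```python
import json


def describe_colors(report_json):
    """Summarize the color_analyzer section of a DeepVision-style report."""
    report = json.loads(report_json)
    colors = report["results"]["color_analyzer"]
    parts = [
        f'{c["name"]} ({c["percentage"]:.0f}%)'
        for c in colors["dominant_colors"]
    ]
    return f'{colors["color_scheme"]} palette dominated by ' + ", ".join(parts)


# Trimmed copy of the example output, keeping only the fields used here.
sample = """{"results": {"color_analyzer": {
  "dominant_colors": [{"color": [45, 85, 125], "percentage": 35.2, "name": "blue"}],
  "color_scheme": "cool", "average_brightness": 128.5, "average_saturation": 0.65}}}"""

print(describe_colors(sample))  # cool palette dominated by blue (35%)
```

The function raises `KeyError` if the report was produced without the Color Analyzer, so callers that select plugins dynamically may want to check `metadata.processing.plugins_used` first.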
Technology Stack
- Language: Python 3.10+
- UI: Gradio 4.44+
- CV: OpenCV, PIL, NumPy
- AI Models: CLIP, BLIP-2 (via Hugging Face Transformers)
- Logging: Loguru
Architecture
DeepVision uses a plugin-based architecture:
- Core Engine: Orchestrates analysis pipeline
- Plugin System: Modular, extensible analysis components
- Result Manager: Aggregates and formats outputs
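The three components above could fit together roughly as follows. This is a toy sketch of a plugin pattern, not the project's code: the `Plugin`, `SizePlugin`, and `Engine` classes and their methods are hypothetical, and result aggregation is folded into the engine for brevity. Only the shape of the returned dictionary follows the example output.

```python
import time


class Plugin:
    """Base class for analysis plugins (hypothetical interface)."""

    name = "plugin"

    def analyze(self, media):
        raise NotImplementedError


class SizePlugin(Plugin):
    """Toy plugin that reports the dimensions of a dict-based stand-in image."""

    name = "size"

    def analyze(self, media):
        return {"width": media["width"], "height": media["height"]}


class Engine:
    """Core engine: runs the selected plugins and aggregates their results."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        self._plugins[plugin.name] = plugin

    def run(self, media, selected):
        start = time.perf_counter()
        # Each plugin contributes one entry under "results", keyed by its name.
        results = {name: self._plugins[name].analyze(media) for name in selected}
        return {
            "results": results,
            "metadata": {
                "processing": {
                    "duration_seconds": round(time.perf_counter() - start, 3),
                    "plugins_used": list(selected),
                }
            },
        }


engine = Engine()
engine.register(SizePlugin())
report = engine.run({"width": 1920, "height": 1080}, ["size"])
print(report["results"])  # {'size': {'width': 1920, 'height': 1080}}
```

Keying plugins by name lets the UI pass the user's selection straight to `run`, and a new analyzer only has to subclass `Plugin` and be registered.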
Local Development
# Clone repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/deepvision
cd deepvision
# Install dependencies
pip install -r requirements.txt
# Run locally
python app.py
License
MIT License - Free to use and modify
Credits
Built by AI Dev Collective v9.0
- Astro (Lead Developer)
- Lyra (Research)
- Nexus (Code Quality)
- CryptoX (Security)
- NOVA (UI/UX)
- Echo (Performance)
- Sage (Documentation)
- Pulse (DevOps)
Links
- Full Documentation
- Report Issues
- Feature Requests
Version: 0.1.0
Last Updated: January 2025