Salman Abjam committed
Commit eb5a9e1 · 0 Parent(s)

Initial deployment: DeepVision Prompt Builder v0.1.0
.gitignore ADDED
@@ -0,0 +1,34 @@
# .gitignore for Hugging Face Space

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/

# Gradio cache
flagged/
gradio_cached_examples/

# Models cache
.cache/
models/

# Temporary files
*.tmp
*.log
test_results.json

# System files
.DS_Store
Thumbs.db

# IDE
.vscode/
.idea/
*.swp
*.swo
DEPLOYMENT_GUIDE.md ADDED
@@ -0,0 +1,204 @@
# 🚀 Hugging Face Spaces Deployment Guide

## Prerequisites

1. **Hugging Face Account**
   - Create an account at: https://huggingface.co/join
   - Verify your email

2. **Git LFS** (for large model files)
   ```bash
   git lfs install
   ```

---

## Step 1: Create New Space

1. Go to: https://huggingface.co/new-space
2. Fill in the details:
   - **Name**: `deepvision-prompt-builder`
   - **License**: MIT
   - **SDK**: Gradio
   - **SDK Version**: 4.44.0
   - **Hardware**: CPU Basic (free) or GPU (paid)
   - **Visibility**: Public or Private
3. Click "Create Space"

---

## Step 2: Clone and Setup

```bash
# Clone your space
git clone https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder
cd deepvision-prompt-builder

# Copy files from this directory
cp -r /path/to/huggingface_space/* .

# Add all files
git add .

# Commit
git commit -m "Initial deployment of DeepVision v0.1.0"

# Push to Hugging Face
git push
```

---

## Step 3: Directory Structure

Your Space should have this structure:

```
deepvision-prompt-builder/
├── app.py                 # Main Gradio application
├── README.md              # Space description (auto-displayed)
├── requirements.txt       # Python dependencies
├── .gitignore             # Git ignore rules
├── core/                  # Core engine
│   ├── __init__.py
│   ├── engine.py
│   ├── image_processor.py
│   ├── video_processor.py
│   ├── result_manager.py
│   ├── config.py
│   ├── exceptions.py
│   └── logging_config.py
└── plugins/               # Plugin system
    ├── __init__.py
    ├── base.py
    ├── loader.py
    ├── color_analyzer.py
    ├── object_detector.py
    └── caption_generator.py
```
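Before pushing, the layout can be sanity-checked programmatically. A minimal sketch — the `missing_files` helper and its file list are illustrative (they mirror the tree above) and are not part of the project:

```python
from pathlib import Path

# A few files the Space build expects, per the tree above.
REQUIRED = [
    "app.py",
    "README.md",
    "requirements.txt",
    "core/engine.py",
    "plugins/loader.py",
]


def missing_files(root: str) -> list[str]:
    """Return the required files that are absent under `root`."""
    base = Path(root)
    return [f for f in REQUIRED if not (base / f).exists()]
```

Running `missing_files(".")` inside the cloned Space should return an empty list before you `git push`.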

---

## Step 4: Test Locally First

Before deploying, test locally:

```bash
# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```

Open a browser at: http://localhost:7860

Test with a sample image to ensure everything works.

---

## Step 5: Monitor Deployment

After pushing:
1. Go to your Space URL: `https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder`
2. Watch the build logs in the "Logs" tab
3. Wait for "Running" status (green)
4. Test the live app

---

## Step 6: Configure Settings (Optional)

### Enable GPU (Paid)
1. Go to Space Settings
2. Select "GPU" hardware
3. Choose: T4 small ($0.60/hour) or A10G large ($3.15/hour)
4. Click "Save"

### Set Environment Variables
```bash
# In Space Settings → Environment Variables
HF_HOME=/data/.cache
TRANSFORMERS_CACHE=/data/.cache
```
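The same variables can also be set from Python at the top of the app, before any model library is imported. A sketch, assuming the Space has persistent storage mounted at `/data`:

```python
import os

# Point the Hugging Face caches at persistent storage so model
# downloads survive Space restarts. This must run before importing
# transformers, which reads these variables at import time.
os.environ.setdefault("HF_HOME", "/data/.cache")
os.environ.setdefault("TRANSFORMERS_CACHE", "/data/.cache")
```

`setdefault` keeps any value already set in Space Settings, so the two configuration paths don't fight each other.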

---

## Troubleshooting

### Build Fails
- Check requirements.txt for typos
- Verify all imports are correct
- Check the logs for specific errors

### Out of Memory
- Reduce the number of frames for video
- Disable heavy plugins (Object Detector, Caption Generator)
- Upgrade to GPU hardware

### Slow Performance
- The first run downloads models (~2-5GB) - this is normal
- Subsequent runs use cached models
- Consider a GPU upgrade for production use

---

## Cost Optimization

### Free Tier (CPU Basic)
- ✅ Color Analyzer works great
- ⚠️ Object Detector & Caption Generator are slow
- 📊 Suitable for demos and light usage

### Paid Tier (GPU)
- 💰 T4 Small: $0.60/hour (~$432/month if always on)
- 💰 A10G Large: $3.15/hour (~$2,268/month if always on)
- 💡 Use the "Pause Space" feature when not in use to save costs

---

## Update Space

```bash
# Make changes locally
# Test locally

# Commit and push
git add .
git commit -m "Update: description of changes"
git push

# Space will auto-rebuild
```

---

## Custom Domain (Optional)

Hugging Face Spaces provides:
- Default URL: `YOUR_USERNAME-deepvision-prompt-builder.hf.space`
- You can add a custom domain in Space Settings

---

## Next Steps

After successful deployment:
1. ✅ Share your Space URL
2. 📝 Write a blog post announcement
3. 🎥 Create a demo video
4. 📊 Monitor usage analytics
5. 🐛 Collect user feedback

---

## Resources

- 📚 [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
- 🎨 [Gradio Documentation](https://gradio.app/docs)
- 💬 [Community Forum](https://discuss.huggingface.co/)

---

**Ready to deploy! 🚀**
DEPLOY_NOW.md ADDED
@@ -0,0 +1,201 @@
# 🚀 Hugging Face Spaces Deployment Instructions

## ✅ Prerequisites Completed:
- ✅ Git repository initialized
- ✅ All code files ready
- ✅ Local testing passed

---

## 📋 Step-by-Step Deployment Guide

### Step 1: Create Hugging Face Account (if needed)
1. Go to: https://huggingface.co/join
2. Sign up with email or GitHub
3. Verify your email address

---

### Step 2: Create New Space

1. **Go to**: https://huggingface.co/new-space

2. **Fill in the details**:
   ```
   Owner: YOUR_USERNAME
   Space name: deepvision-prompt-builder
   License: MIT
   Select SDK: Gradio
   SDK Version: 4.44.0
   Space hardware: CPU basic (free)
   Visibility: Public
   ```

3. **Click**: "Create Space"

---

### Step 3: Clone Your New Space

```powershell
# Navigate to the parent directory
cd "E:\Ai\Projects\BRAINixIDEX\ThinkTank DVP"

# Clone your space (replace YOUR_USERNAME)
git clone https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder
```

---

### Step 4: Copy Files to the Cloned Space

```powershell
# Copy all files from huggingface_space to the cloned space
Copy-Item -Path "huggingface_space\*" -Destination "deepvision-prompt-builder\" -Recurse -Force

# Navigate to the cloned space
cd deepvision-prompt-builder
```

---

### Step 5: Configure Git and Push

```powershell
# Add all files
git add .

# Commit
git commit -m "Initial deployment: DeepVision Prompt Builder v0.1.0

Features:
- Gradio web interface
- Color Analyzer plugin (fast)
- Image and video support
- JSON output
- Real-time analysis"

# Push to Hugging Face
git push
```

**Note**: You'll be prompted for Hugging Face credentials:
- Username: your HF username
- Password: use an **HF Access Token** (not your account password) — see Step 6

---

### Step 6: Get a Hugging Face Access Token

1. Go to: https://huggingface.co/settings/tokens
2. Click "New token"
3. Name: `DeepVision Deploy`
4. Role: `write`
5. Click "Generate token"
6. **Copy the token** (you'll need it for `git push`)

---

### Step 7: Monitor Deployment

1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder`
2. Click the "Logs" tab
3. Wait for the build to complete (usually 2-5 minutes)
4. The status will change to "Running" (green)

---

### Step 8: Test Your Deployed App

1. Open your Space URL
2. Upload a test image
3. Enable the Color Analyzer
4. Click Analyze
5. Verify the results

---

## 🎉 Alternative: Quick Deploy (Manual Upload)

If you don't want to use Git:

1. Create the Space on Hugging Face
2. Click the "Files" tab
3. Click "Add file" → "Upload files"
4. Drag and drop ALL files from the `huggingface_space/` folder
5. Wait for the upload to complete
6. The Space will auto-rebuild

---

## 📊 Expected Results

**Build Time**: 2-5 minutes
**Startup Time**: 10-30 seconds
**URL**: `https://YOUR_USERNAME-deepvision-prompt-builder.hf.space`

---

## ⚙️ Configuration Options

### Enable GPU (Optional - Paid)

1. Go to Space Settings
2. Change Hardware to:
   - `CPU basic` (free) ✅ Recommended for demos
   - `T4 small` ($0.60/hour) - for faster ML plugins
   - `A10G large` ($3.15/hour) - for heavy usage

### Set as Private

1. Go to Space Settings
2. Change Visibility to "Private"
3. Only you can access it

---

## 🐛 Troubleshooting

### Build fails
- Check the logs for errors
- Verify all files are uploaded
- Check requirements.txt syntax

### App doesn't start
- Make sure the app listens on port 7860 (the HF Spaces default)
- Verify app.py has no syntax errors
- Check the logs for Python errors

### Slow performance
- Normal on the free CPU tier
- ML plugins (Object Detector, Caption Generator) will be slow
- The Color Analyzer should stay fast

---

## 📝 Post-Deployment

After successful deployment:

1. ✅ Test with multiple images
2. ✅ Share the URL
3. ✅ Update README.md with the live demo link
4. ✅ Monitor usage analytics
5. ✅ Collect user feedback

---

## 🔗 Useful Links

- **Your Space**: https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder
- **HF Docs**: https://huggingface.co/docs/hub/spaces
- **Gradio Docs**: https://gradio.app/docs
- **Support**: https://discuss.huggingface.co/

---

**Ready to deploy! 🚀**

Choose your method:
1. **Git Method** (recommended): follow Steps 1-8
2. **Manual Upload**: use the Alternative method above
README.md ADDED
@@ -0,0 +1,138 @@
---
title: DeepVision Prompt Builder
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
python_version: "3.10"
---

# 🎯 DeepVision Prompt Builder

**AI-Powered Image & Video Analysis with Automatic JSON Prompt Generation**

## Overview

DeepVision is a modular AI system that analyzes images and videos to generate structured JSON prompts. Perfect for:
- 📸 Automated image tagging
- 🎬 Video content analysis
- 🤖 AI training data preparation
- 📊 Media cataloging
- 🎨 Creative prompt generation

## Features

### Available Plugins

- **🎨 Color Analyzer** (Fast): Extracts dominant colors, color schemes, brightness, and saturation
- **🔍 Object Detector** (CLIP): Zero-shot object detection with confidence scores
- **💬 Caption Generator** (BLIP-2): Natural-language image descriptions

### Supported Formats

- **Images**: JPG, PNG, WebP, BMP, GIF
- **Videos**: MP4, AVI, MOV, MKV

## Usage

1. Upload an image or video file
2. Select which analysis plugins to use
3. Click "Analyze" to process
4. View results in formatted or JSON form
5. Download the JSON output for use in other systems

## Performance Notes

- **Color Analyzer**: ~1-2 seconds per image, lightweight
- **Object Detector**: First use downloads a ~2GB CLIP model, then ~5-10 seconds per image
- **Caption Generator**: First use downloads a ~2-5GB BLIP-2 model, then ~8-15 seconds per image
- **Video Analysis**: Processes N keyframes (configurable, 1-20 frames)
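Evenly spaced keyframe sampling of this kind can be sketched as follows. The helper name and strategy below are ours, illustrative only; the project's actual selection logic may differ:

```python
def keyframe_indices(total_frames: int, num_frames: int) -> list[int]:
    """Pick `num_frames` frame indices spread evenly across a video."""
    num_frames = max(1, min(num_frames, total_frames))
    step = total_frames / num_frames
    # Sample the middle of each segment to avoid duplicate endpoints.
    return [int(step * i + step / 2) for i in range(num_frames)]
```

For a 300-frame clip with `num_frames=5`, this yields the indices `[30, 90, 150, 210, 270]`.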

## Example Output

```json
{
  "results": {
    "color_analyzer": {
      "dominant_colors": [
        {"color": [45, 85, 125], "percentage": 35.2, "name": "blue"}
      ],
      "color_scheme": "cool",
      "average_brightness": 128.5,
      "average_saturation": 0.65
    }
  },
  "metadata": {
    "file": {
      "filename": "example.jpg",
      "size_mb": 2.4,
      "width": 1920,
      "height": 1080
    },
    "processing": {
      "duration_seconds": 1.234,
      "plugins_used": ["color_analyzer"]
    }
  }
}
```
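Downstream code can consume this output directly. A minimal sketch — the `top_color` helper is ours, assuming a result dict shaped like the example above:

```python
import json


def top_color(result_json: str) -> str:
    """Return the name of the most dominant color in a DeepVision result."""
    data = json.loads(result_json)
    colors = data["results"]["color_analyzer"]["dominant_colors"]
    # Each entry carries a `percentage`; take the largest.
    best = max(colors, key=lambda c: c["percentage"])
    return best["name"]
```

Applied to the example output above, `top_color` returns `"blue"`.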

## Technology Stack

- **Framework**: Python 3.10+
- **UI**: Gradio 4.44+
- **CV**: OpenCV, PIL, NumPy
- **AI Models**: CLIP, BLIP-2 (via Hugging Face Transformers)
- **Logging**: Loguru

## Architecture

DeepVision uses a plugin-based architecture:
- **Core Engine**: Orchestrates the analysis pipeline
- **Plugin System**: Modular, extensible analysis components
- **Result Manager**: Aggregates and formats outputs
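In outline, a plugin architecture of this shape reduces to a common interface plus a registry loop. The sketch below is illustrative; the project's real base class lives in `plugins/base.py` and may differ:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class AnalysisPlugin(ABC):
    """Common interface every analysis plugin implements (sketch)."""

    name: str = "base"
    enabled: bool = True

    @abstractmethod
    def analyze(self, image: Any) -> Dict[str, Any]:
        """Return this plugin's results for one image."""


class MeanBrightnessPlugin(AnalysisPlugin):
    """Toy plugin: average brightness of a sequence of grayscale pixels."""

    name = "mean_brightness"

    def analyze(self, image: Any) -> Dict[str, Any]:
        pixels = list(image)
        return {"average_brightness": sum(pixels) / len(pixels)}


def run_plugins(plugins, image):
    """The engine core in miniature: run every enabled plugin, merge results."""
    return {p.name: p.analyze(image) for p in plugins if p.enabled}
```

New analyses slot in by subclassing the interface; the engine loop and the result schema stay unchanged.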

## Local Development

```bash
# Clone the repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/deepvision
cd deepvision

# Install dependencies
pip install -r requirements.txt

# Run locally
python app.py
```

## License

MIT License - free to use and modify

## Credits

**Built by AI Dev Collective v9.0**
- Astro (Lead Developer)
- Lyra (Research)
- Nexus (Code Quality)
- CryptoX (Security)
- NOVA (UI/UX)
- Echo (Performance)
- Sage (Documentation)
- Pulse (DevOps)

## Links

- 📚 [Full Documentation](https://github.com/yourusername/deepvision)
- 🐛 [Report Issues](https://github.com/yourusername/deepvision/issues)
- 💡 [Feature Requests](https://github.com/yourusername/deepvision/discussions)

---

**Version**: 0.1.0
**Last Updated**: January 2025
app.py ADDED
@@ -0,0 +1,349 @@
"""
DeepVision Prompt Builder - Gradio Interface
Hugging Face Spaces Deployment

This is the main Gradio application for the DeepVision Prompt Builder.
It provides a web interface for uploading images/videos and viewing analysis results.
"""

import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Tuple

import gradio as gr
from loguru import logger

# Import core components
from core.engine import AnalysisEngine
from core.logging_config import setup_logging
from plugins.loader import PluginLoader

# Setup logging
setup_logging()


class DeepVisionGradioApp:
    """Gradio web interface for the DeepVision Prompt Builder."""

    def __init__(self):
        """Initialize the Gradio app."""
        self.engine = AnalysisEngine()
        self.plugin_loader = PluginLoader()
        self.setup_plugins()
        logger.info("DeepVision Gradio App initialized")

    def setup_plugins(self):
        """Load and register all available plugins."""
        try:
            # Load all plugins
            plugins = self.plugin_loader.load_all_plugins()

            # Register plugins with the engine
            for name, plugin in plugins.items():
                self.engine.register_plugin(name, plugin)
                logger.info(f"Plugin registered: {name}")

            logger.success(f"Loaded {len(plugins)} plugins successfully")
        except Exception as e:
            logger.error(f"Error loading plugins: {e}")

    def analyze_media(
        self,
        file_path: str,
        use_color_analyzer: bool = True,
        use_object_detector: bool = False,
        use_caption_generator: bool = False,
        num_frames: int = 5
    ) -> Tuple[str, str]:
        """
        Analyze an uploaded image or video.

        Args:
            file_path: Path to the uploaded file
            use_color_analyzer: Enable color analysis
            use_object_detector: Enable object detection (heavy)
            use_caption_generator: Enable caption generation (heavy)
            num_frames: Number of frames to extract from a video

        Returns:
            Tuple of (formatted results text, JSON string)
        """
        try:
            logger.info(f"Analyzing file: {file_path}")

            # Enable/disable plugins based on user selection
            self._configure_plugins(
                use_color_analyzer,
                use_object_detector,
                use_caption_generator
            )

            # Detect the file type and analyze
            file_path_obj = Path(file_path)

            if file_path_obj.suffix.lower() in ['.mp4', '.avi', '.mov', '.mkv']:
                # Video analysis
                logger.info(f"Processing video with {num_frames} frames")
                results = self.engine.analyze_video(
                    file_path,
                    extract_method="keyframes",
                    num_frames=num_frames
                )
            else:
                # Image analysis
                logger.info("Processing image")
                results = self.engine.analyze_image(file_path)

            # Format results for display
            formatted_text = self._format_results(results)
            json_output = json.dumps(results, indent=2, ensure_ascii=False)

            logger.success("Analysis completed successfully")
            return formatted_text, json_output

        except Exception as e:
            logger.error(f"Analysis error: {e}")
            error_msg = f"❌ Error: {str(e)}"
            error_json = json.dumps({"error": str(e)}, indent=2)
            return error_msg, error_json

    def _configure_plugins(
        self,
        use_color: bool,
        use_object: bool,
        use_caption: bool
    ):
        """Enable/disable plugins based on user selection."""
        plugin_config = {
            'color_analyzer': use_color,
            'object_detector': use_object,
            'caption_generator': use_caption
        }

        for plugin_name, enabled in plugin_config.items():
            if plugin_name in self.engine.plugins:
                self.engine.plugins[plugin_name].enabled = enabled
                logger.info(f"Plugin '{plugin_name}': {'enabled' if enabled else 'disabled'}")

    def _format_results(self, results: Dict[str, Any]) -> str:
        """Format analysis results as readable Markdown."""
        lines = ["# 🎯 Analysis Results\n"]

        # File metadata
        if "metadata" in results and "file" in results["metadata"]:
            meta = results["metadata"]["file"]
            lines.append("## 📁 File Information")
            lines.append(f"- **Filename**: {meta.get('filename', 'N/A')}")
            lines.append(f"- **Type**: {meta.get('type', 'N/A')}")
            lines.append(f"- **Size**: {meta.get('size_mb', 0):.2f} MB")

            if meta.get('type') == 'video':
                lines.append(f"- **Resolution**: {meta.get('width')}x{meta.get('height')}")
                lines.append(f"- **Duration**: {meta.get('duration', 0):.2f} seconds")
                lines.append(f"- **FPS**: {meta.get('fps', 0):.2f}")
            else:
                lines.append(f"- **Resolution**: {meta.get('width')}x{meta.get('height')}")
            lines.append("")

        # Processing info
        if "metadata" in results and "processing" in results["metadata"]:
            proc = results["metadata"]["processing"]
            lines.append("## ⚡ Processing Information")
            lines.append(f"- **Duration**: {proc.get('duration_seconds', 0):.3f} seconds")
            lines.append(f"- **Plugins Used**: {', '.join(proc.get('plugins_used', []))}")
            if proc.get('frames_extracted'):
                lines.append(f"- **Frames Analyzed**: {proc.get('frames_extracted')}")
            lines.append("")

        # Analysis results
        if "results" in results:
            res = results["results"]

            # For videos
            if "frames" in res:
                lines.append(f"## 🎬 Video Analysis ({len(res['frames'])} frames)")

                # Summary
                if "summary" in res:
                    for plugin_name, summary_data in res["summary"].items():
                        lines.append(f"\n### {plugin_name.replace('_', ' ').title()}")
                        lines.append(f"```json\n{json.dumps(summary_data, indent=2, ensure_ascii=False)}\n```")

            # For images
            else:
                lines.append("## 🖼️ Image Analysis")
                for plugin_name, plugin_data in res.items():
                    lines.append(f"\n### {plugin_name.replace('_', ' ').title()}")
                    lines.append(f"```json\n{json.dumps(plugin_data, indent=2, ensure_ascii=False)}\n```")

        return "\n".join(lines)

    def create_interface(self) -> gr.Blocks:
        """Create and return the Gradio interface."""

        with gr.Blocks(
            title="DeepVision Prompt Builder",
            theme="soft",
            css="""
            .output-text { font-family: 'Courier New', monospace; }
            .json-output { font-size: 12px; }
            """
        ) as demo:

            # Header
            gr.Markdown("""
            # 🎯 DeepVision Prompt Builder
            ### AI-Powered Image & Video Analysis with JSON Prompt Generation

            Upload an image or video to analyze its content and generate structured JSON prompts.
            """)

            with gr.Row():
                with gr.Column(scale=1):
                    # Input section
                    gr.Markdown("## 📤 Upload Media")

                    file_input = gr.File(
                        label="Upload Image or Video",
                        file_types=["image", "video"],
                        type="filepath"
                    )

                    gr.Markdown("### 🔌 Plugin Configuration")

                    color_checkbox = gr.Checkbox(
                        label="🎨 Color Analyzer (Fast)",
                        value=True,
                        info="Extract dominant colors and color schemes"
                    )

                    object_checkbox = gr.Checkbox(
                        label="🔍 Object Detector (Slow - CLIP)",
                        value=False,
                        info="Detect objects using the CLIP model (~2-5GB download)"
                    )

                    caption_checkbox = gr.Checkbox(
                        label="💬 Caption Generator (Slow - BLIP-2)",
                        value=False,
                        info="Generate image captions (~2-5GB download)"
                    )

                    frames_slider = gr.Slider(
                        minimum=1,
                        maximum=20,
                        value=5,
                        step=1,
                        label="📹 Video Frames to Extract",
                        info="More frames = more accurate but slower"
                    )

                    analyze_btn = gr.Button(
                        "🚀 Analyze",
                        variant="primary",
                        size="lg"
                    )

                with gr.Column(scale=2):
                    # Output section
                    gr.Markdown("## 📊 Analysis Results")

                    with gr.Tabs():
                        with gr.Tab("📝 Formatted"):
                            output_text = gr.Markdown(
                                label="Results",
                                elem_classes=["output-text"]
                            )

                        with gr.Tab("📋 JSON"):
                            output_json = gr.Code(
                                label="JSON Output",
                                language="json",
                                elem_classes=["json-output"],
                                lines=20
                            )

                    download_btn = gr.DownloadButton(
                        label="💾 Download JSON",
                        visible=False
                    )

            # Examples
            gr.Markdown("## 💡 Example Usage")
            gr.Markdown("""
            1. **Quick Test**: Upload an image with only the Color Analyzer enabled
            2. **Full Analysis**: Enable all plugins (requires model downloads)
            3. **Video Analysis**: Upload a video and adjust the frame count

            **Note**: First-time use of the Object Detector and Caption Generator will download ~2-5GB of models.
            """)

            # Footer
            gr.Markdown("""
            ---
            **DeepVision Prompt Builder v0.1.0** | Built with ❤️ by AI Dev Collective

            📚 [Documentation](https://github.com/yourusername/deepvision) |
            🐛 [Report Issues](https://github.com/yourusername/deepvision/issues)
            """)

            # Event handlers
            def analyze_and_prepare_download(file, color, obj, cap, frames):
                """Analyze and prepare results for download."""
                if file is None:
                    return "⚠️ Please upload a file first", "{}", gr.update(visible=False)

                text_result, json_result = self.analyze_media(
                    file, color, obj, cap, frames
                )

                # Save the JSON to a temp file for download
                temp_file = tempfile.NamedTemporaryFile(
                    mode='w',
                    suffix='.json',
                    delete=False,
                    encoding='utf-8'
                )
                temp_file.write(json_result)
                temp_file.close()

                return (
                    text_result,
                    json_result,
                    gr.update(visible=True, value=temp_file.name)
                )

            analyze_btn.click(
                fn=analyze_and_prepare_download,
                inputs=[
                    file_input,
                    color_checkbox,
                    object_checkbox,
                    caption_checkbox,
                    frames_slider
                ],
                outputs=[output_text, output_json, download_btn]
            )

        return demo


def main():
    """Main entry point for the Gradio app."""
    app = DeepVisionGradioApp()
    demo = app.create_interface()

    # Launch the app. Hugging Face Spaces expects the server to listen on
    # 0.0.0.0:7860; binding to 127.0.0.1 with an auto-selected port only
    # works for local testing.
    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
        share=False,      # Set to True for a temporary public link
        show_error=True
    )


if __name__ == "__main__":
    main()
core/__init__.py ADDED
@@ -0,0 +1,21 @@
"""
DeepVision Prompt Builder - Core Engine

This module contains the core functionality for analyzing images and videos,
managing plugins, and generating structured JSON prompts.
"""

__version__ = "0.1.0"
__author__ = "AI Dev Collective v9.0"

from core.engine import AnalysisEngine
from core.image_processor import ImageProcessor
from core.video_processor import VideoProcessor
from core.result_manager import ResultManager

__all__ = [
    "AnalysisEngine",
    "ImageProcessor",
    "VideoProcessor",
    "ResultManager",
]
core/config.py ADDED
@@ -0,0 +1,131 @@
1
+ """
2
+ Configuration module for DeepVision Core Engine.
3
+
4
+ Manages all configuration settings including paths, model settings,
5
+ processing parameters, and resource limits.
6
+ """
7
+
8
+ import os
9
+ from pathlib import Path
10
+ from typing import Dict, List, Optional
11
+ from pydantic_settings import BaseSettings
12
+ from pydantic import Field
13
+
14
+
15
+ class CoreConfig(BaseSettings):
16
+ """Core configuration settings."""
17
+
18
+ # Application
19
+ APP_NAME: str = "DeepVision Prompt Builder"
20
+ APP_VERSION: str = "0.1.0"
21
+ DEBUG: bool = Field(default=False, env="DEBUG")
22
+
23
+ # Paths
24
+ BASE_DIR: Path = Path(__file__).parent.parent
25
+ UPLOAD_DIR: Path = Field(default=Path("/var/uploads"), env="UPLOAD_DIR")
26
+ CACHE_DIR: Path = Field(default=Path("/var/cache"), env="CACHE_DIR")
27
+ MODEL_DIR: Path = Field(default=Path("models"), env="MODEL_DIR")
28
+
29
+ # File Processing
30
+ MAX_IMAGE_SIZE: int = Field(default=50 * 1024 * 1024, env="MAX_IMAGE_SIZE") # 50MB
31
+ MAX_VIDEO_SIZE: int = Field(default=200 * 1024 * 1024, env="MAX_VIDEO_SIZE") # 200MB
32
+ ALLOWED_IMAGE_FORMATS: List[str] = [".jpg", ".jpeg", ".png", ".gif", ".webp"]
33
+ ALLOWED_VIDEO_FORMATS: List[str] = [".mp4", ".mov", ".avi"]
34
+
35
+ # Image Processing
36
+ IMAGE_MAX_DIMENSION: int = 2048 # Max width or height
37
+ IMAGE_QUALITY: int = 85 # JPEG quality
38
+ DEFAULT_IMAGE_SIZE: tuple = (512, 512) # Default resize
39
+
40
+ # Video Processing
41
+ VIDEO_FPS_EXTRACTION: int = 1 # Extract 1 frame per second
42
+ MAX_FRAMES_PER_VIDEO: int = 100 # Maximum frames to extract
43
+
44
+ # Model Settings
45
+ DEVICE: str = Field(default="cpu", env="DEVICE") # cpu or cuda
46
+ MODEL_BATCH_SIZE: int = 4
47
+    MODEL_CACHE_SIZE: int = 3  # Max models in memory
+
+    # Performance
+    MAX_WORKERS: int = Field(default=4, env="MAX_WORKERS")
+    ENABLE_CACHING: bool = True
+    CACHE_TTL: int = 3600  # Cache time-to-live in seconds
+
+    # Output
+    OUTPUT_FORMAT: str = "json"  # "json" or "dict"
+    PRETTY_JSON: bool = True
+    INCLUDE_METADATA: bool = True
+
+    class Config:
+        env_file = ".env"
+        env_file_encoding = "utf-8"
+        case_sensitive = True
+
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+        # Create directories if they don't exist
+        self.UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
+        self.CACHE_DIR.mkdir(parents=True, exist_ok=True)
+        self.MODEL_DIR.mkdir(parents=True, exist_ok=True)
+
+
+# Global config instance
+config = CoreConfig()
+
+
+# Model configurations
+MODEL_CONFIGS: Dict[str, Dict] = {
+    "clip": {
+        "name": "openai/clip-vit-base-patch32",
+        "task": "feature_extraction",
+        "device": config.DEVICE,
+    },
+    "blip2": {
+        "name": "Salesforce/blip2-opt-2.7b",
+        "task": "image_captioning",
+        "device": config.DEVICE,
+    },
+    "sam": {
+        "name": "facebook/sam-vit-base",
+        "task": "segmentation",
+        "device": config.DEVICE,
+    },
+}
+
+
+# Plugin configurations
+PLUGIN_CONFIGS: Dict[str, Dict] = {
+    "object_detector": {
+        "enabled": True,
+        "model": "clip",
+        "confidence_threshold": 0.5,
+    },
+    "caption_generator": {
+        "enabled": True,
+        "model": "blip2",
+        "max_length": 50,
+    },
+    "color_analyzer": {
+        "enabled": True,
+        "num_colors": 5,
+    },
+    "text_extractor": {
+        "enabled": False,  # Requires OCR model
+        "model": "easyocr",
+    },
+    "emotion_reader": {
+        "enabled": False,  # Requires face detection model
+        "model": "deepface",
+    },
+}
+
+
+def get_plugin_config(plugin_name: str) -> Optional[Dict]:
+    """Get configuration for a specific plugin."""
+    return PLUGIN_CONFIGS.get(plugin_name)
+
+
+def is_plugin_enabled(plugin_name: str) -> bool:
+    """Check if a plugin is enabled."""
+    plugin_config = get_plugin_config(plugin_name)
+    return plugin_config.get("enabled", False) if plugin_config else False
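The two helpers above give the engine a uniform lookup path into `PLUGIN_CONFIGS`: unknown plugins and plugins without an `"enabled"` key both count as disabled. A minimal standalone sketch of that pattern (with an inlined stand-in table, not the real `core/config.py` module):

```python
from typing import Dict, Optional

# Stand-in for PLUGIN_CONFIGS; the real table lives in core/config.py
PLUGIN_CONFIGS: Dict[str, Dict] = {
    "color_analyzer": {"enabled": True, "num_colors": 5},
    "text_extractor": {"enabled": False, "model": "easyocr"},
}

def get_plugin_config(plugin_name: str) -> Optional[Dict]:
    """Return the config dict for a plugin, or None if unknown."""
    return PLUGIN_CONFIGS.get(plugin_name)

def is_plugin_enabled(plugin_name: str) -> bool:
    """Missing plugins and missing 'enabled' keys default to disabled."""
    plugin_config = get_plugin_config(plugin_name)
    return plugin_config.get("enabled", False) if plugin_config else False

print(is_plugin_enabled("color_analyzer"))  # True
print(is_plugin_enabled("text_extractor"))  # False
print(is_plugin_enabled("nonexistent"))     # False
```

Defaulting to `False` means a typo in a plugin name silently disables it rather than crashing, which is the safer failure mode at startup.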
core/engine.py ADDED
@@ -0,0 +1,471 @@
+"""
+Core Analysis Engine
+
+Main orchestration engine for DeepVision Prompt Builder.
+Manages image/video processing, plugin execution, and result generation.
+"""
+
+from datetime import datetime
+from pathlib import Path
+from typing import Dict, List, Any, Optional, Union
+from loguru import logger
+
+from core.config import config
+from core.image_processor import ImageProcessor
+from core.video_processor import VideoProcessor
+from core.result_manager import ResultManager
+from core.exceptions import DeepVisionError
+
+
+class AnalysisEngine:
+    """
+    Main analysis engine for processing images and videos.
+
+    Orchestrates the complete analysis pipeline:
+    1. File validation and preprocessing
+    2. Plugin execution
+    3. Result aggregation
+    4. JSON output generation
+    """
+
+    def __init__(self):
+        """Initialize AnalysisEngine."""
+        self.image_processor = ImageProcessor()
+        self.video_processor = VideoProcessor()
+        self.result_manager = ResultManager()
+        self.plugins: Dict[str, Any] = {}
+        self.plugin_order: List[str] = []
+
+        logger.info(f"AnalysisEngine initialized - {config.APP_NAME} v{config.APP_VERSION}")
+
+    def register_plugin(self, plugin_name: str, plugin_instance: Any) -> None:
+        """
+        Register a plugin for analysis.
+
+        Args:
+            plugin_name: Unique name for the plugin
+            plugin_instance: Instance of the plugin class
+        """
+        if plugin_name in self.plugins:
+            logger.warning(f"Plugin '{plugin_name}' already registered, replacing")
+
+        self.plugins[plugin_name] = plugin_instance
+
+        # Maintain execution order
+        if plugin_name not in self.plugin_order:
+            self.plugin_order.append(plugin_name)
+
+        logger.info(f"Registered plugin: {plugin_name}")
+
+    def unregister_plugin(self, plugin_name: str) -> None:
+        """
+        Unregister a plugin.
+
+        Args:
+            plugin_name: Name of plugin to remove
+        """
+        if plugin_name in self.plugins:
+            del self.plugins[plugin_name]
+
+        if plugin_name in self.plugin_order:
+            self.plugin_order.remove(plugin_name)
+
+        logger.info(f"Unregistered plugin: {plugin_name}")
+
+    def get_registered_plugins(self) -> List[str]:
+        """
+        Get list of registered plugins.
+
+        Returns:
+            List of plugin names
+        """
+        return list(self.plugins.keys())
+
+    def analyze_image(
+        self,
+        image_path: Union[str, Path],
+        plugins: Optional[List[str]] = None,
+        **kwargs
+    ) -> Dict[str, Any]:
+        """
+        Analyze a single image.
+
+        Args:
+            image_path: Path to image file
+            plugins: List of plugin names to use (None for all)
+            **kwargs: Additional arguments for processing
+
+        Returns:
+            Analysis results dictionary
+        """
+        start_time = datetime.now()
+        image_path = Path(image_path)
+
+        logger.info(f"Starting image analysis: {image_path.name}")
+
+        try:
+            # Clear previous results
+            self.result_manager.clear()
+
+            # Process image
+            image = self.image_processor.process(
+                image_path,
+                resize=kwargs.get("resize", True),
+                normalize=kwargs.get("normalize", False)
+            )
+
+            # Get image info
+            image_info = self.image_processor.get_image_info(image_path)
+
+            # Set file metadata
+            self.result_manager.set_file_info(
+                filename=image_info["filename"],
+                file_type="image",
+                file_size=image_info["file_size"],
+                width=image_info["width"],
+                height=image_info["height"],
+                format=image_info["format"],
+                hash=image_info["hash"],
+            )
+
+            # Execute plugins
+            plugins_used = self._execute_plugins(
+                image,
+                image_path,
+                plugins,
+                media_type="image"
+            )
+
+            # Set processing metadata
+            end_time = datetime.now()
+            self.result_manager.set_processing_info(
+                start_time=start_time,
+                end_time=end_time,
+                plugins_used=plugins_used
+            )
+
+            # Get final results
+            results = self.result_manager.to_dict(
+                include_metadata=config.INCLUDE_METADATA
+            )
+
+            logger.info(f"Image analysis completed: {image_path.name} "
+                        f"({len(plugins_used)} plugins)")
+
+            return results
+
+        except DeepVisionError:
+            # Re-raise domain errors as-is instead of double-wrapping them
+            raise
+        except Exception as e:
+            logger.error(f"Image analysis failed: {e}")
+            raise DeepVisionError(
+                f"Analysis failed for {image_path.name}: {str(e)}",
+                {"path": str(image_path), "error": str(e)}
+            ) from e
+
+    def analyze_video(
+        self,
+        video_path: Union[str, Path],
+        plugins: Optional[List[str]] = None,
+        extract_method: str = "keyframes",
+        num_frames: int = 5,
+        **kwargs
+    ) -> Dict[str, Any]:
+        """
+        Analyze a video by extracting and analyzing frames.
+
+        Args:
+            video_path: Path to video file
+            plugins: List of plugin names to use
+            extract_method: Frame extraction method ("fps" or "keyframes")
+            num_frames: Number of frames to extract
+            **kwargs: Additional arguments
+
+        Returns:
+            Analysis results dictionary
+        """
+        start_time = datetime.now()
+        video_path = Path(video_path)
+
+        logger.info(f"Starting video analysis: {video_path.name}")
+
+        try:
+            # Clear previous results
+            self.result_manager.clear()
+
+            # Get video info
+            video_info = self.video_processor.get_video_info(video_path)
+
+            # Set file metadata
+            self.result_manager.set_file_info(
+                filename=video_info["filename"],
+                file_type="video",
+                file_size=video_info["file_size"],
+                width=video_info["width"],
+                height=video_info["height"],
+                fps=video_info["fps"],
+                duration=video_info["duration"],
+                frame_count=video_info["frame_count"],
+            )
+
+            # Extract frames
+            if extract_method == "keyframes":
+                frame_paths = self.video_processor.extract_key_frames(
+                    video_path,
+                    num_frames=num_frames
+                )
+            else:
+                frame_paths = self.video_processor.extract_frames(
+                    video_path,
+                    max_frames=num_frames,
+                    **kwargs
+                )
+
+            logger.info(f"Extracted {len(frame_paths)} frames from video")
+
+            # Analyze each frame
+            frame_results = []
+            plugins_used: List[str] = []  # stays empty if no frames were extracted
+            for idx, frame_path in enumerate(frame_paths):
+                logger.info(f"Analyzing frame {idx + 1}/{len(frame_paths)}")
+
+                # Process frame
+                image = self.image_processor.process(frame_path, resize=True)
+
+                # Execute plugins on frame
+                plugins_used = self._execute_plugins(
+                    image,
+                    frame_path,
+                    plugins,
+                    media_type="video_frame"
+                )
+
+                # Get frame results
+                frame_result = {
+                    "frame_index": idx,
+                    "frame_path": str(frame_path.name),
+                    "results": dict(self.result_manager.results)
+                }
+                frame_results.append(frame_result)
+
+                # Clear for next frame
+                self.result_manager.results.clear()
+
+            # Aggregate frame results
+            aggregated = self._aggregate_video_results(frame_results)
+
+            # Set aggregated results
+            self.result_manager.results = aggregated
+
+            # Set processing metadata
+            end_time = datetime.now()
+            self.result_manager.set_processing_info(
+                start_time=start_time,
+                end_time=end_time,
+                plugins_used=plugins_used
+            )
+
+            # Add video-specific metadata
+            self.result_manager.add_metadata({
+                "frames_analyzed": len(frame_paths),
+                "extraction_method": extract_method,
+            })
+
+            # Get final results
+            results = self.result_manager.to_dict(
+                include_metadata=config.INCLUDE_METADATA
+            )
+
+            logger.info(f"Video analysis completed: {video_path.name} "
+                        f"({len(frame_paths)} frames, {len(plugins_used)} plugins)")
+
+            return results
+
+        except DeepVisionError:
+            # Re-raise domain errors as-is instead of double-wrapping them
+            raise
+        except Exception as e:
+            logger.error(f"Video analysis failed: {e}")
+            raise DeepVisionError(
+                f"Analysis failed for {video_path.name}: {str(e)}",
+                {"path": str(video_path), "error": str(e)}
+            ) from e
+
+    def _execute_plugins(
+        self,
+        media,
+        media_path: Path,
+        plugin_names: Optional[List[str]] = None,
+        media_type: str = "image"
+    ) -> List[str]:
+        """
+        Execute registered plugins on media.
+
+        Args:
+            media: Processed media (image or frame)
+            media_path: Path to media file
+            plugin_names: List of plugins to execute (None for all)
+            media_type: Type of media being processed
+
+        Returns:
+            List of executed plugin names
+        """
+        # Determine which plugins to execute
+        if plugin_names is None:
+            plugins_to_run = self.plugin_order
+        else:
+            plugins_to_run = [
+                p for p in self.plugin_order if p in plugin_names
+            ]
+
+        executed = []
+
+        for plugin_name in plugins_to_run:
+            if plugin_name not in self.plugins:
+                logger.warning(f"Plugin '{plugin_name}' not found, skipping")
+                continue
+
+            try:
+                logger.debug(f"Executing plugin: {plugin_name}")
+
+                plugin = self.plugins[plugin_name]
+
+                # Execute plugin
+                result = plugin.analyze(media, media_path)
+
+                # Add result
+                self.result_manager.add_result(plugin_name, result)
+
+                executed.append(plugin_name)
+
+                logger.debug(f"Plugin '{plugin_name}' completed successfully")
+
+            except Exception as e:
+                logger.error(f"Plugin '{plugin_name}' failed: {e}")
+
+                # Add error to results
+                self.result_manager.add_result(
+                    plugin_name,
+                    {
+                        "error": str(e),
+                        "status": "failed"
+                    }
+                )
+
+        return executed
+
+    def _aggregate_video_results(
+        self,
+        frame_results: List[Dict[str, Any]]
+    ) -> Dict[str, Any]:
+        """
+        Aggregate results from multiple video frames.
+
+        Args:
+            frame_results: List of results from each frame
+
+        Returns:
+            Aggregated results dictionary
+        """
+        aggregated = {
+            "frames": frame_results,
+            "summary": {}
+        }
+
+        # For each plugin, aggregate results across frames
+        if not frame_results:
+            return aggregated
+
+        # Get plugin names from first frame
+        first_frame = frame_results[0]["results"]
+
+        for plugin_name in first_frame.keys():
+            plugin_summary = self._aggregate_plugin_results(
+                plugin_name,
+                [f["results"].get(plugin_name, {}) for f in frame_results]
+            )
+            aggregated["summary"][plugin_name] = plugin_summary
+
+        return aggregated
+
+    def _aggregate_plugin_results(
+        self,
+        plugin_name: str,
+        results: List[Dict[str, Any]]
+    ) -> Dict[str, Any]:
+        """
+        Aggregate results for a specific plugin across frames.
+
+        Args:
+            plugin_name: Name of the plugin
+            results: List of results from each frame
+
+        Returns:
+            Aggregated result for the plugin
+        """
+        # Default aggregation: collect all unique values
+        aggregated = {
+            "frames_processed": len(results),
+        }
+
+        # Plugin-specific aggregation logic
+        if plugin_name == "object_detector":
+            all_objects = []
+            for result in results:
+                all_objects.extend(result.get("objects", []))
+
+            # Count object occurrences
+            object_counts = {}
+            for obj in all_objects:
+                name = obj["name"]
+                object_counts[name] = object_counts.get(name, 0) + 1
+
+            aggregated["total_objects"] = len(all_objects)
+            aggregated["unique_objects"] = len(object_counts)
+            aggregated["object_frequency"] = object_counts
+
+        elif plugin_name == "caption_generator":
+            captions = [r.get("caption", "") for r in results if r.get("caption")]
+            aggregated["captions"] = captions
+            aggregated["caption_count"] = len(captions)
+
+        elif plugin_name == "color_analyzer":
+            all_colors = []
+            for result in results:
+                all_colors.extend(result.get("dominant_colors", []))
+
+            # Get most frequent colors
+            color_counts = {}
+            for color in all_colors:
+                name = color["name"]
+                color_counts[name] = color_counts.get(name, 0) + 1
+
+            aggregated["color_frequency"] = color_counts
+
+        return aggregated
+
+    def analyze(
+        self,
+        file_path: Union[str, Path],
+        **kwargs
+    ) -> Dict[str, Any]:
+        """
+        Automatically detect file type and analyze.
+
+        Args:
+            file_path: Path to image or video file
+            **kwargs: Additional arguments
+
+        Returns:
+            Analysis results
+        """
+        file_path = Path(file_path)
+
+        # Detect file type
+        ext = file_path.suffix.lower()
+
+        if ext in config.ALLOWED_IMAGE_FORMATS:
+            return self.analyze_image(file_path, **kwargs)
+        elif ext in config.ALLOWED_VIDEO_FORMATS:
+            return self.analyze_video(file_path, **kwargs)
+        else:
+            raise ValueError(f"Unsupported file format: {ext}")
+
+    def __repr__(self) -> str:
+        """Object representation."""
+        return (f"AnalysisEngine(plugins={len(self.plugins)}, "
+                f"registered={self.get_registered_plugins()})")
core/exceptions.py ADDED
@@ -0,0 +1,100 @@
+"""
+Custom exceptions for DeepVision Core Engine.
+
+Defines all custom exceptions used throughout the core engine
+for better error handling and debugging.
+"""
+
+from typing import Optional
+
+
+class DeepVisionError(Exception):
+    """Base exception for all DeepVision errors."""
+
+    def __init__(self, message: str, details: Optional[dict] = None):
+        self.message = message
+        self.details = details or {}
+        super().__init__(self.message)
+
+
+class FileProcessingError(DeepVisionError):
+    """Raised when file processing fails."""
+    pass
+
+
+class InvalidFileError(FileProcessingError):
+    """Raised when file is invalid or corrupted."""
+    pass
+
+
+class FileSizeError(FileProcessingError):
+    """Raised when file size exceeds limits."""
+    pass
+
+
+class UnsupportedFormatError(FileProcessingError):
+    """Raised when file format is not supported."""
+    pass
+
+
+class ImageProcessingError(DeepVisionError):
+    """Raised when image processing fails."""
+    pass
+
+
+class VideoProcessingError(DeepVisionError):
+    """Raised when video processing fails."""
+    pass
+
+
+class FrameExtractionError(VideoProcessingError):
+    """Raised when frame extraction from video fails."""
+    pass
+
+
+class ModelError(DeepVisionError):
+    """Raised when model operations fail."""
+    pass
+
+
+class ModelLoadError(ModelError):
+    """Raised when model loading fails."""
+    pass
+
+
+class ModelInferenceError(ModelError):
+    """Raised when model inference fails."""
+    pass
+
+
+class PluginError(DeepVisionError):
+    """Raised when plugin operations fail."""
+    pass
+
+
+class PluginLoadError(PluginError):
+    """Raised when plugin loading fails."""
+    pass
+
+
+class PluginExecutionError(PluginError):
+    """Raised when plugin execution fails."""
+    pass
+
+
+class ValidationError(DeepVisionError):
+    """Raised when validation fails."""
+    pass
+
+
+class ConfigurationError(DeepVisionError):
+    """Raised when configuration is invalid."""
+    pass
+
+
+class CacheError(DeepVisionError):
+    """Raised when cache operations fail."""
+    pass
+
+
+class ResultError(DeepVisionError):
+    """Raised when result processing fails."""
+    pass
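The payoff of a single base class is that callers can catch the whole hierarchy with one handler while `details` survives for diagnostics. A small self-contained sketch reproducing the relevant slice of the hierarchy:

```python
from typing import Optional

class DeepVisionError(Exception):
    """Base exception (mirrors core/exceptions.py)."""
    def __init__(self, message: str, details: Optional[dict] = None):
        self.message = message
        self.details = details or {}
        super().__init__(self.message)

class ModelError(DeepVisionError):
    pass

class ModelLoadError(ModelError):
    pass

try:
    raise ModelLoadError("checkpoint missing", {"model": "clip"})
except DeepVisionError as e:
    # One handler catches every subclass; details dict is still attached
    caught = (type(e).__name__, e.details["model"])

print(caught)  # ('ModelLoadError', 'clip')
```

This is why the engine's `except Exception` wrapper should let `DeepVisionError` subclasses pass through untouched: re-wrapping would flatten `ModelLoadError` into the base type and discard its `details`.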
core/image_processor.py ADDED
@@ -0,0 +1,279 @@
+"""
+Image Processor Module
+
+Handles all image processing operations including loading, validation,
+resizing, normalization, and format conversion.
+"""
+
+import hashlib
+import magic
+from pathlib import Path
+from typing import Tuple, Optional, Union
+import numpy as np
+from PIL import Image, ImageOps
+from loguru import logger
+
+from core.config import config
+from core.exceptions import (
+    ImageProcessingError,
+    FileProcessingError,
+    InvalidFileError,
+    FileSizeError,
+    UnsupportedFormatError,
+)
+
+
+class ImageProcessor:
+    """
+    Process images for analysis.
+
+    Handles validation, resizing, normalization, and format conversion
+    for images before they are passed to AI models.
+    """
+
+    def __init__(self):
+        """Initialize ImageProcessor."""
+        self.max_size = config.MAX_IMAGE_SIZE
+        self.max_dimension = config.IMAGE_MAX_DIMENSION
+        self.allowed_formats = config.ALLOWED_IMAGE_FORMATS
+        logger.info("ImageProcessor initialized")
+
+    def load_image(self, image_path: Union[str, Path]) -> Image.Image:
+        """
+        Load image from file path.
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            PIL Image object
+
+        Raises:
+            InvalidFileError: If image cannot be loaded
+        """
+        try:
+            image_path = Path(image_path)
+            if not image_path.exists():
+                raise InvalidFileError(
+                    f"Image file not found: {image_path}",
+                    {"path": str(image_path)}
+                )
+
+            # Validate file
+            self.validate_image(image_path)
+
+            # Load image
+            image = Image.open(image_path)
+
+            # Convert to RGB if necessary
+            if image.mode != "RGB":
+                image = image.convert("RGB")
+
+            logger.info(f"Loaded image: {image_path.name} ({image.size})")
+            return image
+
+        except FileProcessingError:
+            # Let specific validation errors (size/format) propagate as-is
+            raise
+        except Exception as e:
+            logger.error(f"Failed to load image: {e}")
+            raise InvalidFileError(
+                f"Cannot load image: {str(e)}",
+                {"path": str(image_path), "error": str(e)}
+            ) from e
+
+    def validate_image(self, image_path: Path) -> bool:
+        """
+        Validate image file.
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            True if valid
+
+        Raises:
+            FileSizeError: If file too large
+            UnsupportedFormatError: If format not supported
+            InvalidFileError: If file is corrupted
+        """
+        # Check file size
+        file_size = image_path.stat().st_size
+        if file_size > self.max_size:
+            raise FileSizeError(
+                f"Image too large: {file_size / 1024 / 1024:.1f}MB",
+                {"max_size": self.max_size, "actual_size": file_size}
+            )
+
+        # Check file extension
+        ext = image_path.suffix.lower()
+        if ext not in self.allowed_formats:
+            raise UnsupportedFormatError(
+                f"Unsupported image format: {ext}",
+                {"allowed": self.allowed_formats, "received": ext}
+            )
+
+        # Check MIME type using magic bytes
+        try:
+            mime = magic.from_file(str(image_path), mime=True)
+            if not mime.startswith("image/"):
+                raise InvalidFileError(
+                    f"File is not a valid image: {mime}",
+                    {"mime_type": mime}
+                )
+        except Exception as e:
+            logger.warning(f"Could not verify MIME type: {e}")
+
+        return True
+
+    def resize_image(
+        self,
+        image: Image.Image,
+        max_size: Optional[Tuple[int, int]] = None,
+        maintain_aspect_ratio: bool = True
+    ) -> Image.Image:
+        """
+        Resize image to specified dimensions.
+
+        Args:
+            image: PIL Image object
+            max_size: Maximum (width, height) tuple
+            maintain_aspect_ratio: Whether to maintain aspect ratio
+
+        Returns:
+            Resized PIL Image
+        """
+        if max_size is None:
+            max_size = config.DEFAULT_IMAGE_SIZE
+
+        original_size = image.size
+
+        if maintain_aspect_ratio:
+            # Calculate new size maintaining aspect ratio
+            image.thumbnail(max_size, Image.Resampling.LANCZOS)
+        else:
+            # Resize to exact dimensions
+            image = image.resize(max_size, Image.Resampling.LANCZOS)
+
+        logger.debug(f"Resized image: {original_size} -> {image.size}")
+        return image
+
+    def normalize_image(self, image: Image.Image) -> np.ndarray:
+        """
+        Normalize image to numpy array with values [0, 1].
+
+        Args:
+            image: PIL Image object
+
+        Returns:
+            Normalized numpy array (H, W, C)
+        """
+        # Convert to numpy array
+        img_array = np.array(image, dtype=np.float32)
+
+        # Normalize to [0, 1]
+        img_array = img_array / 255.0
+
+        logger.debug(f"Normalized image to shape: {img_array.shape}")
+        return img_array
+
+    def apply_exif_orientation(self, image: Image.Image) -> Image.Image:
+        """
+        Apply EXIF orientation to image.
+
+        Args:
+            image: PIL Image object
+
+        Returns:
+            Oriented PIL Image
+        """
+        try:
+            image = ImageOps.exif_transpose(image)
+            logger.debug("Applied EXIF orientation")
+        except Exception as e:
+            logger.warning(f"Could not apply EXIF orientation: {e}")
+
+        return image
+
+    def get_image_hash(self, image_path: Path) -> str:
+        """
+        Generate SHA256 hash of image file.
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            Hex string of hash
+        """
+        sha256_hash = hashlib.sha256()
+
+        with open(image_path, "rb") as f:
+            # Read in chunks to handle large files
+            for chunk in iter(lambda: f.read(8192), b""):
+                sha256_hash.update(chunk)
+
+        return sha256_hash.hexdigest()
+
+    def process(
+        self,
+        image_path: Union[str, Path],
+        resize: bool = True,
+        normalize: bool = False,
+        apply_orientation: bool = True
+    ) -> Union[Image.Image, np.ndarray]:
+        """
+        Complete image processing pipeline.
+
+        Args:
+            image_path: Path to image file
+            resize: Whether to resize image
+            normalize: Whether to normalize to numpy array
+            apply_orientation: Whether to apply EXIF orientation
+
+        Returns:
+            Processed image (PIL Image or numpy array)
+        """
+        try:
+            # Load image
+            image = self.load_image(image_path)
+
+            # Apply EXIF orientation
+            if apply_orientation:
+                image = self.apply_exif_orientation(image)
+
+            # Resize if needed
+            if resize:
+                image = self.resize_image(image)
+
+            # Normalize if needed
+            if normalize:
+                return self.normalize_image(image)
+
+            return image
+
+        except Exception as e:
+            logger.error(f"Image processing failed: {e}")
+            raise ImageProcessingError(
+                f"Failed to process image: {str(e)}",
+                {"path": str(image_path), "error": str(e)}
+            ) from e
+
+    def get_image_info(self, image_path: Union[str, Path]) -> dict:
+        """
+        Get information about an image.
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            Dictionary with image information
+        """
+        image_path = Path(image_path)
+
+        # Probe the on-disk format before load_image converts to RGB,
+        # since Image.convert() returns a copy whose .format is None
+        with Image.open(image_path) as probe:
+            image_format = probe.format
+
+        image = self.load_image(image_path)
+
+        return {
+            "filename": image_path.name,
+            "format": image_format,
+            "mode": image.mode,
+            "size": image.size,
+            "width": image.size[0],
+            "height": image.size[1],
+            "file_size": image_path.stat().st_size,
+            "hash": self.get_image_hash(image_path),
+        }
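`get_image_hash` streams the file through SHA-256 in 8 KiB chunks so arbitrarily large images never have to fit in memory. The same pattern works for any file, shown here standalone on a throwaway temp file rather than a real image:

```python
import hashlib
import tempfile
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 8192) -> str:
    """Hash a file in fixed-size chunks so large files never load fully into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a throwaway file; image bytes are hashed exactly the same way
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(b"hello deepvision")
    tmp_path = Path(tmp.name)

print(file_sha256(tmp_path))  # 64-char hex digest
```

`iter(callable, sentinel)` keeps calling `f.read(chunk_size)` until it returns the empty-bytes sentinel, which is the idiomatic way to chunk a binary stream without a manual while loop.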
core/logging_config.py ADDED
@@ -0,0 +1,71 @@
+"""
+Logging Configuration
+
+Setup logging for DeepVision using loguru.
+"""
+
+import sys
+from pathlib import Path
+from typing import Optional
+
+from loguru import logger
+
+from core.config import config
+
+
+def setup_logging(
+    log_level: str = "INFO",
+    log_file: Optional[Path] = None,
+    rotation: str = "10 MB",
+    retention: str = "1 week"
+) -> None:
+    """
+    Setup logging configuration.
+
+    Args:
+        log_level: Logging level (DEBUG, INFO, WARNING, ERROR)
+        log_file: Path to log file (None for no file logging)
+        rotation: Log rotation size/time
+        retention: How long to keep old logs
+    """
+    # Remove default handler
+    logger.remove()
+
+    # Console handler
+    logger.add(
+        sys.stderr,
+        format="<green>{time:YYYY-MM-DD HH:mm:ss}</green> | "
+               "<level>{level: <8}</level> | "
+               "<cyan>{name}</cyan>:<cyan>{function}</cyan> - "
+               "<level>{message}</level>",
+        level=log_level,
+        colorize=True,
+    )
+
+    # File handler (if specified)
+    if log_file:
+        log_file = Path(log_file)
+        log_file.parent.mkdir(parents=True, exist_ok=True)
+
+        logger.add(
+            log_file,
+            format="{time:YYYY-MM-DD HH:mm:ss} | {level: <8} | "
+                   "{name}:{function} - {message}",
+            level=log_level,
+            rotation=rotation,
+            retention=retention,
+            compression="zip",
+        )
+
+    logger.info(f"Logging configured - Level: {log_level}")
+
+
+# Auto-configure on import; DEBUG only changes the verbosity
+setup_logging(
+    log_level="DEBUG" if config.DEBUG else "INFO",
+    log_file=config.BASE_DIR / "logs" / "deepvision.log",
+)
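`setup_logging` leans on loguru's single-call handler API (`logger.remove()` then `logger.add(...)` with rotation and retention baked in). For environments without loguru, a rough stdlib-`logging` equivalent of the console handler, with hypothetical names, would look like:

```python
import io
import logging

def setup_std_logging(log_level: str = "INFO", stream=None) -> logging.Logger:
    """Configure a named logger with a format comparable to the loguru console handler."""
    log = logging.getLogger("deepvision")
    log.setLevel(log_level)
    log.handlers.clear()  # mirrors loguru's logger.remove()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s | %(levelname)-8s | %(name)s:%(funcName)s - %(message)s"
    ))
    log.addHandler(handler)
    return log

buf = io.StringIO()
log = setup_std_logging("DEBUG", stream=buf)
log.info("Logging configured")
print("Logging configured" in buf.getvalue())  # True
```

What the stdlib version does not give you for free is loguru's size-based rotation, retention window, and zip compression; those would need `logging.handlers.RotatingFileHandler` plus custom cleanup, which is precisely why the module uses loguru.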
core/result_manager.py ADDED
@@ -0,0 +1,347 @@
+"""
+Result Manager Module
+
+Manages and aggregates results from multiple plugins,
+generates final JSON output with metadata.
+"""
+
+import json
+from datetime import datetime
+from pathlib import Path
+from typing import Dict, List, Any, Optional, Union
+from loguru import logger
+
+from core.config import config
+from core.exceptions import ResultError
+
+
+class ResultManager:
+    """
+    Manage and aggregate analysis results.
+
+    Collects results from multiple plugins, merges them,
+    and generates structured JSON output.
+    """
+
+    def __init__(self):
+        """Initialize ResultManager."""
+        self.results: Dict[str, Any] = {}
+        self.metadata: Dict[str, Any] = {}
+        logger.info("ResultManager initialized")
+
+    def add_result(
+        self,
+        plugin_name: str,
+        result: Dict[str, Any],
+        merge: bool = False
+    ) -> None:
+        """
+        Add result from a plugin.
+
+        Args:
+            plugin_name: Name of the plugin
+            result: Result dictionary from plugin
+            merge: Whether to merge with existing results
+        """
+        if merge and plugin_name in self.results:
+            # Merge with existing results
+            self.results[plugin_name] = self._merge_dicts(
+                self.results[plugin_name],
+                result
+            )
+        else:
+            # Replace existing results
+            self.results[plugin_name] = result
+
+        logger.debug(f"Added result from plugin: {plugin_name}")
+
+    def add_metadata(self, metadata: Dict[str, Any]) -> None:
+        """
+        Add metadata to results.
+
+        Args:
+            metadata: Metadata dictionary
+        """
+        self.metadata.update(metadata)
+        logger.debug(f"Added metadata: {list(metadata.keys())}")
+
+    def set_file_info(
+        self,
+        filename: str,
+        file_type: str,
+        file_size: int,
+        **kwargs
+    ) -> None:
+        """
+        Set file information in metadata.
+
+        Args:
+            filename: Name of the file
+            file_type: Type of file (image/video)
+            file_size: Size of file in bytes
+            **kwargs: Additional file information
+        """
+        self.metadata["file"] = {
+            "filename": filename,
+            "type": file_type,
+            "size": file_size,
+            "size_mb": round(file_size / 1024 / 1024, 2),
+            **kwargs
+        }
+
+    def set_processing_info(
+        self,
+        start_time: datetime,
+        end_time: datetime,
+        plugins_used: List[str]
+    ) -> None:
+        """
+        Set processing information in metadata.
+
+        Args:
+            start_time: Processing start time
+            end_time: Processing end time
+            plugins_used: List of plugin names used
+        """
+        duration = (end_time - start_time).total_seconds()
+
+        self.metadata["processing"] = {
+            "start_time": start_time.isoformat(),
+            "end_time": end_time.isoformat(),
+            "duration_seconds": round(duration, 3),
+            "plugins_used": plugins_used,
+            "plugin_count": len(plugins_used),
+        }
+
+    def _merge_dicts(self, dict1: Dict, dict2: Dict) -> Dict:
+        """
+        Deep merge two dictionaries.
+
+        Args:
+            dict1: First dictionary
+            dict2: Second dictionary
+
+        Returns:
+            Merged dictionary
+        """
+        result = dict1.copy()
+
+        for key, value in dict2.items():
+            if key in result and isinstance(result[key], dict) and isinstance(value, dict):
+                result[key] = self._merge_dicts(result[key], value)
+            elif key in result and isinstance(result[key], list) and isinstance(value, list):
+                # Concatenate into a new list; extend() would mutate dict1's
+                # nested list through the shallow copy above
+                result[key] = result[key] + value
+            else:
+                result[key] = value
+
+        return result
+
+    def merge_results(self, results_list: List[Dict[str, Any]]) -> Dict[str, Any]:
+        """
+        Merge multiple result dictionaries.
+
+        Args:
+            results_list: List of result dictionaries
+
+        Returns:
+            Merged dictionary
+        """
+        merged = {}
+
+        for result in results_list:
+            merged = self._merge_dicts(merged, result)
+
+        return merged
+
+    def get_result(self, plugin_name: Optional[str] = None) -> Union[Dict, Any]:
+        """
+        Get result from specific plugin or all results.
+
+        Args:
+            plugin_name: Name of plugin (None for all results)
+
+        Returns:
+            Result dictionary or specific plugin result
+        """
+        if plugin_name is None:
+            return self.results
+
+        return self.results.get(plugin_name)
+
+    def to_dict(self, include_metadata: bool = True) -> Dict[str, Any]:
+        """
+        Convert results to dictionary.
+
+        Args:
+            include_metadata: Whether to include metadata
+
+        Returns:
+            Complete results dictionary
+        """
+        output = {
+            "results": self.results,
+        }
+
+        if include_metadata and self.metadata:
+            output["metadata"] = self.metadata
+
+        # Add timestamp if not present
+        if "timestamp" not in output.get("metadata", {}):
+            if "metadata" not in output:
+                output["metadata"] = {}
+            output["metadata"]["timestamp"] = datetime.now().isoformat()
+
+        # Add version
+        output["metadata"]["version"] = config.APP_VERSION
+
+        return output
+
+    def to_json(
+        self,
+        include_metadata: bool = True,
+        pretty: Optional[bool] = None,
+        ensure_ascii: bool = False
+    ) -> str:
+        """
+        Convert results to JSON string.
+
+        Args:
+            include_metadata: Whether to include metadata
+            pretty: Whether to format JSON (None uses config)
+            ensure_ascii: Whether to escape non-ASCII characters
+
+        Returns:
+            JSON string
+        """
+        if pretty is None:
+            pretty = config.PRETTY_JSON
+
+        data = self.to_dict(include_metadata=include_metadata)
+
+        if pretty:
+            json_str = json.dumps(
+                data,
+                indent=2,
+                ensure_ascii=ensure_ascii,
+                default=str
+            )
+        else:
+            json_str = json.dumps(
+                data,
+                ensure_ascii=ensure_ascii,
+                default=str
+            )
+
+        return json_str
+
+    def save_json(
+        self,
+        output_path: Union[str, Path],
+        include_metadata: bool = True,
+        pretty: Optional[bool] = None
+    ) -> None:
+        """
+        Save results to JSON file.
+
+        Args:
+            output_path: Path to output file
248
+ include_metadata: Whether to include metadata
249
+ pretty: Whether to format JSON
250
+ """
251
+ try:
252
+ output_path = Path(output_path)
253
+ output_path.parent.mkdir(parents=True, exist_ok=True)
254
+
255
+ json_str = self.to_json(
256
+ include_metadata=include_metadata,
257
+ pretty=pretty
258
+ )
259
+
260
+ output_path.write_text(json_str, encoding="utf-8")
261
+
262
+ logger.info(f"Saved results to: {output_path}")
263
+
264
+ except Exception as e:
265
+ logger.error(f"Failed to save JSON: {e}")
266
+ raise ResultError(
267
+ f"Cannot save results to file: {str(e)}",
268
+ {"path": str(output_path), "error": str(e)}
269
+ )
270
+
271
+ def generate_prompt(self) -> str:
272
+ """
273
+ Generate a text prompt from results.
274
+
275
+ Returns:
276
+ Generated prompt string
277
+ """
278
+ prompt_parts = []
279
+
280
+ # Add captions
281
+ if "caption_generator" in self.results:
282
+ caption = self.results["caption_generator"].get("caption", "")
283
+ if caption:
284
+ prompt_parts.append(caption)
285
+
286
+ # Add objects
287
+ if "object_detector" in self.results:
288
+ objects = self.results["object_detector"].get("objects", [])
289
+ if objects:
290
+ object_names = [obj["name"] for obj in objects[:5]]
291
+ prompt_parts.append(f"showing {', '.join(object_names)}")
292
+
293
+ # Add colors
294
+ if "color_analyzer" in self.results:
295
+ colors = self.results["color_analyzer"].get("dominant_colors", [])
296
+ if colors:
297
+ color_names = [c["name"] for c in colors[:3]]
298
+ prompt_parts.append(f"with {', '.join(color_names)} colors")
299
+
300
+ # Add text
301
+ if "text_extractor" in self.results:
302
+ text = self.results["text_extractor"].get("text", "")
303
+ if text:
304
+ prompt_parts.append(f'containing text "{text[:50]}"')
305
+
306
+ prompt = ", ".join(prompt_parts)
307
+
308
+ return prompt.capitalize() if prompt else "No description available"
309
+
310
+ def get_summary(self) -> Dict[str, Any]:
311
+ """
312
+ Get summary of results.
313
+
314
+ Returns:
315
+ Summary dictionary
316
+ """
317
+ summary = {
318
+ "total_plugins": len(self.results),
319
+ "plugins": list(self.results.keys()),
320
+ }
321
+
322
+ # Add plugin-specific summaries
323
+ for plugin_name, result in self.results.items():
324
+ if plugin_name == "object_detector":
325
+ summary["object_count"] = len(result.get("objects", []))
326
+ elif plugin_name == "caption_generator":
327
+ summary["has_caption"] = bool(result.get("caption"))
328
+ elif plugin_name == "color_analyzer":
329
+ summary["color_count"] = len(result.get("dominant_colors", []))
330
+ elif plugin_name == "text_extractor":
331
+ summary["has_text"] = bool(result.get("text"))
332
+
333
+ return summary
334
+
335
+ def clear(self) -> None:
336
+ """Clear all results and metadata."""
337
+ self.results.clear()
338
+ self.metadata.clear()
339
+ logger.debug("Cleared all results")
340
+
341
+ def __str__(self) -> str:
342
+ """String representation."""
343
+ return self.to_json(pretty=True)
344
+
345
+ def __repr__(self) -> str:
346
+ """Object representation."""
347
+ return f"ResultManager(plugins={len(self.results)}, metadata={len(self.metadata)})"
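A standalone sketch of the deep-merge semantics `_merge_dicts` implements (names here are illustrative, not repository code): nested dicts merge recursively, lists concatenate, and scalar values from the second dict win — without mutating the inputs.

```python
def deep_merge(dict1, dict2):
    # Shallow-copy the first dict, then merge key by key.
    result = dict1.copy()
    for key, value in dict2.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)  # recurse into nested dicts
        elif key in result and isinstance(result[key], list) and isinstance(value, list):
            result[key] = result[key] + value  # new list, so dict1's list is untouched
        else:
            result[key] = value  # dict2's scalar wins

    return result

a = {"objects": ["cat"], "meta": {"fps": 1}}
b = {"objects": ["dog"], "meta": {"frames": 5}}
print(deep_merge(a, b))
# {'objects': ['cat', 'dog'], 'meta': {'fps': 1, 'frames': 5}}
```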
core/video_processor.py ADDED
@@ -0,0 +1,333 @@
+ """
+ Video Processor Module
+
+ Handles all video processing operations including frame extraction,
+ validation, and video metadata extraction.
+ """
+
+ import subprocess
+ from pathlib import Path
+ from typing import List, Optional, Union, Tuple
+ import cv2
+ import magic
+ from loguru import logger
+
+ from core.config import config
+ from core.exceptions import (
+     VideoProcessingError,
+     InvalidFileError,
+     FileSizeError,
+     UnsupportedFormatError,
+     FrameExtractionError,
+ )
+ from core.image_processor import ImageProcessor
+
+
+ class VideoProcessor:
+     """
+     Process videos for analysis.
+
+     Handles validation, frame extraction, and metadata extraction
+     for videos before they are analyzed.
+     """
+
+     def __init__(self):
+         """Initialize VideoProcessor."""
+         self.max_size = config.MAX_VIDEO_SIZE
+         self.allowed_formats = config.ALLOWED_VIDEO_FORMATS
+         self.fps_extraction = config.VIDEO_FPS_EXTRACTION
+         self.max_frames = config.MAX_FRAMES_PER_VIDEO
+         self.image_processor = ImageProcessor()
+         logger.info("VideoProcessor initialized")
+
+     def validate_video(self, video_path: Path) -> bool:
+         """
+         Validate video file.
+
+         Args:
+             video_path: Path to video file
+
+         Returns:
+             True if valid
+
+         Raises:
+             FileSizeError: If file too large
+             UnsupportedFormatError: If format not supported
+             InvalidFileError: If file is corrupted
+         """
+         # Check file exists
+         if not video_path.exists():
+             raise InvalidFileError(
+                 f"Video file not found: {video_path}",
+                 {"path": str(video_path)}
+             )
+
+         # Check file size
+         file_size = video_path.stat().st_size
+         if file_size > self.max_size:
+             raise FileSizeError(
+                 f"Video too large: {file_size / 1024 / 1024:.1f}MB",
+                 {"max_size": self.max_size, "actual_size": file_size}
+             )
+
+         # Check file extension
+         ext = video_path.suffix.lower()
+         if ext not in self.allowed_formats:
+             raise UnsupportedFormatError(
+                 f"Unsupported video format: {ext}",
+                 {"allowed": self.allowed_formats, "received": ext}
+             )
+
+         # Check MIME type using magic bytes
+         try:
+             mime = magic.from_file(str(video_path), mime=True)
+             if not mime.startswith("video/"):
+                 raise InvalidFileError(
+                     f"File is not a valid video: {mime}",
+                     {"mime_type": mime}
+                 )
+         except Exception as e:
+             logger.warning(f"Could not verify MIME type: {e}")
+
+         return True
+
+     def get_video_info(self, video_path: Union[str, Path]) -> dict:
+         """
+         Get video metadata using OpenCV.
+
+         Args:
+             video_path: Path to video file
+
+         Returns:
+             Dictionary with video information
+         """
+         video_path = Path(video_path)
+         self.validate_video(video_path)
+
+         try:
+             cap = cv2.VideoCapture(str(video_path))
+
+             if not cap.isOpened():
+                 raise InvalidFileError(
+                     "Cannot open video file",
+                     {"path": str(video_path)}
+                 )
+
+             # Extract metadata
+             fps = cap.get(cv2.CAP_PROP_FPS)
+             frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+             width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+             height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+             duration = frame_count / fps if fps > 0 else 0
+
+             cap.release()
+
+             info = {
+                 "filename": video_path.name,
+                 "fps": fps,
+                 "frame_count": frame_count,
+                 "width": width,
+                 "height": height,
+                 "duration": duration,
+                 "file_size": video_path.stat().st_size,
+             }
+
+             logger.info(f"Video info: {video_path.name} - {width}x{height}, "
+                         f"{fps:.2f}fps, {duration:.2f}s")
+
+             return info
+
+         except Exception as e:
+             logger.error(f"Failed to get video info: {e}")
+             raise VideoProcessingError(
+                 f"Cannot extract video metadata: {str(e)}",
+                 {"path": str(video_path), "error": str(e)}
+             )
+
+     def extract_frames(
+         self,
+         video_path: Union[str, Path],
+         fps: Optional[float] = None,
+         max_frames: Optional[int] = None,
+         output_dir: Optional[Path] = None
+     ) -> List[Path]:
+         """
+         Extract frames from video at specified FPS.
+
+         Args:
+             video_path: Path to video file
+             fps: Frames per second to extract (default: config.VIDEO_FPS_EXTRACTION)
+             max_frames: Maximum number of frames to extract
+             output_dir: Directory to save frames (default: cache directory)
+
+         Returns:
+             List of paths to extracted frames
+
+         Raises:
+             FrameExtractionError: If frame extraction fails
+         """
+         video_path = Path(video_path)
+         self.validate_video(video_path)
+
+         if fps is None:
+             fps = self.fps_extraction
+
+         if max_frames is None:
+             max_frames = self.max_frames
+
+         if output_dir is None:
+             output_dir = config.CACHE_DIR / "frames" / video_path.stem
+
+         output_dir.mkdir(parents=True, exist_ok=True)
+
+         try:
+             cap = cv2.VideoCapture(str(video_path))
+
+             if not cap.isOpened():
+                 raise FrameExtractionError(
+                     "Cannot open video file",
+                     {"path": str(video_path)}
+                 )
+
+             video_fps = cap.get(cv2.CAP_PROP_FPS)
+             frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+
+             # Calculate frame interval
+             frame_interval = int(video_fps / fps) if fps < video_fps else 1
+
+             frames_saved = []
+             frame_idx = 0
+             saved_count = 0
+
+             logger.info(f"Extracting frames from {video_path.name} "
+                         f"(fps={fps}, interval={frame_interval})")
+
+             while True:
+                 ret, frame = cap.read()
+
+                 if not ret:
+                     break
+
+                 # Extract frame at specified interval
+                 if frame_idx % frame_interval == 0:
+                     # Save frame
+                     frame_path = output_dir / f"frame_{saved_count:04d}.jpg"
+                     cv2.imwrite(str(frame_path), frame)
+                     frames_saved.append(frame_path)
+                     saved_count += 1
+
+                     # Check if we've reached max frames
+                     if saved_count >= max_frames:
+                         logger.info(f"Reached max frames limit: {max_frames}")
+                         break
+
+                 frame_idx += 1
+
+             cap.release()
+
+             logger.info(f"Extracted {len(frames_saved)} frames from {video_path.name}")
+
+             return frames_saved
+
+         except Exception as e:
+             logger.error(f"Frame extraction failed: {e}")
+             raise FrameExtractionError(
+                 f"Failed to extract frames: {str(e)}",
+                 {"path": str(video_path), "error": str(e)}
+             )
+
+     def extract_key_frames(
+         self,
+         video_path: Union[str, Path],
+         num_frames: int = 5,
+         output_dir: Optional[Path] = None
+     ) -> List[Path]:
+         """
+         Extract evenly distributed key frames from video.
+
+         Args:
+             video_path: Path to video file
+             num_frames: Number of key frames to extract
+             output_dir: Directory to save frames
+
+         Returns:
+             List of paths to extracted frames
+         """
+         video_path = Path(video_path)
+         self.validate_video(video_path)
+
+         if output_dir is None:
+             output_dir = config.CACHE_DIR / "keyframes" / video_path.stem
+
+         output_dir.mkdir(parents=True, exist_ok=True)
+
+         try:
+             cap = cv2.VideoCapture(str(video_path))
+
+             if not cap.isOpened():
+                 raise FrameExtractionError(
+                     "Cannot open video file",
+                     {"path": str(video_path)}
+                 )
+
+             frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+
+             # Calculate frame positions
+             positions = [int(i * frame_count / (num_frames + 1))
+                          for i in range(1, num_frames + 1)]
+
+             frames_saved = []
+
+             for idx, pos in enumerate(positions):
+                 cap.set(cv2.CAP_PROP_POS_FRAMES, pos)
+                 ret, frame = cap.read()
+
+                 if ret:
+                     frame_path = output_dir / f"keyframe_{idx:02d}.jpg"
+                     cv2.imwrite(str(frame_path), frame)
+                     frames_saved.append(frame_path)
+
+             cap.release()
+
+             logger.info(f"Extracted {len(frames_saved)} key frames from {video_path.name}")
+
+             return frames_saved
+
+         except Exception as e:
+             logger.error(f"Key frame extraction failed: {e}")
+             raise FrameExtractionError(
+                 f"Failed to extract key frames: {str(e)}",
+                 {"path": str(video_path), "error": str(e)}
+             )
+
+     def process(
+         self,
+         video_path: Union[str, Path],
+         extract_method: str = "fps",
+         **kwargs
+     ) -> List[Path]:
+         """
+         Complete video processing pipeline.
+
+         Args:
+             video_path: Path to video file
+             extract_method: Method for frame extraction ("fps" or "keyframes")
+             **kwargs: Additional arguments for extraction method
+
+         Returns:
+             List of extracted frame paths
+         """
+         try:
+             if extract_method == "fps":
+                 return self.extract_frames(video_path, **kwargs)
+             elif extract_method == "keyframes":
+                 return self.extract_key_frames(video_path, **kwargs)
+             else:
+                 raise ValueError(f"Unknown extraction method: {extract_method}")
+
+         except Exception as e:
+             logger.error(f"Video processing failed: {e}")
+             raise VideoProcessingError(
+                 f"Failed to process video: {str(e)}",
+                 {"path": str(video_path), "error": str(e)}
+             )
plugins/__init__.py ADDED
@@ -0,0 +1,16 @@
+ """
+ DeepVision Plugins Package
+
+ Plugin system for modular analysis capabilities.
+ """
+
+ __version__ = "0.1.0"
+
+ from plugins.base import BasePlugin, PluginMetadata
+ from plugins.loader import PluginLoader
+
+ __all__ = [
+     "BasePlugin",
+     "PluginMetadata",
+     "PluginLoader",
+ ]
plugins/base.py ADDED
@@ -0,0 +1,170 @@
+ """
+ Base Plugin Class
+
+ Defines the interface that all plugins must implement.
+ """
+
+ from abc import ABC, abstractmethod
+ from dataclasses import dataclass
+ from typing import Dict, Any, Optional, List
+ from pathlib import Path
+ from PIL import Image
+ import numpy as np
+ from loguru import logger
+
+
+ @dataclass
+ class PluginMetadata:
+     """Metadata for a plugin."""
+
+     name: str
+     version: str
+     description: str
+     author: str
+     requires: Optional[List[str]] = None  # Required dependencies
+     category: str = "general"  # Plugin category
+     enabled: bool = True
+     priority: int = 50  # Execution priority (lower = earlier)
+
+     def __post_init__(self):
+         if self.requires is None:
+             self.requires = []
+
+
+ class BasePlugin(ABC):
+     """
+     Base class for all DeepVision plugins.
+
+     All plugins must inherit from this class and implement
+     the analyze() method.
+     """
+
+     def __init__(self):
+         """Initialize plugin."""
+         self._metadata: Optional[PluginMetadata] = None
+         self._initialized = False
+         self._enabled = True
+         logger.debug(f"Plugin {self.__class__.__name__} created")
+
+     @property
+     @abstractmethod
+     def metadata(self) -> PluginMetadata:
+         """
+         Return plugin metadata.
+
+         Returns:
+             PluginMetadata instance
+         """
+         pass
+
+     @abstractmethod
+     def initialize(self) -> None:
+         """
+         Initialize the plugin.
+
+         This method is called when the plugin is loaded.
+         Use it to load models, initialize resources, etc.
+         """
+         pass
+
+     @abstractmethod
+     def analyze(
+         self,
+         media: Any,
+         media_path: Path
+     ) -> Dict[str, Any]:
+         """
+         Analyze media and return results.
+
+         Args:
+             media: Processed media (PIL Image or numpy array)
+             media_path: Path to the media file
+
+         Returns:
+             Dictionary with analysis results
+         """
+         pass
+
+     def cleanup(self) -> None:
+         """
+         Clean up resources when plugin is unloaded.
+
+         Override this method to release resources like
+         model memory, file handles, etc.
+         """
+         logger.debug(f"Cleaning up plugin {self.metadata.name}")
+
+     def validate_input(self, media: Any) -> bool:
+         """
+         Validate input media.
+
+         Args:
+             media: Media to validate
+
+         Returns:
+             True if valid, False otherwise
+         """
+         if isinstance(media, (Image.Image, np.ndarray)):
+             return True
+
+         logger.warning(
+             f"Plugin {self.metadata.name} received unsupported media type: "
+             f"{type(media)}"
+         )
+         return False
+
+     def get_config(self) -> Dict[str, Any]:
+         """
+         Get plugin configuration.
+
+         Returns:
+             Configuration dictionary
+         """
+         return {
+             "name": self.metadata.name,
+             "version": self.metadata.version,
+             "enabled": self._enabled,
+             "initialized": self._initialized,
+         }
+
+     def set_enabled(self, enabled: bool) -> None:
+         """
+         Enable or disable the plugin.
+
+         Args:
+             enabled: True to enable, False to disable
+         """
+         self._enabled = enabled
+         logger.info(
+             f"Plugin {self.metadata.name} "
+             f"{'enabled' if enabled else 'disabled'}"
+         )
+
+     def is_enabled(self) -> bool:
+         """
+         Check if plugin is enabled.
+
+         Returns:
+             True if enabled
+         """
+         return self._enabled
+
+     def is_initialized(self) -> bool:
+         """
+         Check if plugin is initialized.
+
+         Returns:
+             True if initialized
+         """
+         return self._initialized
+
+     def __repr__(self) -> str:
+         """String representation."""
+         return (
+             f"{self.__class__.__name__}("
+             f"name={self.metadata.name}, "
+             f"version={self.metadata.version}, "
+             f"enabled={self._enabled})"
+         )
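The plugin contract above (abstract `metadata` property plus abstract `analyze()`) follows the standard `abc` pattern. A minimal, self-contained illustration — `Metadata`, `Plugin`, and `DummyPlugin` are hypothetical names for this sketch, not classes from the repository:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Metadata:
    name: str
    version: str
    requires: List[str] = field(default_factory=list)


class Plugin(ABC):
    # Subclasses must supply both members, or instantiation fails.
    @property
    @abstractmethod
    def metadata(self) -> Metadata: ...

    @abstractmethod
    def analyze(self, media: Any) -> Dict[str, Any]: ...


class DummyPlugin(Plugin):
    @property
    def metadata(self) -> Metadata:
        return Metadata(name="dummy", version="0.1.0")

    def analyze(self, media: Any) -> Dict[str, Any]:
        return {"status": "success", "media_type": type(media).__name__}


plugin = DummyPlugin()
print(plugin.analyze("not-an-image"))
# {'status': 'success', 'media_type': 'str'}
```

Attempting `Plugin()` directly raises `TypeError`, which is what enforces the interface at load time.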
plugins/caption_generator.py ADDED
@@ -0,0 +1,206 @@
+ """
+ Caption Generator Plugin
+
+ Generates descriptive captions for images using BLIP-2.
+ """
+
+ from typing import Dict, Any
+ from pathlib import Path
+ import numpy as np
+ from PIL import Image
+ from loguru import logger
+
+ from plugins.base import BasePlugin, PluginMetadata
+
+
+ class CaptionGeneratorPlugin(BasePlugin):
+     """
+     Generate captions for images using BLIP-2.
+
+     Creates natural language descriptions of image content.
+     """
+
+     def __init__(self):
+         """Initialize CaptionGeneratorPlugin."""
+         super().__init__()
+         self.model = None
+         self.processor = None
+         self.max_length = 50
+
+     @property
+     def metadata(self) -> PluginMetadata:
+         """Return plugin metadata."""
+         return PluginMetadata(
+             name="caption_generator",
+             version="0.1.0",
+             description="Generates image captions using BLIP-2",
+             author="AI Dev Collective",
+             requires=["transformers", "torch"],
+             category="captioning",
+             priority=20,
+         )
+
+     def initialize(self) -> None:
+         """Initialize the plugin and load BLIP-2 model."""
+         try:
+             # Import here to avoid loading if plugin is not used
+             from transformers import (
+                 Blip2Processor,
+                 Blip2ForConditionalGeneration
+             )
+
+             logger.info("Loading BLIP-2 model...")
+
+             # Use smaller BLIP-2 model for faster inference
+             model_name = "Salesforce/blip2-opt-2.7b"
+
+             # Load processor and model
+             self.processor = Blip2Processor.from_pretrained(model_name)
+             self.model = Blip2ForConditionalGeneration.from_pretrained(
+                 model_name
+             )
+
+             # Set to eval mode
+             self.model.eval()
+
+             # Move to CPU (GPU support can be added later)
+             device = "cpu"
+             self.model.to(device)
+
+             self._initialized = True
+
+             logger.info(
+                 f"BLIP-2 model loaded successfully on {device}"
+             )
+
+         except Exception as e:
+             logger.error(f"Failed to initialize CaptionGeneratorPlugin: {e}")
+             # Fallback: try smaller BLIP model
+             try:
+                 logger.info("Trying smaller BLIP model...")
+                 from transformers import BlipProcessor, BlipForConditionalGeneration
+
+                 model_name = "Salesforce/blip-image-captioning-base"
+                 self.processor = BlipProcessor.from_pretrained(model_name)
+                 self.model = BlipForConditionalGeneration.from_pretrained(
+                     model_name
+                 )
+                 self.model.eval()
+                 self.model.to("cpu")
+                 self._initialized = True
+
+                 logger.info("BLIP base model loaded successfully")
+
+             except Exception as fallback_error:
+                 logger.error(f"Fallback also failed: {fallback_error}")
+                 raise
+
+     def _generate_caption(
+         self,
+         image: Image.Image,
+         max_length: int = 50
+     ) -> str:
+         """
+         Generate caption for image.
+
+         Args:
+             image: PIL Image
+             max_length: Maximum caption length
+
+         Returns:
+             Generated caption string
+         """
+         import torch
+
+         # Prepare inputs
+         inputs = self.processor(
+             images=image,
+             return_tensors="pt"
+         )
+
+         # Generate caption
+         with torch.no_grad():
+             generated_ids = self.model.generate(
+                 **inputs,
+                 max_length=max_length,
+                 num_beams=5,
+                 early_stopping=True
+             )
+
+         # Decode caption
+         caption = self.processor.decode(
+             generated_ids[0],
+             skip_special_tokens=True
+         )
+
+         return caption.strip()
+
+     def analyze(
+         self,
+         media: Any,
+         media_path: Path
+     ) -> Dict[str, Any]:
+         """
+         Generate caption for the image.
+
+         Args:
+             media: PIL Image or numpy array
+             media_path: Path to image file
+
+         Returns:
+             Dictionary with caption
+         """
+         try:
+             # Check if initialized
+             if not self._initialized:
+                 self.initialize()
+
+             # Validate input
+             if not self.validate_input(media):
+                 return {"error": "Invalid input type"}
+
+             # Convert to PIL Image if numpy array
+             if isinstance(media, np.ndarray):
+                 image = Image.fromarray(
+                     (media * 255).astype(np.uint8) if media.max() <= 1
+                     else media.astype(np.uint8)
+                 )
+             else:
+                 image = media
+
+             # Generate caption
+             caption = self._generate_caption(image, self.max_length)
+
+             # Analyze caption
+             word_count = len(caption.split())
+
+             result = {
+                 "caption": caption,
+                 "word_count": word_count,
+                 "character_count": len(caption),
+                 "max_length": self.max_length,
+                 "status": "success",
+             }
+
+             logger.debug(f"Caption generated: '{caption[:50]}...'")
+
+             return result
+
+         except Exception as e:
+             logger.error(f"Caption generation failed: {e}")
+             return {
+                 "error": str(e),
+                 "status": "failed"
+             }
+
+     def cleanup(self) -> None:
+         """Clean up model resources."""
+         if self.model is not None:
+             del self.model
+             self.model = None
+
+         if self.processor is not None:
+             del self.processor
+             self.processor = None
+
+         logger.info("CaptionGeneratorPlugin cleanup complete")
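The numpy-to-PIL conversion in `analyze()` branches on the array's value range: floats in [0, 1] are scaled up to 0-255, values already in 0-255 are only truncated to integers. A dependency-free sketch of that rule on plain lists (`to_uint8_values` is an illustrative name, not repository code):

```python
def to_uint8_values(pixels):
    # If the data looks normalized (max <= 1), scale to the 8-bit range;
    # otherwise assume it is already 0-255 and just truncate to int.
    scale = 255 if max(pixels) <= 1 else 1
    return [int(p * scale) for p in pixels]

print(to_uint8_values([0.0, 0.5, 1.0]))  # [0, 127, 255]
print(to_uint8_values([0, 128, 255]))    # [0, 128, 255]
```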
plugins/color_analyzer.py ADDED
@@ -0,0 +1,291 @@
1
+ """
2
+ Color Analyzer Plugin
3
+
4
+ Analyzes dominant colors in images.
5
+ """
6
+
7
+ from typing import Dict, Any
8
+ from pathlib import Path
9
+ from collections import Counter
10
+ import numpy as np
11
+ from PIL import Image
12
+ from loguru import logger
13
+
14
+ from plugins.base import BasePlugin, PluginMetadata
15
+
16
+
17
+ class ColorAnalyzerPlugin(BasePlugin):
18
+ """
19
+ Analyze dominant colors in an image.
20
+
21
+ Extracts the most prominent colors and provides
22
+ color information including RGB values and names.
23
+ """
24
+
25
+ def __init__(self):
26
+ """Initialize ColorAnalyzerPlugin."""
27
+ super().__init__()
28
+ self.num_colors = 5
29
+ self.color_names = self._load_color_names()
30
+
31
+ @property
32
+ def metadata(self) -> PluginMetadata:
33
+ """Return plugin metadata."""
34
+ return PluginMetadata(
35
+ name="color_analyzer",
36
+ version="0.1.0",
37
+ description="Analyzes dominant colors in images",
38
+ author="AI Dev Collective",
39
+ category="analysis",
40
+ priority=30,
41
+ )
42
+
43
+ def initialize(self) -> None:
44
+ """Initialize the plugin."""
45
+ logger.info("ColorAnalyzerPlugin initialized")
46
+ self._initialized = True
47
+
48
+ def _load_color_names(self) -> Dict[str, tuple]:
49
+ """
50
+ Load basic color names and their RGB values.
51
+
52
+ Returns:
53
+ Dictionary mapping color names to RGB tuples
54
+ """
55
+ return {
56
+ "red": (255, 0, 0),
57
+ "green": (0, 255, 0),
58
+ "blue": (0, 0, 255),
59
+ "yellow": (255, 255, 0),
60
+ "cyan": (0, 255, 255),
61
+ "magenta": (255, 0, 255),
62
+ "white": (255, 255, 255),
63
+ "black": (0, 0, 0),
64
+ "gray": (128, 128, 128),
65
+ "orange": (255, 165, 0),
66
+ "purple": (128, 0, 128),
67
+ "pink": (255, 192, 203),
68
+ "brown": (165, 42, 42),
69
+ "navy": (0, 0, 128),
70
+ "teal": (0, 128, 128),
71
+ }
72
+
73
+ def _get_color_name(self, rgb: tuple) -> str:
74
+ """
75
+ Get the closest color name for an RGB value.
76
+
77
+ Args:
78
+ rgb: RGB tuple (r, g, b)
79
+
80
+ Returns:
81
+ Color name string
82
+ """
83
+ min_distance = float('inf')
84
+ closest_name = "unknown"
85
+
86
+ r, g, b = rgb
87
+
88
+ for name, (cr, cg, cb) in self.color_names.items():
89
+ # Calculate Euclidean distance
90
+ distance = np.sqrt(
91
+ (r - cr) ** 2 + (g - cg) ** 2 + (b - cb) ** 2
92
+ )
93
+
94
+ if distance < min_distance:
95
+ min_distance = distance
96
+ closest_name = name
97
+
98
+ return closest_name
99
+
100
+ def _extract_dominant_colors(
101
+ self,
102
+ image: Image.Image,
103
+ num_colors: int = 5
104
+ ) -> list:
105
+ """
106
+ Extract dominant colors from image.
107
+
108
+ Args:
109
+ image: PIL Image
110
+ num_colors: Number of dominant colors to extract
111
+
112
+ Returns:
113
+ List of color information dictionaries
114
+ """
115
+ # Resize image for faster processing
116
+ img = image.copy()
117
+ img.thumbnail((150, 150))
118
+
119
+ # Convert to RGB if necessary
120
+ if img.mode != 'RGB':
121
+ img = img.convert('RGB')
122
+
123
+ # Get all pixels
124
+ pixels = np.array(img).reshape(-1, 3)
125
+
126
+ # Count color occurrences
127
+ pixel_tuples = [tuple(pixel) for pixel in pixels]
128
+ color_counts = Counter(pixel_tuples)
129
+
130
+ # Get most common colors
131
+ most_common = color_counts.most_common(num_colors)
132
+
133
+ total_pixels = len(pixel_tuples)
134
+
135
+ colors = []
136
+ for rgb, count in most_common:
137
+ percentage = (count / total_pixels) * 100
138
+
139
+ color_info = {
140
+ "rgb": [int(rgb[0]), int(rgb[1]), int(rgb[2])],
141
+ "hex": f"#{rgb[0]:02x}{rgb[1]:02x}{rgb[2]:02x}",
142
+ "name": self._get_color_name(rgb),
143
+ "percentage": float(round(percentage, 2)),
144
+ "count": int(count),
145
+ }
146
+ colors.append(color_info)
147
+
148
+         return colors
+ 
+     def _calculate_brightness(self, rgb: tuple) -> float:
+         """
+         Calculate brightness of a color.
+ 
+         Args:
+             rgb: RGB tuple
+ 
+         Returns:
+             Brightness value (0-255)
+         """
+         r, g, b = rgb
+         # Perceived brightness formula
+         return (0.299 * r + 0.587 * g + 0.114 * b)
+ 
+     def _calculate_saturation(self, rgb: tuple) -> float:
+         """
+         Calculate saturation of a color.
+ 
+         Args:
+             rgb: RGB tuple
+ 
+         Returns:
+             Saturation value (0-1)
+         """
+         r, g, b = [x / 255.0 for x in rgb]
+         max_val = max(r, g, b)
+         min_val = min(r, g, b)
+ 
+         if max_val == 0:
+             return 0
+ 
+         return (max_val - min_val) / max_val
+ 
+     def analyze(
+         self,
+         media: Any,
+         media_path: Path
+     ) -> Dict[str, Any]:
+         """
+         Analyze colors in the image.
+ 
+         Args:
+             media: PIL Image or numpy array
+             media_path: Path to image file
+ 
+         Returns:
+             Dictionary with color analysis results
+         """
+         try:
+             # Validate input
+             if not self.validate_input(media):
+                 return {"error": "Invalid input type"}
+ 
+             # Convert to PIL Image if numpy array
+             if isinstance(media, np.ndarray):
+                 image = Image.fromarray(
+                     (media * 255).astype(np.uint8) if media.max() <= 1
+                     else media.astype(np.uint8)
+                 )
+             else:
+                 image = media
+ 
+             # Extract dominant colors
+             dominant_colors = self._extract_dominant_colors(
+                 image,
+                 num_colors=self.num_colors
+             )
+ 
+             # Calculate average brightness and saturation
+             avg_brightness = np.mean([
+                 self._calculate_brightness(tuple(c["rgb"]))
+                 for c in dominant_colors
+             ])
+ 
+             avg_saturation = np.mean([
+                 self._calculate_saturation(tuple(c["rgb"]))
+                 for c in dominant_colors
+             ])
+ 
+             # Determine overall color scheme
+             color_scheme = self._determine_color_scheme(dominant_colors)
+ 
+             result = {
+                 "dominant_colors": dominant_colors,
+                 "total_colors_analyzed": int(len(dominant_colors)),
+                 "average_brightness": float(round(avg_brightness, 2)),
+                 "average_saturation": float(round(avg_saturation, 2)),
+                 "color_scheme": color_scheme,
+                 "status": "success",
+             }
+ 
+             logger.debug(
+                 f"Color analysis complete: {len(dominant_colors)} colors found"
+             )
+ 
+             return result
+ 
+         except Exception as e:
+             logger.error(f"Color analysis failed: {e}")
+             return {
+                 "error": str(e),
+                 "status": "failed"
+             }
+ 
+     def _determine_color_scheme(self, colors: list) -> str:
+         """
+         Determine the overall color scheme.
+ 
+         Args:
+             colors: List of color dictionaries
+ 
+         Returns:
+             Color scheme description
+         """
+         if not colors:
+             return "unknown"
+ 
+         # Get color names
+         color_names = [c["name"] for c in colors]
+ 
+         # Check for monochrome (mostly gray/white/black)
+         grayscale = ["gray", "white", "black"]
+         if all(name in grayscale for name in color_names[:3]):
+             return "monochrome"
+ 
+         # Check for warm colors
+         warm = ["red", "orange", "yellow", "pink", "brown"]
+         warm_count = sum(1 for name in color_names if name in warm)
+         if warm_count >= len(color_names) * 0.6:
+             return "warm"
+ 
+         # Check for cool colors
+         cool = ["blue", "green", "cyan", "purple", "teal", "navy"]
+         cool_count = sum(1 for name in color_names if name in cool)
+         if cool_count >= len(color_names) * 0.6:
+             return "cool"
+ 
+         return "mixed"
+ 
+     def cleanup(self) -> None:
+         """Clean up resources."""
+         logger.info("ColorAnalyzerPlugin cleanup complete")
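The brightness and saturation formulas above are plain arithmetic and can be checked in isolation. A minimal standalone sketch (hypothetical helper names, mirroring `_calculate_brightness` and `_calculate_saturation` without the class):

```python
# Standalone versions of the perceived-brightness and saturation
# formulas used by ColorAnalyzerPlugin.
def brightness(rgb):
    r, g, b = rgb
    # Weighted sum of channels on a 0-255 scale
    return 0.299 * r + 0.587 * g + 0.114 * b

def saturation(rgb):
    r, g, b = [x / 255.0 for x in rgb]
    max_val, min_val = max(r, g, b), min(r, g, b)
    if max_val == 0:
        return 0.0  # pure black has no saturation
    return (max_val - min_val) / max_val

print(round(brightness((255, 255, 255))))  # 255 (white is maximally bright)
print(saturation((255, 0, 0)))             # 1.0 (pure red is fully saturated)
```

Note that the green channel dominates the brightness weighting, matching how the eye perceives luminance.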
plugins/loader.py ADDED
@@ -0,0 +1,318 @@
+ """
+ Plugin Loader
+ 
+ Dynamically loads and manages plugins.
+ """
+ 
+ import importlib
+ import inspect
+ from pathlib import Path
+ from typing import Dict, List, Type, Optional, Any
+ from loguru import logger
+ 
+ from plugins.base import BasePlugin, PluginMetadata
+ from core.exceptions import PluginError, PluginLoadError
+ 
+ 
+ class PluginLoader:
+     """
+     Load and manage plugins dynamically.
+ 
+     Discovers, loads, and manages plugin lifecycle.
+     """
+ 
+     def __init__(self, plugin_dir: Optional[Path] = None):
+         """
+         Initialize PluginLoader.
+ 
+         Args:
+             plugin_dir: Directory containing plugin modules
+         """
+         if plugin_dir is None:
+             plugin_dir = Path(__file__).parent
+ 
+         self.plugin_dir = Path(plugin_dir)
+         self.plugins: Dict[str, BasePlugin] = {}
+         self.plugin_classes: Dict[str, Type[BasePlugin]] = {}
+ 
+         logger.info(f"PluginLoader initialized with directory: {plugin_dir}")
+ 
+     def discover_plugins(self) -> List[str]:
+         """
+         Discover available plugins in the plugin directory.
+ 
+         Returns:
+             List of discovered plugin module names
+         """
+         discovered = []
+ 
+         # Look for Python files in the plugin directory
+         for file_path in self.plugin_dir.glob("*.py"):
+             # Skip __init__.py, base.py, loader.py
+             if file_path.stem in ["__init__", "base", "loader"]:
+                 continue
+ 
+             module_name = file_path.stem
+             discovered.append(module_name)
+             logger.debug(f"Discovered plugin module: {module_name}")
+ 
+         logger.info(f"Discovered {len(discovered)} plugin modules")
+         return discovered
+ 
+     def load_plugin_class(self, module_name: str) -> Optional[Type[BasePlugin]]:
+         """
+         Load a plugin class from a module.
+ 
+         Args:
+             module_name: Name of the module to load
+ 
+         Returns:
+             Plugin class or None if not found
+         """
+         try:
+             # Import the module
+             module = importlib.import_module(f"plugins.{module_name}")
+ 
+             # Find all classes that inherit from BasePlugin
+             for name, obj in inspect.getmembers(module, inspect.isclass):
+                 if (issubclass(obj, BasePlugin) and
+                         obj is not BasePlugin and
+                         obj.__module__ == module.__name__):
+                     logger.info(f"Loaded plugin class: {name} from {module_name}")
+                     return obj
+ 
+             logger.warning(f"No plugin class found in module: {module_name}")
+             return None
+ 
+         except Exception as e:
+             logger.error(f"Failed to load plugin module {module_name}: {e}")
+             raise PluginLoadError(
+                 f"Cannot load plugin module {module_name}: {str(e)}",
+                 {"module": module_name, "error": str(e)}
+             )
+ 
+     def load_plugin(
+         self,
+         plugin_name: str,
+         auto_initialize: bool = True
+     ) -> BasePlugin:
+         """
+         Load and optionally initialize a plugin.
+ 
+         Args:
+             plugin_name: Name of the plugin module
+             auto_initialize: Whether to automatically initialize the plugin
+ 
+         Returns:
+             Loaded plugin instance
+         """
+         try:
+             # Check if already loaded
+             if plugin_name in self.plugins:
+                 logger.info(f"Plugin {plugin_name} already loaded")
+                 return self.plugins[plugin_name]
+ 
+             # Load plugin class
+             plugin_class = self.load_plugin_class(plugin_name)
+ 
+             if plugin_class is None:
+                 raise PluginLoadError(
+                     f"No plugin class found in {plugin_name}",
+                     {"plugin": plugin_name}
+                 )
+ 
+             # Store plugin class
+             self.plugin_classes[plugin_name] = plugin_class
+ 
+             # Create instance
+             plugin_instance = plugin_class()
+ 
+             # Initialize if requested
+             if auto_initialize:
+                 plugin_instance.initialize()
+                 plugin_instance._initialized = True
+ 
+             # Store instance
+             self.plugins[plugin_instance.metadata.name] = plugin_instance
+ 
+             logger.info(
+                 f"Plugin loaded: {plugin_instance.metadata.name} "
+                 f"v{plugin_instance.metadata.version}"
+             )
+ 
+             return plugin_instance
+ 
+         except Exception as e:
+             logger.error(f"Failed to load plugin {plugin_name}: {e}")
+             raise PluginLoadError(
+                 f"Cannot load plugin {plugin_name}: {str(e)}",
+                 {"plugin": plugin_name, "error": str(e)}
+             )
+ 
+     def load_all_plugins(self, auto_initialize: bool = True) -> Dict[str, BasePlugin]:
+         """
+         Discover and load all available plugins.
+ 
+         Args:
+             auto_initialize: Whether to automatically initialize plugins
+ 
+         Returns:
+             Dictionary of loaded plugins
+         """
+         discovered = self.discover_plugins()
+ 
+         for module_name in discovered:
+             try:
+                 self.load_plugin(module_name, auto_initialize=auto_initialize)
+             except Exception as e:
+                 logger.error(f"Failed to load plugin {module_name}: {e}")
+                 # Continue loading other plugins
+                 continue
+ 
+         logger.info(f"Loaded {len(self.plugins)} plugins")
+         return self.plugins
+ 
+     def unload_plugin(self, plugin_name: str) -> None:
+         """
+         Unload a plugin and clean up resources.
+ 
+         Args:
+             plugin_name: Name of the plugin to unload
+         """
+         if plugin_name not in self.plugins:
+             logger.warning(f"Plugin {plugin_name} not loaded")
+             return
+ 
+         plugin = self.plugins[plugin_name]
+ 
+         # Clean up resources
+         try:
+             plugin.cleanup()
+         except Exception as e:
+             logger.error(f"Error during plugin cleanup: {e}")
+ 
+         # Remove from loaded plugins
+         del self.plugins[plugin_name]
+ 
+         logger.info(f"Plugin unloaded: {plugin_name}")
+ 
+     def unload_all_plugins(self) -> None:
+         """Unload all plugins."""
+         plugin_names = list(self.plugins.keys())
+ 
+         for plugin_name in plugin_names:
+             self.unload_plugin(plugin_name)
+ 
+         logger.info("All plugins unloaded")
+ 
+     def get_plugin(self, plugin_name: str) -> Optional[BasePlugin]:
+         """
+         Get a loaded plugin by name.
+ 
+         Args:
+             plugin_name: Name of the plugin
+ 
+         Returns:
+             Plugin instance or None
+         """
+         return self.plugins.get(plugin_name)
+ 
+     def list_plugins(self) -> List[PluginMetadata]:
+         """
+         List all loaded plugins.
+ 
+         Returns:
+             List of plugin metadata
+         """
+         return [plugin.metadata for plugin in self.plugins.values()]
+ 
+     def reload_plugin(self, plugin_name: str) -> BasePlugin:
+         """
+         Reload a plugin (unload then load).
+ 
+         Args:
+             plugin_name: Name of the plugin to reload
+ 
+         Returns:
+             Reloaded plugin instance
+         """
+         logger.info(f"Reloading plugin: {plugin_name}")
+ 
+         # Find the module name
+         module_name = None
+         for name, plugin in self.plugins.items():
+             if name == plugin_name:
+                 module_name = plugin.__class__.__module__.split(".")[-1]
+                 break
+ 
+         if module_name is None:
+             raise PluginError(
+                 f"Plugin {plugin_name} not found",
+                 {"plugin": plugin_name}
+             )
+ 
+         # Unload
+         self.unload_plugin(plugin_name)
+ 
+         # Reload module
+         importlib.reload(
+             importlib.import_module(f"plugins.{module_name}")
+         )
+ 
+         # Load again
+         return self.load_plugin(module_name)
+ 
+     def get_plugins_by_category(self, category: str) -> List[BasePlugin]:
+         """
+         Get all plugins in a specific category.
+ 
+         Args:
+             category: Plugin category
+ 
+         Returns:
+             List of plugins in the category
+         """
+         return [
+             plugin for plugin in self.plugins.values()
+             if plugin.metadata.category == category
+         ]
+ 
+     def get_enabled_plugins(self) -> List[BasePlugin]:
+         """
+         Get all enabled plugins.
+ 
+         Returns:
+             List of enabled plugins
+         """
+         return [
+             plugin for plugin in self.plugins.values()
+             if plugin.is_enabled()
+         ]
+ 
+     def get_plugin_info(self) -> Dict[str, Dict[str, Any]]:
+         """
+         Get information about all loaded plugins.
+ 
+         Returns:
+             Dictionary with plugin information
+         """
+         info = {}
+ 
+         for name, plugin in self.plugins.items():
+             info[name] = {
+                 "name": plugin.metadata.name,
+                 "version": plugin.metadata.version,
+                 "description": plugin.metadata.description,
+                 "author": plugin.metadata.author,
+                 "category": plugin.metadata.category,
+                 "enabled": plugin.is_enabled(),
+                 "initialized": plugin.is_initialized(),
+                 "priority": plugin.metadata.priority,
+             }
+ 
+         return info
+ 
+     def __repr__(self) -> str:
+         """String representation."""
+         return f"PluginLoader(plugins={len(self.plugins)})"
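The subclass filter in `load_plugin_class` can be exercised without the real plugin package. A toy sketch (stub `Base` class and a synthetic module, standing in for `BasePlugin` and an imported plugin module):

```python
# Toy reproduction of PluginLoader's class-discovery filter: find classes
# in a module that subclass a base, excluding the base itself and any
# names merely imported from elsewhere.
import inspect
import types

class Base:
    pass

class MyPlugin(Base):
    pass

# Build a synthetic module the way importlib would expose a real one.
mod = types.ModuleType("toy_plugin")
MyPlugin.__module__ = "toy_plugin"
mod.MyPlugin = MyPlugin
mod.Base = Base  # imported base class; the filter must skip it

found = [
    obj for _, obj in inspect.getmembers(mod, inspect.isclass)
    if issubclass(obj, Base) and obj is not Base
    and obj.__module__ == mod.__name__
]
print(found)  # [<class 'toy_plugin.MyPlugin'>]
```

The `obj.__module__ == module.__name__` check is what prevents the loader from picking up `BasePlugin` itself (or plugin classes re-imported from another module) as a discovered plugin.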
plugins/object_detector.py ADDED
@@ -0,0 +1,258 @@
+ """
+ Object Detector Plugin
+ 
+ Detects objects in images using CLIP model.
+ """
+ 
+ from typing import Dict, Any, List
+ from pathlib import Path
+ import numpy as np
+ from PIL import Image
+ from loguru import logger
+ 
+ from plugins.base import BasePlugin, PluginMetadata
+ 
+ 
+ class ObjectDetectorPlugin(BasePlugin):
+     """
+     Detect objects in images using CLIP.
+ 
+     Uses zero-shot classification to identify objects
+     without requiring training data.
+     """
+ 
+     def __init__(self):
+         """Initialize ObjectDetectorPlugin."""
+         super().__init__()
+         self.model = None
+         self.processor = None
+         self.candidate_labels = [
+             "person", "people", "man", "woman", "child", "baby",
+             "dog", "cat", "bird", "animal",
+             "car", "vehicle", "bicycle", "motorcycle",
+             "building", "house", "tree", "plant", "flower",
+             "food", "plate", "cup", "bottle",
+             "computer", "phone", "keyboard", "screen",
+             "furniture", "chair", "table", "bed",
+             "nature", "landscape", "mountain", "ocean", "beach",
+             "sky", "cloud", "sunset", "sunrise",
+             "indoor", "outdoor", "room", "street",
+         ]
+ 
+     @property
+     def metadata(self) -> PluginMetadata:
+         """Return plugin metadata."""
+         return PluginMetadata(
+             name="object_detector",
+             version="0.1.0",
+             description="Detects objects using CLIP zero-shot classification",
+             author="AI Dev Collective",
+             requires=["transformers", "torch"],
+             category="detection",
+             priority=10,
+         )
+ 
+     def initialize(self) -> None:
+         """Initialize the plugin and load CLIP model."""
+         try:
+             # Import here to avoid loading if plugin is not used
+             from transformers import CLIPProcessor, CLIPModel
+             import torch
+ 
+             logger.info("Loading CLIP model...")
+ 
+             model_name = "openai/clip-vit-base-patch32"
+ 
+             # Load model and processor
+             self.model = CLIPModel.from_pretrained(model_name)
+             self.processor = CLIPProcessor.from_pretrained(model_name)
+ 
+             # Set to eval mode
+             self.model.eval()
+ 
+             # Move to CPU (GPU support can be added later)
+             device = "cpu"
+             self.model.to(device)
+ 
+             self._initialized = True
+ 
+             logger.info(f"CLIP model loaded successfully on {device}")
+ 
+         except Exception as e:
+             logger.error(f"Failed to initialize ObjectDetectorPlugin: {e}")
+             raise
+ 
+     def _detect_objects(
+         self,
+         image: Image.Image,
+         labels: List[str],
+         threshold: float = 0.3
+     ) -> List[Dict[str, Any]]:
+         """
+         Detect objects in image using CLIP.
+ 
+         Args:
+             image: PIL Image
+             labels: List of candidate labels
+             threshold: Confidence threshold
+ 
+         Returns:
+             List of detected objects
+         """
+         import torch
+ 
+         # Prepare inputs
+         inputs = self.processor(
+             text=labels,
+             images=image,
+             return_tensors="pt",
+             padding=True
+         )
+ 
+         # Get predictions
+         with torch.no_grad():
+             outputs = self.model(**inputs)
+             logits_per_image = outputs.logits_per_image
+             probs = logits_per_image.softmax(dim=1)[0]
+ 
+         # Filter by threshold and sort
+         detected = []
+         for idx, (label, prob) in enumerate(zip(labels, probs)):
+             confidence = float(prob)
+             if confidence >= threshold:
+                 detected.append({
+                     "name": label,
+                     "confidence": round(confidence, 4),
+                     "index": idx,
+                 })
+ 
+         # Sort by confidence
+         detected.sort(key=lambda x: x["confidence"], reverse=True)
+ 
+         return detected
+ 
+     def analyze(
+         self,
+         media: Any,
+         media_path: Path
+     ) -> Dict[str, Any]:
+         """
+         Detect objects in the image.
+ 
+         Args:
+             media: PIL Image or numpy array
+             media_path: Path to image file
+ 
+         Returns:
+             Dictionary with detected objects
+         """
+         try:
+             # Check if initialized
+             if not self._initialized:
+                 self.initialize()
+ 
+             # Validate input
+             if not self.validate_input(media):
+                 return {"error": "Invalid input type"}
+ 
+             # Convert to PIL Image if numpy array
+             if isinstance(media, np.ndarray):
+                 image = Image.fromarray(
+                     (media * 255).astype(np.uint8) if media.max() <= 1
+                     else media.astype(np.uint8)
+                 )
+             else:
+                 image = media
+ 
+             # Detect objects
+             objects = self._detect_objects(
+                 image,
+                 self.candidate_labels,
+                 threshold=0.15
+             )
+ 
+             # Get top objects
+             top_objects = objects[:10]
+ 
+             # Categorize objects
+             categories = self._categorize_objects(top_objects)
+ 
+             result = {
+                 "objects": top_objects,
+                 "total_detected": len(objects),
+                 "categories": categories,
+                 "candidate_labels_count": len(self.candidate_labels),
+                 "status": "success",
+             }
+ 
+             logger.debug(
+                 f"Object detection complete: {len(top_objects)} objects found"
+             )
+ 
+             return result
+ 
+         except Exception as e:
+             logger.error(f"Object detection failed: {e}")
+             return {
+                 "error": str(e),
+                 "status": "failed"
+             }
+ 
+     def _categorize_objects(
+         self,
+         objects: List[Dict[str, Any]]
+     ) -> Dict[str, List[str]]:
+         """
+         Categorize detected objects.
+ 
+         Args:
+             objects: List of detected objects
+ 
+         Returns:
+             Dictionary of categories
+         """
+         categories = {
+             "people": [],
+             "animals": [],
+             "vehicles": [],
+             "nature": [],
+             "objects": [],
+             "places": [],
+         }
+ 
+         for obj in objects:
+             name = obj["name"]
+ 
+             if name in ["person", "people", "man", "woman", "child", "baby"]:
+                 categories["people"].append(name)
+             elif name in ["dog", "cat", "bird", "animal"]:
+                 categories["animals"].append(name)
+             elif name in ["car", "vehicle", "bicycle", "motorcycle"]:
+                 categories["vehicles"].append(name)
+             elif name in ["tree", "plant", "flower", "nature", "landscape",
+                           "mountain", "ocean", "beach"]:
+                 categories["nature"].append(name)
+             elif name in ["indoor", "outdoor", "room", "street", "building",
+                           "house"]:
+                 categories["places"].append(name)
+             else:
+                 categories["objects"].append(name)
+ 
+         # Remove empty categories
+         categories = {k: v for k, v in categories.items() if v}
+ 
+         return categories
+ 
+     def cleanup(self) -> None:
+         """Clean up model resources."""
+         if self.model is not None:
+             del self.model
+             self.model = None
+ 
+         if self.processor is not None:
+             del self.processor
+             self.processor = None
+ 
+         logger.info("ObjectDetectorPlugin cleanup complete")
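The threshold-and-sort step in `_detect_objects` is independent of the model itself. A sketch with toy softmax scores (no CLIP call; label names and probabilities are illustrative):

```python
# Filter candidate labels by a confidence threshold, then sort descending
# by confidence — the same post-processing _detect_objects applies to
# CLIP's softmax output.
labels = ["dog", "cat", "car"]
probs = [0.08, 0.62, 0.30]   # toy scores, as if from softmax
threshold = 0.15

detected = [
    {"name": label, "confidence": round(p, 4), "index": idx}
    for idx, (label, p) in enumerate(zip(labels, probs))
    if p >= threshold
]
detected.sort(key=lambda x: x["confidence"], reverse=True)

print([d["name"] for d in detected])  # ['cat', 'car'] — "dog" falls below 0.15
```

Because CLIP's softmax spreads probability mass over all candidate labels, the plugin lowers the threshold to 0.15 at the call site in `analyze` rather than using the 0.3 default.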
requirements.txt ADDED
@@ -0,0 +1,11 @@
+ gradio==4.44.0
+ numpy>=1.24.0
+ pillow>=10.0.0
+ opencv-python>=4.8.0
+ loguru>=0.7.0
+ 
+ # Optional: For ML plugins (Object Detector & Caption Generator)
+ # Uncomment these lines to enable heavy ML features
+ # transformers>=4.30.0
+ # torch>=2.0.0
+ # torchvision>=0.15.0