Salman Abjam committed
Commit eb5a9e1 · 0 Parent(s)

Initial deployment: DeepVision Prompt Builder v0.1.0
.gitignore ADDED
@@ -0,0 +1,34 @@
# .gitignore for Hugging Face Space

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/

# Gradio cache
flagged/
gradio_cached_examples/

# Models cache
.cache/
models/

# Temporary files
*.tmp
*.log
test_results.json

# System files
.DS_Store
Thumbs.db

# IDE
.vscode/
.idea/
*.swp
*.swo
DEPLOYMENT_GUIDE.md ADDED
@@ -0,0 +1,204 @@
# 🚀 Hugging Face Spaces Deployment Guide

## Prerequisites

1. **Hugging Face Account**
   - Create an account at: https://huggingface.co/join
   - Verify your email

2. **Git LFS** (for large model files)
   ```bash
   git lfs install
   ```

---

## Step 1: Create New Space

1. Go to: https://huggingface.co/new-space
2. Fill in the details:
   - **Name**: `deepvision-prompt-builder`
   - **License**: MIT
   - **SDK**: Gradio
   - **SDK Version**: 4.44.0
   - **Hardware**: CPU Basic (free) or GPU (paid)
   - **Visibility**: Public or Private
3. Click "Create Space"

---

## Step 2: Clone and Setup

```bash
# Clone your space
git clone https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder
cd deepvision-prompt-builder

# Copy files from this directory
cp -r /path/to/huggingface_space/* .

# Add all files
git add .

# Commit
git commit -m "Initial deployment of DeepVision v0.1.0"

# Push to Hugging Face
git push
```

---

## Step 3: Directory Structure

Your Space should have this structure:

```
deepvision-prompt-builder/
├── app.py                 # Main Gradio application
├── README.md              # Space description (auto-displayed)
├── requirements.txt       # Python dependencies
├── .gitignore             # Git ignore rules
├── core/                  # Core engine
│   ├── __init__.py
│   ├── engine.py
│   ├── image_processor.py
│   ├── video_processor.py
│   ├── result_manager.py
│   ├── config.py
│   ├── exceptions.py
│   └── logging_config.py
└── plugins/               # Plugin system
    ├── __init__.py
    ├── base.py
    ├── loader.py
    ├── color_analyzer.py
    ├── object_detector.py
    └── caption_generator.py
```
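Before pushing, the layout can be sanity-checked programmatically. A minimal sketch — the `missing_files` helper and its file list are illustrative (they mirror the tree above) and are not part of the project:

```python
from pathlib import Path

# A few files the Space build expects, per the tree above.
REQUIRED = [
    "app.py",
    "README.md",
    "requirements.txt",
    "core/engine.py",
    "plugins/loader.py",
]


def missing_files(root: str) -> list[str]:
    """Return the required files that are absent under `root`."""
    base = Path(root)
    return [f for f in REQUIRED if not (base / f).exists()]
```

Running `missing_files(".")` inside the cloned Space should return an empty list before you `git push`.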

---

## Step 4: Test Locally First

Before deploying, test locally:

```bash
# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```

Open a browser at: http://localhost:7860

Test with a sample image to ensure everything works.

---

## Step 5: Monitor Deployment

After pushing:
1. Go to your Space URL: `https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder`
2. Watch the build logs in the "Logs" tab
3. Wait for "Running" status (green)
4. Test the live app

---

## Step 6: Configure Settings (Optional)

### Enable GPU (Paid)
1. Go to Space Settings
2. Select "GPU" hardware
3. Choose: T4 small ($0.60/hour) or A10G large ($3.15/hour)
4. Click "Save"

### Set Environment Variables
```bash
# In Space Settings → Environment Variables
HF_HOME=/data/.cache
TRANSFORMERS_CACHE=/data/.cache
```
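The same variables can also be set from Python at the top of the app, before any model library is imported. A sketch, assuming the Space has persistent storage mounted at `/data`:

```python
import os

# Point the Hugging Face caches at persistent storage so model
# downloads survive Space restarts. This must run before importing
# transformers, which reads these variables at import time.
os.environ.setdefault("HF_HOME", "/data/.cache")
os.environ.setdefault("TRANSFORMERS_CACHE", "/data/.cache")
```

`setdefault` keeps any value already set in Space Settings, so the two configuration paths don't fight each other.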

---

## Troubleshooting

### Build Fails
- Check requirements.txt for typos
- Verify all imports are correct
- Check the logs for specific errors

### Out of Memory
- Reduce the number of frames for video
- Disable heavy plugins (Object Detector, Caption Generator)
- Upgrade to GPU hardware

### Slow Performance
- The first run downloads models (~2-5GB) - this is normal
- Subsequent runs use cached models
- Consider a GPU upgrade for production use

---

## Cost Optimization

### Free Tier (CPU Basic)
- ✅ Color Analyzer works great
- ⚠️ Object Detector & Caption Generator are slow
- 📊 Suitable for demos and light usage

### Paid Tier (GPU)
- 💰 T4 Small: $0.60/hour (~$432/month if always on)
- 💰 A10G Large: $3.15/hour (~$2,268/month if always on)
- 💡 Use the "Pause Space" feature when not in use to save costs

---

## Update Space

```bash
# Make changes locally
# Test locally

# Commit and push
git add .
git commit -m "Update: description of changes"
git push

# Space will auto-rebuild
```

---

## Custom Domain (Optional)

Hugging Face Spaces provides:
- Default URL: `YOUR_USERNAME-deepvision-prompt-builder.hf.space`
- You can add a custom domain in Space Settings

---

## Next Steps

After successful deployment:
1. ✅ Share your Space URL
2. 📝 Write a blog post announcement
3. 🎥 Create a demo video
4. 📊 Monitor usage analytics
5. 🐛 Collect user feedback

---

## Resources

- 📚 [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
- 🎨 [Gradio Documentation](https://gradio.app/docs)
- 💬 [Community Forum](https://discuss.huggingface.co/)

---

**Ready to deploy! 🚀**
DEPLOY_NOW.md ADDED
@@ -0,0 +1,201 @@
# 🚀 Hugging Face Spaces Deployment Instructions

## ✅ Prerequisites Completed:
- ✅ Git repository initialized
- ✅ All code files ready
- ✅ Local testing passed

---

## 📋 Step-by-Step Deployment Guide

### Step 1: Create Hugging Face Account (if needed)
1. Go to: https://huggingface.co/join
2. Sign up with email or GitHub
3. Verify your email address

---

### Step 2: Create New Space

1. **Go to**: https://huggingface.co/new-space

2. **Fill in the details**:
   ```
   Owner: YOUR_USERNAME
   Space name: deepvision-prompt-builder
   License: MIT
   Select SDK: Gradio
   SDK Version: 4.44.0
   Space hardware: CPU basic (free)
   Visibility: Public
   ```

3. **Click**: "Create Space"

---

### Step 3: Clone Your New Space

```powershell
# Navigate to the parent directory
cd "E:\Ai\Projects\BRAINixIDEX\ThinkTank DVP"

# Clone your space (replace YOUR_USERNAME)
git clone https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder
```

---

### Step 4: Copy Files to the Cloned Space

```powershell
# Copy all files from huggingface_space to the cloned space
Copy-Item -Path "huggingface_space\*" -Destination "deepvision-prompt-builder\" -Recurse -Force

# Navigate to the cloned space
cd deepvision-prompt-builder
```

---

### Step 5: Configure Git and Push

```powershell
# Add all files
git add .

# Commit
git commit -m "Initial deployment: DeepVision Prompt Builder v0.1.0

Features:
- Gradio web interface
- Color Analyzer plugin (fast)
- Image and video support
- JSON output
- Real-time analysis"

# Push to Hugging Face
git push
```

**Note**: You'll be prompted for Hugging Face credentials:
- Username: your HF username
- Password: use an **HF Access Token** (not your account password) — see Step 6

---

### Step 6: Get a Hugging Face Access Token

1. Go to: https://huggingface.co/settings/tokens
2. Click "New token"
3. Name: `DeepVision Deploy`
4. Role: `write`
5. Click "Generate token"
6. **Copy the token** (you'll need it for `git push`)

---

### Step 7: Monitor Deployment

1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder`
2. Click the "Logs" tab
3. Wait for the build to complete (usually 2-5 minutes)
4. The status will change to "Running" (green)

---

### Step 8: Test Your Deployed App

1. Open your Space URL
2. Upload a test image
3. Enable the Color Analyzer
4. Click Analyze
5. Verify the results

---

## 🎉 Alternative: Quick Deploy (Manual Upload)

If you don't want to use Git:

1. Create the Space on Hugging Face
2. Click the "Files" tab
3. Click "Add file" → "Upload files"
4. Drag and drop ALL files from the `huggingface_space/` folder
5. Wait for the upload to complete
6. The Space will auto-rebuild

---

## 📊 Expected Results

**Build Time**: 2-5 minutes
**Startup Time**: 10-30 seconds
**URL**: `https://YOUR_USERNAME-deepvision-prompt-builder.hf.space`

---

## ⚙️ Configuration Options

### Enable GPU (Optional - Paid)

1. Go to Space Settings
2. Change Hardware to:
   - `CPU basic` (free) ✅ Recommended for demos
   - `T4 small` ($0.60/hour) - for faster ML plugins
   - `A10G large` ($3.15/hour) - for heavy usage

### Set as Private

1. Go to Space Settings
2. Change Visibility to "Private"
3. Only you can access it

---

## 🐛 Troubleshooting

### Build fails
- Check the logs for errors
- Verify all files are uploaded
- Check requirements.txt syntax

### App doesn't start
- Make sure the app listens on port 7860 (the HF Spaces default)
- Verify app.py has no syntax errors
- Check the logs for Python errors

### Slow performance
- Normal on the free CPU tier
- ML plugins (Object Detector, Caption Generator) will be slow
- The Color Analyzer should stay fast

---

## 📝 Post-Deployment

After successful deployment:

1. ✅ Test with multiple images
2. ✅ Share the URL
3. ✅ Update README.md with the live demo link
4. ✅ Monitor usage analytics
5. ✅ Collect user feedback

---

## 🔗 Useful Links

- **Your Space**: https://huggingface.co/spaces/YOUR_USERNAME/deepvision-prompt-builder
- **HF Docs**: https://huggingface.co/docs/hub/spaces
- **Gradio Docs**: https://gradio.app/docs
- **Support**: https://discuss.huggingface.co/

---

**Ready to deploy! 🚀**

Choose your method:
1. **Git Method** (recommended): follow Steps 1-8
2. **Manual Upload**: use the Alternative method above
README.md ADDED
@@ -0,0 +1,138 @@
---
title: DeepVision Prompt Builder
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
python_version: "3.10"
---

# 🎯 DeepVision Prompt Builder

**AI-Powered Image & Video Analysis with Automatic JSON Prompt Generation**

## Overview

DeepVision is a modular AI system that analyzes images and videos to generate structured JSON prompts. Perfect for:
- 📸 Automated image tagging
- 🎬 Video content analysis
- 🤖 AI training data preparation
- 📊 Media cataloging
- 🎨 Creative prompt generation

## Features

### Available Plugins

- **🎨 Color Analyzer** (Fast): Extracts dominant colors, color schemes, brightness, and saturation
- **🔍 Object Detector** (CLIP): Zero-shot object detection with confidence scores
- **💬 Caption Generator** (BLIP-2): Natural-language image descriptions

### Supported Formats

- **Images**: JPG, PNG, WebP, BMP, GIF
- **Videos**: MP4, AVI, MOV, MKV

## Usage

1. Upload an image or video file
2. Select which analysis plugins to use
3. Click "Analyze" to process
4. View results in formatted or JSON form
5. Download the JSON output for use in other systems

## Performance Notes

- **Color Analyzer**: ~1-2 seconds per image, lightweight
- **Object Detector**: First use downloads a ~2GB CLIP model, then ~5-10 seconds per image
- **Caption Generator**: First use downloads a ~2-5GB BLIP-2 model, then ~8-15 seconds per image
- **Video Analysis**: Processes N keyframes (configurable, 1-20 frames)
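Evenly spaced keyframe sampling of this kind can be sketched as follows. The helper name and strategy below are ours, illustrative only; the project's actual selection logic may differ:

```python
def keyframe_indices(total_frames: int, num_frames: int) -> list[int]:
    """Pick `num_frames` frame indices spread evenly across a video."""
    num_frames = max(1, min(num_frames, total_frames))
    step = total_frames / num_frames
    # Sample the middle of each segment to avoid duplicate endpoints.
    return [int(step * i + step / 2) for i in range(num_frames)]
```

For a 300-frame clip with `num_frames=5`, this yields the indices `[30, 90, 150, 210, 270]`.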

## Example Output

```json
{
  "results": {
    "color_analyzer": {
      "dominant_colors": [
        {"color": [45, 85, 125], "percentage": 35.2, "name": "blue"}
      ],
      "color_scheme": "cool",
      "average_brightness": 128.5,
      "average_saturation": 0.65
    }
  },
  "metadata": {
    "file": {
      "filename": "example.jpg",
      "size_mb": 2.4,
      "width": 1920,
      "height": 1080
    },
    "processing": {
      "duration_seconds": 1.234,
      "plugins_used": ["color_analyzer"]
    }
  }
}
```
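Downstream code can consume this output directly. A minimal sketch — the `top_color` helper is ours, assuming a result dict shaped like the example above:

```python
import json


def top_color(result_json: str) -> str:
    """Return the name of the most dominant color in a DeepVision result."""
    data = json.loads(result_json)
    colors = data["results"]["color_analyzer"]["dominant_colors"]
    # Each entry carries a `percentage`; take the largest.
    best = max(colors, key=lambda c: c["percentage"])
    return best["name"]
```

Applied to the example output above, `top_color` returns `"blue"`.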

## Technology Stack

- **Framework**: Python 3.10+
- **UI**: Gradio 4.44+
- **CV**: OpenCV, PIL, NumPy
- **AI Models**: CLIP, BLIP-2 (via Hugging Face Transformers)
- **Logging**: Loguru

## Architecture

DeepVision uses a plugin-based architecture:
- **Core Engine**: Orchestrates the analysis pipeline
- **Plugin System**: Modular, extensible analysis components
- **Result Manager**: Aggregates and formats outputs
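In outline, a plugin architecture of this shape reduces to a common interface plus a registry loop. The sketch below is illustrative; the project's real base class lives in `plugins/base.py` and may differ:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class AnalysisPlugin(ABC):
    """Common interface every analysis plugin implements (sketch)."""

    name: str = "base"
    enabled: bool = True

    @abstractmethod
    def analyze(self, image: Any) -> Dict[str, Any]:
        """Return this plugin's results for one image."""


class MeanBrightnessPlugin(AnalysisPlugin):
    """Toy plugin: average brightness of a sequence of grayscale pixels."""

    name = "mean_brightness"

    def analyze(self, image: Any) -> Dict[str, Any]:
        pixels = list(image)
        return {"average_brightness": sum(pixels) / len(pixels)}


def run_plugins(plugins, image):
    """The engine core in miniature: run every enabled plugin, merge results."""
    return {p.name: p.analyze(image) for p in plugins if p.enabled}
```

New analyses slot in by subclassing the interface; the engine loop and the result schema stay unchanged.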

## Local Development

```bash
# Clone the repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/deepvision
cd deepvision

# Install dependencies
pip install -r requirements.txt

# Run locally
python app.py
```

## License

MIT License - free to use and modify

## Credits

**Built by AI Dev Collective v9.0**
- Astro (Lead Developer)
- Lyra (Research)
- Nexus (Code Quality)
- CryptoX (Security)
- NOVA (UI/UX)
- Echo (Performance)
- Sage (Documentation)
- Pulse (DevOps)

## Links

- 📚 [Full Documentation](https://github.com/yourusername/deepvision)
- 🐛 [Report Issues](https://github.com/yourusername/deepvision/issues)
- 💡 [Feature Requests](https://github.com/yourusername/deepvision/discussions)

---

**Version**: 0.1.0
**Last Updated**: January 2025
app.py ADDED
@@ -0,0 +1,349 @@
"""
DeepVision Prompt Builder - Gradio Interface
Hugging Face Spaces Deployment

This is the main Gradio application for the DeepVision Prompt Builder.
It provides a web interface for uploading images/videos and viewing analysis results.
"""

import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Tuple

import gradio as gr
from loguru import logger

# Import core components
from core.engine import AnalysisEngine
from core.logging_config import setup_logging
from plugins.loader import PluginLoader

# Setup logging
setup_logging()


class DeepVisionGradioApp:
    """Gradio web interface for the DeepVision Prompt Builder."""

    def __init__(self):
        """Initialize the Gradio app."""
        self.engine = AnalysisEngine()
        self.plugin_loader = PluginLoader()
        self.setup_plugins()
        logger.info("DeepVision Gradio App initialized")

    def setup_plugins(self):
        """Load and register all available plugins."""
        try:
            # Load all plugins
            plugins = self.plugin_loader.load_all_plugins()

            # Register plugins with the engine
            for name, plugin in plugins.items():
                self.engine.register_plugin(name, plugin)
                logger.info(f"Plugin registered: {name}")

            logger.success(f"Loaded {len(plugins)} plugins successfully")
        except Exception as e:
            logger.error(f"Error loading plugins: {e}")

    def analyze_media(
        self,
        file_path: str,
        use_color_analyzer: bool = True,
        use_object_detector: bool = False,
        use_caption_generator: bool = False,
        num_frames: int = 5
    ) -> Tuple[str, str]:
        """
        Analyze an uploaded image or video.

        Args:
            file_path: Path to the uploaded file
            use_color_analyzer: Enable color analysis
            use_object_detector: Enable object detection (heavy)
            use_caption_generator: Enable caption generation (heavy)
            num_frames: Number of frames to extract from a video

        Returns:
            Tuple of (formatted results text, JSON string)
        """
        try:
            logger.info(f"Analyzing file: {file_path}")

            # Enable/disable plugins based on user selection
            self._configure_plugins(
                use_color_analyzer,
                use_object_detector,
                use_caption_generator
            )

            # Detect the file type and analyze
            file_path_obj = Path(file_path)

            if file_path_obj.suffix.lower() in ['.mp4', '.avi', '.mov', '.mkv']:
                # Video analysis
                logger.info(f"Processing video with {num_frames} frames")
                results = self.engine.analyze_video(
                    file_path,
                    extract_method="keyframes",
                    num_frames=num_frames
                )
            else:
                # Image analysis
                logger.info("Processing image")
                results = self.engine.analyze_image(file_path)

            # Format results for display
            formatted_text = self._format_results(results)
            json_output = json.dumps(results, indent=2, ensure_ascii=False)

            logger.success("Analysis completed successfully")
            return formatted_text, json_output

        except Exception as e:
            logger.error(f"Analysis error: {e}")
            error_msg = f"❌ Error: {str(e)}"
            error_json = json.dumps({"error": str(e)}, indent=2)
            return error_msg, error_json

    def _configure_plugins(
        self,
        use_color: bool,
        use_object: bool,
        use_caption: bool
    ):
        """Enable/disable plugins based on user selection."""
        plugin_config = {
            'color_analyzer': use_color,
            'object_detector': use_object,
            'caption_generator': use_caption
        }

        for plugin_name, enabled in plugin_config.items():
            if plugin_name in self.engine.plugins:
                self.engine.plugins[plugin_name].enabled = enabled
                logger.info(f"Plugin '{plugin_name}': {'enabled' if enabled else 'disabled'}")

    def _format_results(self, results: Dict[str, Any]) -> str:
        """Format analysis results as readable Markdown."""
        lines = ["# 🎯 Analysis Results\n"]

        # File metadata
        if "metadata" in results and "file" in results["metadata"]:
            meta = results["metadata"]["file"]
            lines.append("## 📁 File Information")
            lines.append(f"- **Filename**: {meta.get('filename', 'N/A')}")
            lines.append(f"- **Type**: {meta.get('type', 'N/A')}")
            lines.append(f"- **Size**: {meta.get('size_mb', 0):.2f} MB")

            if meta.get('type') == 'video':
                lines.append(f"- **Resolution**: {meta.get('width')}x{meta.get('height')}")
                lines.append(f"- **Duration**: {meta.get('duration', 0):.2f} seconds")
                lines.append(f"- **FPS**: {meta.get('fps', 0):.2f}")
            else:
                lines.append(f"- **Resolution**: {meta.get('width')}x{meta.get('height')}")
            lines.append("")

        # Processing info
        if "metadata" in results and "processing" in results["metadata"]:
            proc = results["metadata"]["processing"]
            lines.append("## ⚡ Processing Information")
            lines.append(f"- **Duration**: {proc.get('duration_seconds', 0):.3f} seconds")
            lines.append(f"- **Plugins Used**: {', '.join(proc.get('plugins_used', []))}")
            if proc.get('frames_extracted'):
                lines.append(f"- **Frames Analyzed**: {proc.get('frames_extracted')}")
            lines.append("")

        # Analysis results
        if "results" in results:
            res = results["results"]

            # For videos
            if "frames" in res:
                lines.append(f"## 🎬 Video Analysis ({len(res['frames'])} frames)")

                # Summary
                if "summary" in res:
                    for plugin_name, summary_data in res["summary"].items():
                        lines.append(f"\n### {plugin_name.replace('_', ' ').title()}")
                        lines.append(f"```json\n{json.dumps(summary_data, indent=2, ensure_ascii=False)}\n```")

            # For images
            else:
                lines.append("## 🖼️ Image Analysis")
                for plugin_name, plugin_data in res.items():
                    lines.append(f"\n### {plugin_name.replace('_', ' ').title()}")
                    lines.append(f"```json\n{json.dumps(plugin_data, indent=2, ensure_ascii=False)}\n```")

        return "\n".join(lines)

    def create_interface(self) -> gr.Blocks:
        """Create and return the Gradio interface."""

        with gr.Blocks(
            title="DeepVision Prompt Builder",
            theme="soft",
            css="""
            .output-text { font-family: 'Courier New', monospace; }
            .json-output { font-size: 12px; }
            """
        ) as demo:

            # Header
            gr.Markdown("""
            # 🎯 DeepVision Prompt Builder
            ### AI-Powered Image & Video Analysis with JSON Prompt Generation

            Upload an image or video to analyze its content and generate structured JSON prompts.
            """)

            with gr.Row():
                with gr.Column(scale=1):
                    # Input section
                    gr.Markdown("## 📤 Upload Media")

                    file_input = gr.File(
                        label="Upload Image or Video",
                        file_types=["image", "video"],
                        type="filepath"
                    )

                    gr.Markdown("### 🔌 Plugin Configuration")

                    color_checkbox = gr.Checkbox(
                        label="🎨 Color Analyzer (Fast)",
                        value=True,
                        info="Extract dominant colors and color schemes"
                    )

                    object_checkbox = gr.Checkbox(
                        label="🔍 Object Detector (Slow - CLIP)",
                        value=False,
                        info="Detect objects using the CLIP model (~2-5GB download)"
                    )

                    caption_checkbox = gr.Checkbox(
                        label="💬 Caption Generator (Slow - BLIP-2)",
                        value=False,
                        info="Generate image captions (~2-5GB download)"
                    )

                    frames_slider = gr.Slider(
                        minimum=1,
                        maximum=20,
                        value=5,
                        step=1,
                        label="📹 Video Frames to Extract",
                        info="More frames = more accurate but slower"
                    )

                    analyze_btn = gr.Button(
                        "🚀 Analyze",
                        variant="primary",
                        size="lg"
                    )

                with gr.Column(scale=2):
                    # Output section
                    gr.Markdown("## 📊 Analysis Results")

                    with gr.Tabs():
                        with gr.Tab("📝 Formatted"):
                            output_text = gr.Markdown(
                                label="Results",
                                elem_classes=["output-text"]
                            )

                        with gr.Tab("📋 JSON"):
                            output_json = gr.Code(
                                label="JSON Output",
                                language="json",
                                elem_classes=["json-output"],
                                lines=20
                            )

                    download_btn = gr.DownloadButton(
                        label="💾 Download JSON",
                        visible=False
                    )

            # Examples
            gr.Markdown("## 💡 Example Usage")
            gr.Markdown("""
            1. **Quick Test**: Upload an image with only the Color Analyzer enabled
            2. **Full Analysis**: Enable all plugins (requires model downloads)
            3. **Video Analysis**: Upload a video and adjust the frame count

            **Note**: First-time use of the Object Detector and Caption Generator will download ~2-5GB of models.
            """)

            # Footer
            gr.Markdown("""
            ---
            **DeepVision Prompt Builder v0.1.0** | Built with ❤️ by AI Dev Collective

            📚 [Documentation](https://github.com/yourusername/deepvision) |
            🐛 [Report Issues](https://github.com/yourusername/deepvision/issues)
            """)

            # Event handlers
            def analyze_and_prepare_download(file, color, obj, cap, frames):
                """Analyze and prepare results for download."""
                if file is None:
                    return "⚠️ Please upload a file first", "{}", gr.update(visible=False)

                text_result, json_result = self.analyze_media(
                    file, color, obj, cap, frames
                )

                # Save the JSON to a temp file for download
                temp_file = tempfile.NamedTemporaryFile(
                    mode='w',
                    suffix='.json',
                    delete=False,
                    encoding='utf-8'
                )
                temp_file.write(json_result)
                temp_file.close()

                return (
                    text_result,
                    json_result,
                    gr.update(visible=True, value=temp_file.name)
                )

            analyze_btn.click(
                fn=analyze_and_prepare_download,
                inputs=[
                    file_input,
                    color_checkbox,
                    object_checkbox,
                    caption_checkbox,
                    frames_slider
                ],
                outputs=[output_text, output_json, download_btn]
            )

        return demo


def main():
    """Main entry point for the Gradio app."""
    app = DeepVisionGradioApp()
    demo = app.create_interface()

    # Launch the app. Hugging Face Spaces expects the server to listen on
    # 0.0.0.0:7860; binding to 127.0.0.1 with an auto-selected port only
    # works for local testing.
    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
        share=False,      # Set to True for a temporary public link
        show_error=True
    )


if __name__ == "__main__":
    main()
core/__init__.py ADDED
@@ -0,0 +1,21 @@
"""
DeepVision Prompt Builder - Core Engine

This module contains the core functionality for analyzing images and videos,
managing plugins, and generating structured JSON prompts.
"""

__version__ = "0.1.0"
__author__ = "AI Dev Collective v9.0"

from core.engine import AnalysisEngine
from core.image_processor import ImageProcessor
from core.video_processor import VideoProcessor
from core.result_manager import ResultManager

__all__ = [
    "AnalysisEngine",
    "ImageProcessor",
    "VideoProcessor",
    "ResultManager",
]
core/config.py ADDED
@@ -0,0 +1,131 @@
1
+ """
2
+ Configuration module for DeepVision Core Engine.
3
+
4
+ Manages all configuration settings including paths, model settings,
5
+ processing parameters, and resource limits.
6
+ """
7
+
8
+ import os
9
+ from pathlib import Path
10
+ from typing import Dict, List, Optional
11
+ from pydantic_settings import BaseSettings
12
+ from pydantic import Field
13
+
14
+
15
+ class CoreConfig(BaseSettings):
16
+ """Core configuration settings."""
17
+
18
+ # Application
19
+ APP_NAME: str = "DeepVision Prompt Builder"
20
+ APP_VERSION: str = "0.1.0"
21
+ DEBUG: bool = Field(default=False, env="DEBUG")
22
+
23
+ # Paths
24
+ BASE_DIR: Path = Path(__file__).parent.parent
25
+ UPLOAD_DIR: Path = Field(default=Path("/var/uploads"), env="UPLOAD_DIR")
26
+ CACHE_DIR: Path = Field(default=Path("/var/cache"), env="CACHE_DIR")
27
+ MODEL_DIR: Path = Field(default=Path("models"), env="MODEL_DIR")
28
+
29
+ # File Processing
30
+ MAX_IMAGE_SIZE: int = Field(default=50 * 1024 * 1024, env="MAX_IMAGE_SIZE") # 50MB
31
+ MAX_VIDEO_SIZE: int = Field(default=200 * 1024 * 1024, env="MAX_VIDEO_SIZE") # 200MB
32
+ ALLOWED_IMAGE_FORMATS: List[str] = [".jpg", ".jpeg", ".png", ".gif", ".webp"]
33
+ ALLOWED_VIDEO_FORMATS: List[str] = [".mp4", ".mov", ".avi"]
34
+
35
+ # Image Processing
36
+ IMAGE_MAX_DIMENSION: int = 2048 # Max width or height
37
+ IMAGE_QUALITY: int = 85 # JPEG quality
38
+ DEFAULT_IMAGE_SIZE: tuple = (512, 512) # Default resize
39
+
40
+ # Video Processing
41
+ VIDEO_FPS_EXTRACTION: int = 1 # Extract 1 frame per second
42
+ MAX_FRAMES_PER_VIDEO: int = 100 # Maximum frames to extract
43
+
44
+ # Model Settings
45
+ DEVICE: str = Field(default="cpu", env="DEVICE") # cpu or cuda
46
+ MODEL_BATCH_SIZE: int = 4
47
+    MODEL_CACHE_SIZE: int = 3  # Max models in memory
+
+    # Performance
+    MAX_WORKERS: int = Field(default=4, env="MAX_WORKERS")
+    ENABLE_CACHING: bool = True
+    CACHE_TTL: int = 3600  # Cache time-to-live in seconds
+
+    # Output
+    OUTPUT_FORMAT: str = "json"  # "json" or "dict"
+    PRETTY_JSON: bool = True
+    INCLUDE_METADATA: bool = True
+
+    class Config:
+        env_file = ".env"
+        env_file_encoding = "utf-8"
+        case_sensitive = True
+
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+        # Create directories if they don't exist
+        self.UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
+        self.CACHE_DIR.mkdir(parents=True, exist_ok=True)
+        self.MODEL_DIR.mkdir(parents=True, exist_ok=True)
+
+
+# Global config instance
+config = CoreConfig()
+
+
+# Model configurations
+MODEL_CONFIGS: Dict[str, Dict] = {
+    "clip": {
+        "name": "openai/clip-vit-base-patch32",
+        "task": "feature_extraction",
+        "device": config.DEVICE,
+    },
+    "blip2": {
+        "name": "Salesforce/blip2-opt-2.7b",
+        "task": "image_captioning",
+        "device": config.DEVICE,
+    },
+    "sam": {
+        "name": "facebook/sam-vit-base",
+        "task": "segmentation",
+        "device": config.DEVICE,
+    },
+}
+
+
+# Plugin configurations
+PLUGIN_CONFIGS: Dict[str, Dict] = {
+    "object_detector": {
+        "enabled": True,
+        "model": "clip",
+        "confidence_threshold": 0.5,
+    },
+    "caption_generator": {
+        "enabled": True,
+        "model": "blip2",
+        "max_length": 50,
+    },
+    "color_analyzer": {
+        "enabled": True,
+        "num_colors": 5,
+    },
+    "text_extractor": {
+        "enabled": False,  # Requires OCR model
+        "model": "easyocr",
+    },
+    "emotion_reader": {
+        "enabled": False,  # Requires face detection model
+        "model": "deepface",
+    },
+}
+
+
+def get_plugin_config(plugin_name: str) -> Optional[Dict]:
+    """Get configuration for a specific plugin."""
+    return PLUGIN_CONFIGS.get(plugin_name)
+
+
+def is_plugin_enabled(plugin_name: str) -> bool:
+    """Check if a plugin is enabled."""
+    plugin_config = get_plugin_config(plugin_name)
+    return plugin_config.get("enabled", False) if plugin_config else False
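The two helpers above give the engine a uniform lookup path into `PLUGIN_CONFIGS`: unknown plugins and plugins without an `"enabled"` key both count as disabled. A minimal standalone sketch of that pattern (with an inlined stand-in table, not the real `core/config.py` module):

```python
from typing import Dict, Optional

# Stand-in for PLUGIN_CONFIGS; the real table lives in core/config.py
PLUGIN_CONFIGS: Dict[str, Dict] = {
    "color_analyzer": {"enabled": True, "num_colors": 5},
    "text_extractor": {"enabled": False, "model": "easyocr"},
}

def get_plugin_config(plugin_name: str) -> Optional[Dict]:
    """Return the config dict for a plugin, or None if unknown."""
    return PLUGIN_CONFIGS.get(plugin_name)

def is_plugin_enabled(plugin_name: str) -> bool:
    """Missing plugins and missing 'enabled' keys default to disabled."""
    plugin_config = get_plugin_config(plugin_name)
    return plugin_config.get("enabled", False) if plugin_config else False

print(is_plugin_enabled("color_analyzer"))  # True
print(is_plugin_enabled("text_extractor"))  # False
print(is_plugin_enabled("nonexistent"))     # False
```

Defaulting to `False` means a typo in a plugin name silently disables it rather than crashing, which is the safer failure mode at startup.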
core/engine.py ADDED
@@ -0,0 +1,471 @@
+"""
+Core Analysis Engine
+
+Main orchestration engine for DeepVision Prompt Builder.
+Manages image/video processing, plugin execution, and result generation.
+"""
+
+from datetime import datetime
+from pathlib import Path
+from typing import Dict, List, Any, Optional, Union
+from loguru import logger
+
+from core.config import config
+from core.image_processor import ImageProcessor
+from core.video_processor import VideoProcessor
+from core.result_manager import ResultManager
+from core.exceptions import DeepVisionError
+
+
+class AnalysisEngine:
+    """
+    Main analysis engine for processing images and videos.
+
+    Orchestrates the complete analysis pipeline:
+    1. File validation and preprocessing
+    2. Plugin execution
+    3. Result aggregation
+    4. JSON output generation
+    """
+
+    def __init__(self):
+        """Initialize AnalysisEngine."""
+        self.image_processor = ImageProcessor()
+        self.video_processor = VideoProcessor()
+        self.result_manager = ResultManager()
+        self.plugins: Dict[str, Any] = {}
+        self.plugin_order: List[str] = []
+
+        logger.info(f"AnalysisEngine initialized - {config.APP_NAME} v{config.APP_VERSION}")
+
+    def register_plugin(self, plugin_name: str, plugin_instance: Any) -> None:
+        """
+        Register a plugin for analysis.
+
+        Args:
+            plugin_name: Unique name for the plugin
+            plugin_instance: Instance of the plugin class
+        """
+        if plugin_name in self.plugins:
+            logger.warning(f"Plugin '{plugin_name}' already registered, replacing")
+
+        self.plugins[plugin_name] = plugin_instance
+
+        # Maintain execution order
+        if plugin_name not in self.plugin_order:
+            self.plugin_order.append(plugin_name)
+
+        logger.info(f"Registered plugin: {plugin_name}")
+
+    def unregister_plugin(self, plugin_name: str) -> None:
+        """
+        Unregister a plugin.
+
+        Args:
+            plugin_name: Name of plugin to remove
+        """
+        if plugin_name in self.plugins:
+            del self.plugins[plugin_name]
+
+        if plugin_name in self.plugin_order:
+            self.plugin_order.remove(plugin_name)
+
+        logger.info(f"Unregistered plugin: {plugin_name}")
+
+    def get_registered_plugins(self) -> List[str]:
+        """
+        Get list of registered plugins.
+
+        Returns:
+            List of plugin names
+        """
+        return list(self.plugins.keys())
+
+    def analyze_image(
+        self,
+        image_path: Union[str, Path],
+        plugins: Optional[List[str]] = None,
+        **kwargs
+    ) -> Dict[str, Any]:
+        """
+        Analyze a single image.
+
+        Args:
+            image_path: Path to image file
+            plugins: List of plugin names to use (None for all)
+            **kwargs: Additional arguments for processing
+
+        Returns:
+            Analysis results dictionary
+        """
+        start_time = datetime.now()
+        image_path = Path(image_path)
+
+        logger.info(f"Starting image analysis: {image_path.name}")
+
+        try:
+            # Clear previous results
+            self.result_manager.clear()
+
+            # Process image
+            image = self.image_processor.process(
+                image_path,
+                resize=kwargs.get("resize", True),
+                normalize=kwargs.get("normalize", False)
+            )
+
+            # Get image info
+            image_info = self.image_processor.get_image_info(image_path)
+
+            # Set file metadata
+            self.result_manager.set_file_info(
+                filename=image_info["filename"],
+                file_type="image",
+                file_size=image_info["file_size"],
+                width=image_info["width"],
+                height=image_info["height"],
+                format=image_info["format"],
+                hash=image_info["hash"],
+            )
+
+            # Execute plugins
+            plugins_used = self._execute_plugins(
+                image,
+                image_path,
+                plugins,
+                media_type="image"
+            )
+
+            # Set processing metadata
+            end_time = datetime.now()
+            self.result_manager.set_processing_info(
+                start_time=start_time,
+                end_time=end_time,
+                plugins_used=plugins_used
+            )
+
+            # Get final results
+            results = self.result_manager.to_dict(
+                include_metadata=config.INCLUDE_METADATA
+            )
+
+            logger.info(f"Image analysis completed: {image_path.name} "
+                        f"({len(plugins_used)} plugins)")
+
+            return results
+
+        except DeepVisionError:
+            # Re-raise domain errors as-is instead of double-wrapping them
+            raise
+        except Exception as e:
+            logger.error(f"Image analysis failed: {e}")
+            raise DeepVisionError(
+                f"Analysis failed for {image_path.name}: {str(e)}",
+                {"path": str(image_path), "error": str(e)}
+            ) from e
+
+    def analyze_video(
+        self,
+        video_path: Union[str, Path],
+        plugins: Optional[List[str]] = None,
+        extract_method: str = "keyframes",
+        num_frames: int = 5,
+        **kwargs
+    ) -> Dict[str, Any]:
+        """
+        Analyze a video by extracting and analyzing frames.
+
+        Args:
+            video_path: Path to video file
+            plugins: List of plugin names to use
+            extract_method: Frame extraction method ("fps" or "keyframes")
+            num_frames: Number of frames to extract
+            **kwargs: Additional arguments
+
+        Returns:
+            Analysis results dictionary
+        """
+        start_time = datetime.now()
+        video_path = Path(video_path)
+
+        logger.info(f"Starting video analysis: {video_path.name}")
+
+        try:
+            # Clear previous results
+            self.result_manager.clear()
+
+            # Get video info
+            video_info = self.video_processor.get_video_info(video_path)
+
+            # Set file metadata
+            self.result_manager.set_file_info(
+                filename=video_info["filename"],
+                file_type="video",
+                file_size=video_info["file_size"],
+                width=video_info["width"],
+                height=video_info["height"],
+                fps=video_info["fps"],
+                duration=video_info["duration"],
+                frame_count=video_info["frame_count"],
+            )
+
+            # Extract frames
+            if extract_method == "keyframes":
+                frame_paths = self.video_processor.extract_key_frames(
+                    video_path,
+                    num_frames=num_frames
+                )
+            else:
+                frame_paths = self.video_processor.extract_frames(
+                    video_path,
+                    max_frames=num_frames,
+                    **kwargs
+                )
+
+            logger.info(f"Extracted {len(frame_paths)} frames from video")
+
+            # Analyze each frame
+            frame_results = []
+            plugins_used: List[str] = []  # stays empty if no frames were extracted
+            for idx, frame_path in enumerate(frame_paths):
+                logger.info(f"Analyzing frame {idx + 1}/{len(frame_paths)}")
+
+                # Process frame
+                image = self.image_processor.process(frame_path, resize=True)
+
+                # Execute plugins on frame
+                plugins_used = self._execute_plugins(
+                    image,
+                    frame_path,
+                    plugins,
+                    media_type="video_frame"
+                )
+
+                # Get frame results
+                frame_result = {
+                    "frame_index": idx,
+                    "frame_path": str(frame_path.name),
+                    "results": dict(self.result_manager.results)
+                }
+                frame_results.append(frame_result)
+
+                # Clear for next frame
+                self.result_manager.results.clear()
+
+            # Aggregate frame results
+            aggregated = self._aggregate_video_results(frame_results)
+
+            # Set aggregated results
+            self.result_manager.results = aggregated
+
+            # Set processing metadata
+            end_time = datetime.now()
+            self.result_manager.set_processing_info(
+                start_time=start_time,
+                end_time=end_time,
+                plugins_used=plugins_used
+            )
+
+            # Add video-specific metadata
+            self.result_manager.add_metadata({
+                "frames_analyzed": len(frame_paths),
+                "extraction_method": extract_method,
+            })
+
+            # Get final results
+            results = self.result_manager.to_dict(
+                include_metadata=config.INCLUDE_METADATA
+            )
+
+            logger.info(f"Video analysis completed: {video_path.name} "
+                        f"({len(frame_paths)} frames, {len(plugins_used)} plugins)")
+
+            return results
+
+        except DeepVisionError:
+            # Re-raise domain errors as-is instead of double-wrapping them
+            raise
+        except Exception as e:
+            logger.error(f"Video analysis failed: {e}")
+            raise DeepVisionError(
+                f"Analysis failed for {video_path.name}: {str(e)}",
+                {"path": str(video_path), "error": str(e)}
+            ) from e
+
+    def _execute_plugins(
+        self,
+        media,
+        media_path: Path,
+        plugin_names: Optional[List[str]] = None,
+        media_type: str = "image"
+    ) -> List[str]:
+        """
+        Execute registered plugins on media.
+
+        Args:
+            media: Processed media (image or frame)
+            media_path: Path to media file
+            plugin_names: List of plugins to execute (None for all)
+            media_type: Type of media being processed
+
+        Returns:
+            List of executed plugin names
+        """
+        # Determine which plugins to execute
+        if plugin_names is None:
+            plugins_to_run = self.plugin_order
+        else:
+            plugins_to_run = [
+                p for p in self.plugin_order if p in plugin_names
+            ]
+
+        executed = []
+
+        for plugin_name in plugins_to_run:
+            if plugin_name not in self.plugins:
+                logger.warning(f"Plugin '{plugin_name}' not found, skipping")
+                continue
+
+            try:
+                logger.debug(f"Executing plugin: {plugin_name}")
+
+                plugin = self.plugins[plugin_name]
+
+                # Execute plugin
+                result = plugin.analyze(media, media_path)
+
+                # Add result
+                self.result_manager.add_result(plugin_name, result)
+
+                executed.append(plugin_name)
+
+                logger.debug(f"Plugin '{plugin_name}' completed successfully")
+
+            except Exception as e:
+                logger.error(f"Plugin '{plugin_name}' failed: {e}")
+
+                # Add error to results
+                self.result_manager.add_result(
+                    plugin_name,
+                    {
+                        "error": str(e),
+                        "status": "failed"
+                    }
+                )
+
+        return executed
+
+    def _aggregate_video_results(
+        self,
+        frame_results: List[Dict[str, Any]]
+    ) -> Dict[str, Any]:
+        """
+        Aggregate results from multiple video frames.
+
+        Args:
+            frame_results: List of results from each frame
+
+        Returns:
+            Aggregated results dictionary
+        """
+        aggregated = {
+            "frames": frame_results,
+            "summary": {}
+        }
+
+        # For each plugin, aggregate results across frames
+        if not frame_results:
+            return aggregated
+
+        # Get plugin names from first frame
+        first_frame = frame_results[0]["results"]
+
+        for plugin_name in first_frame.keys():
+            plugin_summary = self._aggregate_plugin_results(
+                plugin_name,
+                [f["results"].get(plugin_name, {}) for f in frame_results]
+            )
+            aggregated["summary"][plugin_name] = plugin_summary
+
+        return aggregated
+
+    def _aggregate_plugin_results(
+        self,
+        plugin_name: str,
+        results: List[Dict[str, Any]]
+    ) -> Dict[str, Any]:
+        """
+        Aggregate results for a specific plugin across frames.
+
+        Args:
+            plugin_name: Name of the plugin
+            results: List of results from each frame
+
+        Returns:
+            Aggregated result for the plugin
+        """
+        # Default aggregation: collect all unique values
+        aggregated = {
+            "frames_processed": len(results),
+        }
+
+        # Plugin-specific aggregation logic
+        if plugin_name == "object_detector":
+            all_objects = []
+            for result in results:
+                all_objects.extend(result.get("objects", []))
+
+            # Count object occurrences
+            object_counts = {}
+            for obj in all_objects:
+                name = obj["name"]
+                object_counts[name] = object_counts.get(name, 0) + 1
+
+            aggregated["total_objects"] = len(all_objects)
+            aggregated["unique_objects"] = len(object_counts)
+            aggregated["object_frequency"] = object_counts
+
+        elif plugin_name == "caption_generator":
+            captions = [r.get("caption", "") for r in results if r.get("caption")]
+            aggregated["captions"] = captions
+            aggregated["caption_count"] = len(captions)
+
+        elif plugin_name == "color_analyzer":
+            all_colors = []
+            for result in results:
+                all_colors.extend(result.get("dominant_colors", []))
+
+            # Get most frequent colors
+            color_counts = {}
+            for color in all_colors:
+                name = color["name"]
+                color_counts[name] = color_counts.get(name, 0) + 1
+
+            aggregated["color_frequency"] = color_counts
+
+        return aggregated
+
+    def analyze(
+        self,
+        file_path: Union[str, Path],
+        **kwargs
+    ) -> Dict[str, Any]:
+        """
+        Automatically detect file type and analyze.
+
+        Args:
+            file_path: Path to image or video file
+            **kwargs: Additional arguments
+
+        Returns:
+            Analysis results
+        """
+        file_path = Path(file_path)
+
+        # Detect file type
+        ext = file_path.suffix.lower()
+
+        if ext in config.ALLOWED_IMAGE_FORMATS:
+            return self.analyze_image(file_path, **kwargs)
+        elif ext in config.ALLOWED_VIDEO_FORMATS:
+            return self.analyze_video(file_path, **kwargs)
+        else:
+            raise ValueError(f"Unsupported file format: {ext}")
+
+    def __repr__(self) -> str:
+        """Object representation."""
+        return (f"AnalysisEngine(plugins={len(self.plugins)}, "
+                f"registered={self.get_registered_plugins()})")
core/exceptions.py ADDED
@@ -0,0 +1,100 @@
+"""
+Custom exceptions for DeepVision Core Engine.
+
+Defines all custom exceptions used throughout the core engine
+for better error handling and debugging.
+"""
+
+from typing import Optional
+
+
+class DeepVisionError(Exception):
+    """Base exception for all DeepVision errors."""
+
+    def __init__(self, message: str, details: Optional[dict] = None):
+        self.message = message
+        self.details = details or {}
+        super().__init__(self.message)
+
+
+class FileProcessingError(DeepVisionError):
+    """Raised when file processing fails."""
+    pass
+
+
+class InvalidFileError(FileProcessingError):
+    """Raised when file is invalid or corrupted."""
+    pass
+
+
+class FileSizeError(FileProcessingError):
+    """Raised when file size exceeds limits."""
+    pass
+
+
+class UnsupportedFormatError(FileProcessingError):
+    """Raised when file format is not supported."""
+    pass
+
+
+class ImageProcessingError(DeepVisionError):
+    """Raised when image processing fails."""
+    pass
+
+
+class VideoProcessingError(DeepVisionError):
+    """Raised when video processing fails."""
+    pass
+
+
+class FrameExtractionError(VideoProcessingError):
+    """Raised when frame extraction from video fails."""
+    pass
+
+
+class ModelError(DeepVisionError):
+    """Raised when model operations fail."""
+    pass
+
+
+class ModelLoadError(ModelError):
+    """Raised when model loading fails."""
+    pass
+
+
+class ModelInferenceError(ModelError):
+    """Raised when model inference fails."""
+    pass
+
+
+class PluginError(DeepVisionError):
+    """Raised when plugin operations fail."""
+    pass
+
+
+class PluginLoadError(PluginError):
+    """Raised when plugin loading fails."""
+    pass
+
+
+class PluginExecutionError(PluginError):
+    """Raised when plugin execution fails."""
+    pass
+
+
+class ValidationError(DeepVisionError):
+    """Raised when validation fails."""
+    pass
+
+
+class ConfigurationError(DeepVisionError):
+    """Raised when configuration is invalid."""
+    pass
+
+
+class CacheError(DeepVisionError):
+    """Raised when cache operations fail."""
+    pass
+
+
+class ResultError(DeepVisionError):
+    """Raised when result processing fails."""
+    pass
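The payoff of a single base class is that callers can catch the whole hierarchy with one handler while `details` survives for diagnostics. A small self-contained sketch reproducing the relevant slice of the hierarchy:

```python
from typing import Optional

class DeepVisionError(Exception):
    """Base exception (mirrors core/exceptions.py)."""
    def __init__(self, message: str, details: Optional[dict] = None):
        self.message = message
        self.details = details or {}
        super().__init__(self.message)

class ModelError(DeepVisionError):
    pass

class ModelLoadError(ModelError):
    pass

try:
    raise ModelLoadError("checkpoint missing", {"model": "clip"})
except DeepVisionError as e:
    # One handler catches every subclass; details dict is still attached
    caught = (type(e).__name__, e.details["model"])

print(caught)  # ('ModelLoadError', 'clip')
```

This is why the engine's `except Exception` wrapper should let `DeepVisionError` subclasses pass through untouched: re-wrapping would flatten `ModelLoadError` into the base type and discard its `details`.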
core/image_processor.py ADDED
@@ -0,0 +1,279 @@
+"""
+Image Processor Module
+
+Handles all image processing operations including loading, validation,
+resizing, normalization, and format conversion.
+"""
+
+import hashlib
+import magic
+from pathlib import Path
+from typing import Tuple, Optional, Union
+import numpy as np
+from PIL import Image, ImageOps
+from loguru import logger
+
+from core.config import config
+from core.exceptions import (
+    ImageProcessingError,
+    FileProcessingError,
+    InvalidFileError,
+    FileSizeError,
+    UnsupportedFormatError,
+)
+
+
+class ImageProcessor:
+    """
+    Process images for analysis.
+
+    Handles validation, resizing, normalization, and format conversion
+    for images before they are passed to AI models.
+    """
+
+    def __init__(self):
+        """Initialize ImageProcessor."""
+        self.max_size = config.MAX_IMAGE_SIZE
+        self.max_dimension = config.IMAGE_MAX_DIMENSION
+        self.allowed_formats = config.ALLOWED_IMAGE_FORMATS
+        logger.info("ImageProcessor initialized")
+
+    def load_image(self, image_path: Union[str, Path]) -> Image.Image:
+        """
+        Load image from file path.
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            PIL Image object
+
+        Raises:
+            InvalidFileError: If image cannot be loaded
+        """
+        try:
+            image_path = Path(image_path)
+            if not image_path.exists():
+                raise InvalidFileError(
+                    f"Image file not found: {image_path}",
+                    {"path": str(image_path)}
+                )
+
+            # Validate file
+            self.validate_image(image_path)
+
+            # Load image
+            image = Image.open(image_path)
+
+            # Convert to RGB if necessary
+            if image.mode != "RGB":
+                image = image.convert("RGB")
+
+            logger.info(f"Loaded image: {image_path.name} ({image.size})")
+            return image
+
+        except FileProcessingError:
+            # Let specific validation errors (size/format) propagate as-is
+            raise
+        except Exception as e:
+            logger.error(f"Failed to load image: {e}")
+            raise InvalidFileError(
+                f"Cannot load image: {str(e)}",
+                {"path": str(image_path), "error": str(e)}
+            ) from e
+
+    def validate_image(self, image_path: Path) -> bool:
+        """
+        Validate image file.
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            True if valid
+
+        Raises:
+            FileSizeError: If file too large
+            UnsupportedFormatError: If format not supported
+            InvalidFileError: If file is corrupted
+        """
+        # Check file size
+        file_size = image_path.stat().st_size
+        if file_size > self.max_size:
+            raise FileSizeError(
+                f"Image too large: {file_size / 1024 / 1024:.1f}MB",
+                {"max_size": self.max_size, "actual_size": file_size}
+            )
+
+        # Check file extension
+        ext = image_path.suffix.lower()
+        if ext not in self.allowed_formats:
+            raise UnsupportedFormatError(
+                f"Unsupported image format: {ext}",
+                {"allowed": self.allowed_formats, "received": ext}
+            )
+
+        # Check MIME type using magic bytes
+        try:
+            mime = magic.from_file(str(image_path), mime=True)
+            if not mime.startswith("image/"):
+                raise InvalidFileError(
+                    f"File is not a valid image: {mime}",
+                    {"mime_type": mime}
+                )
+        except Exception as e:
+            logger.warning(f"Could not verify MIME type: {e}")
+
+        return True
+
+    def resize_image(
+        self,
+        image: Image.Image,
+        max_size: Optional[Tuple[int, int]] = None,
+        maintain_aspect_ratio: bool = True
+    ) -> Image.Image:
+        """
+        Resize image to specified dimensions.
+
+        Args:
+            image: PIL Image object
+            max_size: Maximum (width, height) tuple
+            maintain_aspect_ratio: Whether to maintain aspect ratio
+
+        Returns:
+            Resized PIL Image
+        """
+        if max_size is None:
+            max_size = config.DEFAULT_IMAGE_SIZE
+
+        original_size = image.size
+
+        if maintain_aspect_ratio:
+            # Calculate new size maintaining aspect ratio
+            image.thumbnail(max_size, Image.Resampling.LANCZOS)
+        else:
+            # Resize to exact dimensions
+            image = image.resize(max_size, Image.Resampling.LANCZOS)
+
+        logger.debug(f"Resized image: {original_size} -> {image.size}")
+        return image
+
+    def normalize_image(self, image: Image.Image) -> np.ndarray:
+        """
+        Normalize image to numpy array with values [0, 1].
+
+        Args:
+            image: PIL Image object
+
+        Returns:
+            Normalized numpy array (H, W, C)
+        """
+        # Convert to numpy array
+        img_array = np.array(image, dtype=np.float32)
+
+        # Normalize to [0, 1]
+        img_array = img_array / 255.0
+
+        logger.debug(f"Normalized image to shape: {img_array.shape}")
+        return img_array
+
+    def apply_exif_orientation(self, image: Image.Image) -> Image.Image:
+        """
+        Apply EXIF orientation to image.
+
+        Args:
+            image: PIL Image object
+
+        Returns:
+            Oriented PIL Image
+        """
+        try:
+            image = ImageOps.exif_transpose(image)
+            logger.debug("Applied EXIF orientation")
+        except Exception as e:
+            logger.warning(f"Could not apply EXIF orientation: {e}")
+
+        return image
+
+    def get_image_hash(self, image_path: Path) -> str:
+        """
+        Generate SHA256 hash of image file.
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            Hex string of hash
+        """
+        sha256_hash = hashlib.sha256()
+
+        with open(image_path, "rb") as f:
+            # Read in chunks to handle large files
+            for chunk in iter(lambda: f.read(8192), b""):
+                sha256_hash.update(chunk)
+
+        return sha256_hash.hexdigest()
+
+    def process(
+        self,
+        image_path: Union[str, Path],
+        resize: bool = True,
+        normalize: bool = False,
+        apply_orientation: bool = True
+    ) -> Union[Image.Image, np.ndarray]:
+        """
+        Complete image processing pipeline.
+
+        Args:
+            image_path: Path to image file
+            resize: Whether to resize image
+            normalize: Whether to normalize to numpy array
+            apply_orientation: Whether to apply EXIF orientation
+
+        Returns:
+            Processed image (PIL Image or numpy array)
+        """
+        try:
+            # Load image
+            image = self.load_image(image_path)
+
+            # Apply EXIF orientation
+            if apply_orientation:
+                image = self.apply_exif_orientation(image)
+
+            # Resize if needed
+            if resize:
+                image = self.resize_image(image)
+
+            # Normalize if needed
+            if normalize:
+                return self.normalize_image(image)
+
+            return image
+
+        except Exception as e:
+            logger.error(f"Image processing failed: {e}")
+            raise ImageProcessingError(
+                f"Failed to process image: {str(e)}",
+                {"path": str(image_path), "error": str(e)}
+            ) from e
+
+    def get_image_info(self, image_path: Union[str, Path]) -> dict:
+        """
+        Get information about an image.
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            Dictionary with image information
+        """
+        image_path = Path(image_path)
+
+        # Probe the on-disk format before load_image converts to RGB,
+        # since Image.convert() returns a copy whose .format is None
+        with Image.open(image_path) as probe:
+            image_format = probe.format
+
+        image = self.load_image(image_path)
+
+        return {
+            "filename": image_path.name,
+            "format": image_format,
+            "mode": image.mode,
+            "size": image.size,
+            "width": image.size[0],
+            "height": image.size[1],
+            "file_size": image_path.stat().st_size,
+            "hash": self.get_image_hash(image_path),
+        }
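`get_image_hash` streams the file through SHA-256 in 8 KiB chunks so arbitrarily large images never have to fit in memory. The same pattern works for any file, shown here standalone on a throwaway temp file rather than a real image:

```python
import hashlib
import tempfile
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 8192) -> str:
    """Hash a file in fixed-size chunks so large files never load fully into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a throwaway file; image bytes are hashed exactly the same way
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(b"hello deepvision")
    tmp_path = Path(tmp.name)

print(file_sha256(tmp_path))  # 64-char hex digest
```

`iter(callable, sentinel)` keeps calling `f.read(chunk_size)` until it returns the empty-bytes sentinel, which is the idiomatic way to chunk a binary stream without a manual while loop.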
core/logging_config.py ADDED
@@ -0,0 +1,71 @@
+"""
+Logging Configuration
+
+Setup logging for DeepVision using loguru.
+"""
+
+import sys
+from pathlib import Path
+from typing import Optional
+
+from loguru import logger
+
+from core.config import config
+
+
+def setup_logging(
+    log_level: str = "INFO",
+    log_file: Optional[Path] = None,
+    rotation: str = "10 MB",
+    retention: str = "1 week"
+) -> None:
+    """
+    Setup logging configuration.
+
+    Args:
+        log_level: Logging level (DEBUG, INFO, WARNING, ERROR)
+        log_file: Path to log file (None for no file logging)
+        rotation: Log rotation size/time
+        retention: How long to keep old logs
+    """
+    # Remove default handler
+    logger.remove()
+
+    # Console handler
+    logger.add(
+        sys.stderr,
+        format="<green>{time:YYYY-MM-DD HH:mm:ss}</green> | "
+               "<level>{level: <8}</level> | "
+               "<cyan>{name}</cyan>:<cyan>{function}</cyan> - "
+               "<level>{message}</level>",
+        level=log_level,
+        colorize=True,
+    )
+
+    # File handler (if specified)
+    if log_file:
+        log_file = Path(log_file)
+        log_file.parent.mkdir(parents=True, exist_ok=True)
+
+        logger.add(
+            log_file,
+            format="{time:YYYY-MM-DD HH:mm:ss} | {level: <8} | "
+                   "{name}:{function} - {message}",
+            level=log_level,
+            rotation=rotation,
+            retention=retention,
+            compression="zip",
+        )
+
+    logger.info(f"Logging configured - Level: {log_level}")
+
+
+# Auto-configure on import; DEBUG only changes the verbosity
+setup_logging(
+    log_level="DEBUG" if config.DEBUG else "INFO",
+    log_file=config.BASE_DIR / "logs" / "deepvision.log",
+)
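`setup_logging` leans on loguru's single-call handler API (`logger.remove()` then `logger.add(...)` with rotation and retention baked in). For environments without loguru, a rough stdlib-`logging` equivalent of the console handler, with hypothetical names, would look like:

```python
import io
import logging

def setup_std_logging(log_level: str = "INFO", stream=None) -> logging.Logger:
    """Configure a named logger with a format comparable to the loguru console handler."""
    log = logging.getLogger("deepvision")
    log.setLevel(log_level)
    log.handlers.clear()  # mirrors loguru's logger.remove()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s | %(levelname)-8s | %(name)s:%(funcName)s - %(message)s"
    ))
    log.addHandler(handler)
    return log

buf = io.StringIO()
log = setup_std_logging("DEBUG", stream=buf)
log.info("Logging configured")
print("Logging configured" in buf.getvalue())  # True
```

What the stdlib version does not give you for free is loguru's size-based rotation, retention window, and zip compression; those would need `logging.handlers.RotatingFileHandler` plus custom cleanup, which is precisely why the module uses loguru.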
core/result_manager.py ADDED
@@ -0,0 +1,347 @@
+"""
+Result Manager Module
+
+Manages and aggregates results from multiple plugins,
+generates final JSON output with metadata.
+"""
+
+import json
+from datetime import datetime
+from pathlib import Path
+from typing import Dict, List, Any, Optional, Union
+from loguru import logger
+
+from core.config import config
+from core.exceptions import ResultError
+
+
+class ResultManager:
+    """
+    Manage and aggregate analysis results.
+
+    Collects results from multiple plugins, merges them,
+    and generates structured JSON output.
+    """
+
+    def __init__(self):
+        """Initialize ResultManager."""
+        self.results: Dict[str, Any] = {}
+        self.metadata: Dict[str, Any] = {}
+        logger.info("ResultManager initialized")
+
+    def add_result(
+        self,
+        plugin_name: str,
+        result: Dict[str, Any],
+        merge: bool = False
+    ) -> None:
+        """
+        Add result from a plugin.
+
+        Args:
+            plugin_name: Name of the plugin
+            result: Result dictionary from plugin
+            merge: Whether to merge with existing results
+        """
+        if merge and plugin_name in self.results:
+            # Merge with existing results
+            self.results[plugin_name] = self._merge_dicts(
+                self.results[plugin_name],
+                result
+            )
+        else:
+            # Replace existing results
+            self.results[plugin_name] = result
+
+        logger.debug(f"Added result from plugin: {plugin_name}")
+
+    def add_metadata(self, metadata: Dict[str, Any]) -> None:
+        """
+        Add metadata to results.
+
+        Args:
+            metadata: Metadata dictionary
+        """
+        self.metadata.update(metadata)
+        logger.debug(f"Added metadata: {list(metadata.keys())}")
+
+    def set_file_info(
+        self,
+        filename: str,
+        file_type: str,
+        file_size: int,
+        **kwargs
+    ) -> None:
+        """
+        Set file information in metadata.
+
+        Args:
+            filename: Name of the file
+            file_type: Type of file (image/video)
+            file_size: Size of file in bytes
+            **kwargs: Additional file information
+        """
+        self.metadata["file"] = {
+            "filename": filename,
+            "type": file_type,
+            "size": file_size,
+            "size_mb": round(file_size / 1024 / 1024, 2),
+            **kwargs
+        }
+
+    def set_processing_info(
+        self,
+        start_time: datetime,
+        end_time: datetime,
+        plugins_used: List[str]
+    ) -> None:
+        """
+        Set processing information in metadata.
+
+        Args:
+            start_time: Processing start time
+            end_time: Processing end time
+            plugins_used: List of plugin names used
+        """
+        duration = (end_time - start_time).total_seconds()
+
+        self.metadata["processing"] = {
+            "start_time": start_time.isoformat(),
+            "end_time": end_time.isoformat(),
+            "duration_seconds": round(duration, 3),
+            "plugins_used": plugins_used,
+            "plugin_count": len(plugins_used),
+        }
+
+    def _merge_dicts(self, dict1: Dict, dict2: Dict) -> Dict:
+        """
+        Deep merge two dictionaries.
+
+        Args:
+            dict1: First dictionary
+            dict2: Second dictionary
+
+        Returns:
+            Merged dictionary
+        """
+        result = dict1.copy()
+
+        for key, value in dict2.items():
+            if key in result and isinstance(result[key], dict) and isinstance(value, dict):
+                result[key] = self._merge_dicts(result[key], value)
+            elif key in result and isinstance(result[key], list) and isinstance(value, list):
+                # Concatenate into a new list; extend() would mutate dict1's
+                # nested list through the shallow copy above
+                result[key] = result[key] + value
+            else:
+                result[key] = value
+
+        return result
+
+    def merge_results(self, results_list: List[Dict[str, Any]]) -> Dict[str, Any]:
+        """
+        Merge multiple result dictionaries.
+
+        Args:
+            results_list: List of result dictionaries
+
+        Returns:
+            Merged dictionary
+        """
+        merged = {}
+
+        for result in results_list:
+            merged = self._merge_dicts(merged, result)
+
+        return merged
+
+    def get_result(self, plugin_name: Optional[str] = None) -> Union[Dict, Any]:
+        """
+        Get result from specific plugin or all results.
+
+        Args:
+            plugin_name: Name of plugin (None for all results)
+
+        Returns:
+            Result dictionary or specific plugin result
+        """
+        if plugin_name is None:
+            return self.results
+
+        return self.results.get(plugin_name)
+
+    def to_dict(self, include_metadata: bool = True) -> Dict[str, Any]:
+        """
+        Convert results to dictionary.
+
+        Args:
+            include_metadata: Whether to include metadata
+
+        Returns:
+            Complete results dictionary
+        """
+        output = {
+            "results": self.results,
+        }
+
+        if include_metadata and self.metadata:
+            output["metadata"] = self.metadata
+
+        # Add timestamp if not present
+        if "timestamp" not in output.get("metadata", {}):
+            if "metadata" not in output:
+                output["metadata"] = {}
+            output["metadata"]["timestamp"] = datetime.now().isoformat()
+
+        # Add version
+        output["metadata"]["version"] = config.APP_VERSION
+
+        return output
+
+    def to_json(
+        self,
+        include_metadata: bool = True,
+        pretty: Optional[bool] = None,
+        ensure_ascii: bool = False
+    ) -> str:
+        """
+        Convert results to JSON string.
+
+        Args:
+            include_metadata: Whether to include metadata
+            pretty: Whether to format JSON (None uses config)
+            ensure_ascii: Whether to escape non-ASCII characters
+
+        Returns:
+            JSON string
+        """
+        if pretty is None:
+            pretty = config.PRETTY_JSON
+
+        data = self.to_dict(include_metadata=include_metadata)
+
+        if pretty:
+            json_str = json.dumps(
+                data,
+                indent=2,
+                ensure_ascii=ensure_ascii,
+                default=str
+            )
+        else:
+            json_str = json.dumps(
+                data,
+                ensure_ascii=ensure_ascii,
+                default=str
+            )
+
+        return json_str
+
+    def save_json(
+        self,
+        output_path: Union[str, Path],
+        include_metadata: bool = True,
+        pretty: Optional[bool] = None
+    ) -> None:
+        """
+        Save results to JSON file.
+
+        Args:
+            output_path: Path to output file
248
+ include_metadata: Whether to include metadata
249
+ pretty: Whether to format JSON
250
+ """
251
+ try:
252
+ output_path = Path(output_path)
253
+ output_path.parent.mkdir(parents=True, exist_ok=True)
254
+
255
+ json_str = self.to_json(
256
+ include_metadata=include_metadata,
257
+ pretty=pretty
258
+ )
259
+
260
+ output_path.write_text(json_str, encoding="utf-8")
261
+
262
+ logger.info(f"Saved results to: {output_path}")
263
+
264
+ except Exception as e:
265
+ logger.error(f"Failed to save JSON: {e}")
266
+ raise ResultError(
267
+ f"Cannot save results to file: {str(e)}",
268
+ {"path": str(output_path), "error": str(e)}
269
+ )
270
+
271
+ def generate_prompt(self) -> str:
272
+ """
273
+ Generate a text prompt from results.
274
+
275
+ Returns:
276
+ Generated prompt string
277
+ """
278
+ prompt_parts = []
279
+
280
+ # Add captions
281
+ if "caption_generator" in self.results:
282
+ caption = self.results["caption_generator"].get("caption", "")
283
+ if caption:
284
+ prompt_parts.append(caption)
285
+
286
+ # Add objects
287
+ if "object_detector" in self.results:
288
+ objects = self.results["object_detector"].get("objects", [])
289
+ if objects:
290
+ object_names = [obj["name"] for obj in objects[:5]]
291
+ prompt_parts.append(f"showing {', '.join(object_names)}")
292
+
293
+ # Add colors
294
+ if "color_analyzer" in self.results:
295
+ colors = self.results["color_analyzer"].get("dominant_colors", [])
296
+ if colors:
297
+ color_names = [c["name"] for c in colors[:3]]
298
+ prompt_parts.append(f"with {', '.join(color_names)} colors")
299
+
300
+ # Add text
301
+ if "text_extractor" in self.results:
302
+ text = self.results["text_extractor"].get("text", "")
303
+ if text:
304
+ prompt_parts.append(f'containing text "{text[:50]}"')
305
+
306
+ prompt = ", ".join(prompt_parts)
307
+
308
+ return prompt.capitalize() if prompt else "No description available"
309
+
310
+ def get_summary(self) -> Dict[str, Any]:
311
+ """
312
+ Get summary of results.
313
+
314
+ Returns:
315
+ Summary dictionary
316
+ """
317
+ summary = {
318
+ "total_plugins": len(self.results),
319
+ "plugins": list(self.results.keys()),
320
+ }
321
+
322
+ # Add plugin-specific summaries
323
+ for plugin_name, result in self.results.items():
324
+ if plugin_name == "object_detector":
325
+ summary["object_count"] = len(result.get("objects", []))
326
+ elif plugin_name == "caption_generator":
327
+ summary["has_caption"] = bool(result.get("caption"))
328
+ elif plugin_name == "color_analyzer":
329
+ summary["color_count"] = len(result.get("dominant_colors", []))
330
+ elif plugin_name == "text_extractor":
331
+ summary["has_text"] = bool(result.get("text"))
332
+
333
+ return summary
334
+
335
+ def clear(self) -> None:
336
+ """Clear all results and metadata."""
337
+ self.results.clear()
338
+ self.metadata.clear()
339
+ logger.debug("Cleared all results")
340
+
341
+ def __str__(self) -> str:
342
+ """String representation."""
343
+ return self.to_json(pretty=True)
344
+
345
+ def __repr__(self) -> str:
346
+ """Object representation."""
347
+ return f"ResultManager(plugins={len(self.results)}, metadata={len(self.metadata)})"
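A standalone sketch of the deep-merge semantics `_merge_dicts` implements (names here are illustrative, not repository code): nested dicts merge recursively, lists concatenate, and scalar values from the second dict win — without mutating the inputs.

```python
def deep_merge(dict1, dict2):
    # Shallow-copy the first dict, then merge key by key.
    result = dict1.copy()
    for key, value in dict2.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)  # recurse into nested dicts
        elif key in result and isinstance(result[key], list) and isinstance(value, list):
            result[key] = result[key] + value  # new list, so dict1's list is untouched
        else:
            result[key] = value  # dict2's scalar wins

    return result

a = {"objects": ["cat"], "meta": {"fps": 1}}
b = {"objects": ["dog"], "meta": {"frames": 5}}
print(deep_merge(a, b))
# {'objects': ['cat', 'dog'], 'meta': {'fps': 1, 'frames': 5}}
```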
core/video_processor.py ADDED
@@ -0,0 +1,333 @@
+ """
+ Video Processor Module
+
+ Handles all video processing operations including frame extraction,
+ validation, and video metadata extraction.
+ """
+
+ import subprocess
+ from pathlib import Path
+ from typing import List, Optional, Union, Tuple
+ import cv2
+ import magic
+ from loguru import logger
+
+ from core.config import config
+ from core.exceptions import (
+     VideoProcessingError,
+     InvalidFileError,
+     FileSizeError,
+     UnsupportedFormatError,
+     FrameExtractionError,
+ )
+ from core.image_processor import ImageProcessor
+
+
+ class VideoProcessor:
+     """
+     Process videos for analysis.
+
+     Handles validation, frame extraction, and metadata extraction
+     for videos before they are analyzed.
+     """
+
+     def __init__(self):
+         """Initialize VideoProcessor."""
+         self.max_size = config.MAX_VIDEO_SIZE
+         self.allowed_formats = config.ALLOWED_VIDEO_FORMATS
+         self.fps_extraction = config.VIDEO_FPS_EXTRACTION
+         self.max_frames = config.MAX_FRAMES_PER_VIDEO
+         self.image_processor = ImageProcessor()
+         logger.info("VideoProcessor initialized")
+
+     def validate_video(self, video_path: Path) -> bool:
+         """
+         Validate video file.
+
+         Args:
+             video_path: Path to video file
+
+         Returns:
+             True if valid
+
+         Raises:
+             FileSizeError: If file too large
+             UnsupportedFormatError: If format not supported
+             InvalidFileError: If file is corrupted
+         """
+         # Check file exists
+         if not video_path.exists():
+             raise InvalidFileError(
+                 f"Video file not found: {video_path}",
+                 {"path": str(video_path)}
+             )
+
+         # Check file size
+         file_size = video_path.stat().st_size
+         if file_size > self.max_size:
+             raise FileSizeError(
+                 f"Video too large: {file_size / 1024 / 1024:.1f}MB",
+                 {"max_size": self.max_size, "actual_size": file_size}
+             )
+
+         # Check file extension
+         ext = video_path.suffix.lower()
+         if ext not in self.allowed_formats:
+             raise UnsupportedFormatError(
+                 f"Unsupported video format: {ext}",
+                 {"allowed": self.allowed_formats, "received": ext}
+             )
+
+         # Check MIME type using magic bytes
+         try:
+             mime = magic.from_file(str(video_path), mime=True)
+             if not mime.startswith("video/"):
+                 raise InvalidFileError(
+                     f"File is not a valid video: {mime}",
+                     {"mime_type": mime}
+                 )
+         except Exception as e:
+             logger.warning(f"Could not verify MIME type: {e}")
+
+         return True
+
+     def get_video_info(self, video_path: Union[str, Path]) -> dict:
+         """
+         Get video metadata using OpenCV.
+
+         Args:
+             video_path: Path to video file
+
+         Returns:
+             Dictionary with video information
+         """
+         video_path = Path(video_path)
+         self.validate_video(video_path)
+
+         try:
+             cap = cv2.VideoCapture(str(video_path))
+
+             if not cap.isOpened():
+                 raise InvalidFileError(
+                     "Cannot open video file",
+                     {"path": str(video_path)}
+                 )
+
+             # Extract metadata
+             fps = cap.get(cv2.CAP_PROP_FPS)
+             frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+             width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+             height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+             duration = frame_count / fps if fps > 0 else 0
+
+             cap.release()
+
+             info = {
+                 "filename": video_path.name,
+                 "fps": fps,
+                 "frame_count": frame_count,
+                 "width": width,
+                 "height": height,
+                 "duration": duration,
+                 "file_size": video_path.stat().st_size,
+             }
+
+             logger.info(f"Video info: {video_path.name} - {width}x{height}, "
+                         f"{fps:.2f}fps, {duration:.2f}s")
+
+             return info
+
+         except Exception as e:
+             logger.error(f"Failed to get video info: {e}")
+             raise VideoProcessingError(
+                 f"Cannot extract video metadata: {str(e)}",
+                 {"path": str(video_path), "error": str(e)}
+             )
+
+     def extract_frames(
+         self,
+         video_path: Union[str, Path],
+         fps: Optional[float] = None,
+         max_frames: Optional[int] = None,
+         output_dir: Optional[Path] = None
+     ) -> List[Path]:
+         """
+         Extract frames from video at specified FPS.
+
+         Args:
+             video_path: Path to video file
+             fps: Frames per second to extract (default: config.VIDEO_FPS_EXTRACTION)
+             max_frames: Maximum number of frames to extract
+             output_dir: Directory to save frames (default: cache directory)
+
+         Returns:
+             List of paths to extracted frames
+
+         Raises:
+             FrameExtractionError: If frame extraction fails
+         """
+         video_path = Path(video_path)
+         self.validate_video(video_path)
+
+         if fps is None:
+             fps = self.fps_extraction
+
+         if max_frames is None:
+             max_frames = self.max_frames
+
+         if output_dir is None:
+             output_dir = config.CACHE_DIR / "frames" / video_path.stem
+
+         output_dir.mkdir(parents=True, exist_ok=True)
+
+         try:
+             cap = cv2.VideoCapture(str(video_path))
+
+             if not cap.isOpened():
+                 raise FrameExtractionError(
+                     "Cannot open video file",
+                     {"path": str(video_path)}
+                 )
+
+             video_fps = cap.get(cv2.CAP_PROP_FPS)
+             frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+
+             # Calculate frame interval
+             frame_interval = int(video_fps / fps) if fps < video_fps else 1
+
+             frames_saved = []
+             frame_idx = 0
+             saved_count = 0
+
+             logger.info(f"Extracting frames from {video_path.name} "
+                         f"(fps={fps}, interval={frame_interval})")
+
+             while True:
+                 ret, frame = cap.read()
+
+                 if not ret:
+                     break
+
+                 # Extract frame at specified interval
+                 if frame_idx % frame_interval == 0:
+                     # Save frame
+                     frame_path = output_dir / f"frame_{saved_count:04d}.jpg"
+                     cv2.imwrite(str(frame_path), frame)
+                     frames_saved.append(frame_path)
+                     saved_count += 1
+
+                     # Check if we've reached max frames
+                     if saved_count >= max_frames:
+                         logger.info(f"Reached max frames limit: {max_frames}")
+                         break
+
+                 frame_idx += 1
+
+             cap.release()
+
+             logger.info(f"Extracted {len(frames_saved)} frames from {video_path.name}")
+
+             return frames_saved
+
+         except Exception as e:
+             logger.error(f"Frame extraction failed: {e}")
+             raise FrameExtractionError(
+                 f"Failed to extract frames: {str(e)}",
+                 {"path": str(video_path), "error": str(e)}
+             )
+
+     def extract_key_frames(
+         self,
+         video_path: Union[str, Path],
+         num_frames: int = 5,
+         output_dir: Optional[Path] = None
+     ) -> List[Path]:
+         """
+         Extract evenly distributed key frames from video.
+
+         Args:
+             video_path: Path to video file
+             num_frames: Number of key frames to extract
+             output_dir: Directory to save frames
+
+         Returns:
+             List of paths to extracted frames
+         """
+         video_path = Path(video_path)
+         self.validate_video(video_path)
+
+         if output_dir is None:
+             output_dir = config.CACHE_DIR / "keyframes" / video_path.stem
+
+         output_dir.mkdir(parents=True, exist_ok=True)
+
+         try:
+             cap = cv2.VideoCapture(str(video_path))
+
+             if not cap.isOpened():
+                 raise FrameExtractionError(
+                     "Cannot open video file",
+                     {"path": str(video_path)}
+                 )
+
+             frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+
+             # Calculate frame positions
+             positions = [int(i * frame_count / (num_frames + 1))
+                          for i in range(1, num_frames + 1)]
+
+             frames_saved = []
+
+             for idx, pos in enumerate(positions):
+                 cap.set(cv2.CAP_PROP_POS_FRAMES, pos)
+                 ret, frame = cap.read()
+
+                 if ret:
+                     frame_path = output_dir / f"keyframe_{idx:02d}.jpg"
+                     cv2.imwrite(str(frame_path), frame)
+                     frames_saved.append(frame_path)
+
+             cap.release()
+
+             logger.info(f"Extracted {len(frames_saved)} key frames from {video_path.name}")
+
+             return frames_saved
+
+         except Exception as e:
+             logger.error(f"Key frame extraction failed: {e}")
+             raise FrameExtractionError(
+                 f"Failed to extract key frames: {str(e)}",
+                 {"path": str(video_path), "error": str(e)}
+             )
+
+     def process(
+         self,
+         video_path: Union[str, Path],
+         extract_method: str = "fps",
+         **kwargs
+     ) -> List[Path]:
+         """
+         Complete video processing pipeline.
+
+         Args:
+             video_path: Path to video file
+             extract_method: Method for frame extraction ("fps" or "keyframes")
+             **kwargs: Additional arguments for extraction method
+
+         Returns:
+             List of extracted frame paths
+         """
+         try:
+             if extract_method == "fps":
+                 return self.extract_frames(video_path, **kwargs)
+             elif extract_method == "keyframes":
+                 return self.extract_key_frames(video_path, **kwargs)
+             else:
+                 raise ValueError(f"Unknown extraction method: {extract_method}")
+
+         except Exception as e:
+             logger.error(f"Video processing failed: {e}")
+             raise VideoProcessingError(
+                 f"Failed to process video: {str(e)}",
+                 {"path": str(video_path), "error": str(e)}
+             )
plugins/__init__.py ADDED
@@ -0,0 +1,16 @@
+ """
+ DeepVision Plugins Package
+
+ Plugin system for modular analysis capabilities.
+ """
+
+ __version__ = "0.1.0"
+
+ from plugins.base import BasePlugin, PluginMetadata
+ from plugins.loader import PluginLoader
+
+ __all__ = [
+     "BasePlugin",
+     "PluginMetadata",
+     "PluginLoader",
+ ]
plugins/base.py ADDED
@@ -0,0 +1,170 @@
+ """
+ Base Plugin Class
+
+ Defines the interface that all plugins must implement.
+ """
+
+ from abc import ABC, abstractmethod
+ from dataclasses import dataclass
+ from typing import Dict, Any, Optional, List
+ from pathlib import Path
+ from PIL import Image
+ import numpy as np
+ from loguru import logger
+
+
+ @dataclass
+ class PluginMetadata:
+     """Metadata for a plugin."""
+
+     name: str
+     version: str
+     description: str
+     author: str
+     requires: Optional[List[str]] = None  # Required dependencies
+     category: str = "general"  # Plugin category
+     enabled: bool = True
+     priority: int = 50  # Execution priority (lower = earlier)
+
+     def __post_init__(self):
+         if self.requires is None:
+             self.requires = []
+
+
+ class BasePlugin(ABC):
+     """
+     Base class for all DeepVision plugins.
+
+     All plugins must inherit from this class and implement
+     the analyze() method.
+     """
+
+     def __init__(self):
+         """Initialize plugin."""
+         self._metadata: Optional[PluginMetadata] = None
+         self._initialized = False
+         self._enabled = True
+         logger.debug(f"Plugin {self.__class__.__name__} created")
+
+     @property
+     @abstractmethod
+     def metadata(self) -> PluginMetadata:
+         """
+         Return plugin metadata.
+
+         Returns:
+             PluginMetadata instance
+         """
+         pass
+
+     @abstractmethod
+     def initialize(self) -> None:
+         """
+         Initialize the plugin.
+
+         This method is called when the plugin is loaded.
+         Use it to load models, initialize resources, etc.
+         """
+         pass
+
+     @abstractmethod
+     def analyze(
+         self,
+         media: Any,
+         media_path: Path
+     ) -> Dict[str, Any]:
+         """
+         Analyze media and return results.
+
+         Args:
+             media: Processed media (PIL Image or numpy array)
+             media_path: Path to the media file
+
+         Returns:
+             Dictionary with analysis results
+         """
+         pass
+
+     def cleanup(self) -> None:
+         """
+         Clean up resources when plugin is unloaded.
+
+         Override this method to release resources like
+         model memory, file handles, etc.
+         """
+         logger.debug(f"Cleaning up plugin {self.metadata.name}")
+
+     def validate_input(self, media: Any) -> bool:
+         """
+         Validate input media.
+
+         Args:
+             media: Media to validate
+
+         Returns:
+             True if valid, False otherwise
+         """
+         if isinstance(media, (Image.Image, np.ndarray)):
+             return True
+
+         logger.warning(
+             f"Plugin {self.metadata.name} received unsupported media type: "
+             f"{type(media)}"
+         )
+         return False
+
+     def get_config(self) -> Dict[str, Any]:
+         """
+         Get plugin configuration.
+
+         Returns:
+             Configuration dictionary
+         """
+         return {
+             "name": self.metadata.name,
+             "version": self.metadata.version,
+             "enabled": self._enabled,
+             "initialized": self._initialized,
+         }
+
+     def set_enabled(self, enabled: bool) -> None:
+         """
+         Enable or disable the plugin.
+
+         Args:
+             enabled: True to enable, False to disable
+         """
+         self._enabled = enabled
+         logger.info(
+             f"Plugin {self.metadata.name} "
+             f"{'enabled' if enabled else 'disabled'}"
+         )
+
+     def is_enabled(self) -> bool:
+         """
+         Check if plugin is enabled.
+
+         Returns:
+             True if enabled
+         """
+         return self._enabled
+
+     def is_initialized(self) -> bool:
+         """
+         Check if plugin is initialized.
+
+         Returns:
+             True if initialized
+         """
+         return self._initialized
+
+     def __repr__(self) -> str:
+         """String representation."""
+         return (
+             f"{self.__class__.__name__}("
+             f"name={self.metadata.name}, "
+             f"version={self.metadata.version}, "
+             f"enabled={self._enabled})"
+         )
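The plugin contract above (abstract `metadata` property plus abstract `analyze()`) follows the standard `abc` pattern. A minimal, self-contained illustration — `Metadata`, `Plugin`, and `DummyPlugin` are hypothetical names for this sketch, not classes from the repository:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Metadata:
    name: str
    version: str
    requires: List[str] = field(default_factory=list)


class Plugin(ABC):
    # Subclasses must supply both members, or instantiation fails.
    @property
    @abstractmethod
    def metadata(self) -> Metadata: ...

    @abstractmethod
    def analyze(self, media: Any) -> Dict[str, Any]: ...


class DummyPlugin(Plugin):
    @property
    def metadata(self) -> Metadata:
        return Metadata(name="dummy", version="0.1.0")

    def analyze(self, media: Any) -> Dict[str, Any]:
        return {"status": "success", "media_type": type(media).__name__}


plugin = DummyPlugin()
print(plugin.analyze("not-an-image"))
# {'status': 'success', 'media_type': 'str'}
```

Attempting `Plugin()` directly raises `TypeError`, which is what enforces the interface at load time.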
plugins/caption_generator.py ADDED
@@ -0,0 +1,206 @@
+ """
+ Caption Generator Plugin
+
+ Generates descriptive captions for images using BLIP-2.
+ """
+
+ from typing import Dict, Any
+ from pathlib import Path
+ import numpy as np
+ from PIL import Image
+ from loguru import logger
+
+ from plugins.base import BasePlugin, PluginMetadata
+
+
+ class CaptionGeneratorPlugin(BasePlugin):
+     """
+     Generate captions for images using BLIP-2.
+
+     Creates natural language descriptions of image content.
+     """
+
+     def __init__(self):
+         """Initialize CaptionGeneratorPlugin."""
+         super().__init__()
+         self.model = None
+         self.processor = None
+         self.max_length = 50
+
+     @property
+     def metadata(self) -> PluginMetadata:
+         """Return plugin metadata."""
+         return PluginMetadata(
+             name="caption_generator",
+             version="0.1.0",
+             description="Generates image captions using BLIP-2",
+             author="AI Dev Collective",
+             requires=["transformers", "torch"],
+             category="captioning",
+             priority=20,
+         )
+
+     def initialize(self) -> None:
+         """Initialize the plugin and load BLIP-2 model."""
+         try:
+             # Import here to avoid loading if plugin is not used
+             from transformers import (
+                 Blip2Processor,
+                 Blip2ForConditionalGeneration
+             )
+
+             logger.info("Loading BLIP-2 model...")
+
+             # Use smaller BLIP-2 model for faster inference
+             model_name = "Salesforce/blip2-opt-2.7b"
+
+             # Load processor and model
+             self.processor = Blip2Processor.from_pretrained(model_name)
+             self.model = Blip2ForConditionalGeneration.from_pretrained(
+                 model_name
+             )
+
+             # Set to eval mode
+             self.model.eval()
+
+             # Move to CPU (GPU support can be added later)
+             device = "cpu"
+             self.model.to(device)
+
+             self._initialized = True
+
+             logger.info(
+                 f"BLIP-2 model loaded successfully on {device}"
+             )
+
+         except Exception as e:
+             logger.error(f"Failed to initialize CaptionGeneratorPlugin: {e}")
+             # Fallback: try smaller BLIP model
+             try:
+                 logger.info("Trying smaller BLIP model...")
+                 from transformers import BlipProcessor, BlipForConditionalGeneration
+
+                 model_name = "Salesforce/blip-image-captioning-base"
+                 self.processor = BlipProcessor.from_pretrained(model_name)
+                 self.model = BlipForConditionalGeneration.from_pretrained(
+                     model_name
+                 )
+                 self.model.eval()
+                 self.model.to("cpu")
+                 self._initialized = True
+
+                 logger.info("BLIP base model loaded successfully")
+
+             except Exception as fallback_error:
+                 logger.error(f"Fallback also failed: {fallback_error}")
+                 raise
+
+     def _generate_caption(
+         self,
+         image: Image.Image,
+         max_length: int = 50
+     ) -> str:
+         """
+         Generate caption for image.
+
+         Args:
+             image: PIL Image
+             max_length: Maximum caption length
+
+         Returns:
+             Generated caption string
+         """
+         import torch
+
+         # Prepare inputs
+         inputs = self.processor(
+             images=image,
+             return_tensors="pt"
+         )
+
+         # Generate caption
+         with torch.no_grad():
+             generated_ids = self.model.generate(
+                 **inputs,
+                 max_length=max_length,
+                 num_beams=5,
+                 early_stopping=True
+             )
+
+         # Decode caption
+         caption = self.processor.decode(
+             generated_ids[0],
+             skip_special_tokens=True
+         )
+
+         return caption.strip()
+
+     def analyze(
+         self,
+         media: Any,
+         media_path: Path
+     ) -> Dict[str, Any]:
+         """
+         Generate caption for the image.
+
+         Args:
+             media: PIL Image or numpy array
+             media_path: Path to image file
+
+         Returns:
+             Dictionary with caption
+         """
+         try:
+             # Check if initialized
+             if not self._initialized:
+                 self.initialize()
+
+             # Validate input
+             if not self.validate_input(media):
+                 return {"error": "Invalid input type"}
+
+             # Convert to PIL Image if numpy array
+             if isinstance(media, np.ndarray):
+                 image = Image.fromarray(
+                     (media * 255).astype(np.uint8) if media.max() <= 1
+                     else media.astype(np.uint8)
+                 )
+             else:
+                 image = media
+
+             # Generate caption
+             caption = self._generate_caption(image, self.max_length)
+
+             # Analyze caption
+             word_count = len(caption.split())
+
+             result = {
+                 "caption": caption,
+                 "word_count": word_count,
+                 "character_count": len(caption),
+                 "max_length": self.max_length,
+                 "status": "success",
+             }
+
+             logger.debug(f"Caption generated: '{caption[:50]}...'")
+
+             return result
+
+         except Exception as e:
+             logger.error(f"Caption generation failed: {e}")
+             return {
+                 "error": str(e),
+                 "status": "failed"
+             }
+
+     def cleanup(self) -> None:
+         """Clean up model resources."""
+         if self.model is not None:
+             del self.model
+             self.model = None
+
+         if self.processor is not None:
+             del self.processor
+             self.processor = None
+
+         logger.info("CaptionGeneratorPlugin cleanup complete")
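The numpy-to-PIL conversion in `analyze()` branches on the array's value range: floats in [0, 1] are scaled up to 0-255, values already in 0-255 are only truncated to integers. A dependency-free sketch of that rule on plain lists (`to_uint8_values` is an illustrative name, not repository code):

```python
def to_uint8_values(pixels):
    # If the data looks normalized (max <= 1), scale to the 8-bit range;
    # otherwise assume it is already 0-255 and just truncate to int.
    scale = 255 if max(pixels) <= 1 else 1
    return [int(p * scale) for p in pixels]

print(to_uint8_values([0.0, 0.5, 1.0]))  # [0, 127, 255]
print(to_uint8_values([0, 128, 255]))    # [0, 128, 255]
```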
plugins/color_analyzer.py ADDED
@@ -0,0 +1,291 @@
1
+ """
2
+ Color Analyzer Plugin
3
+
4
+ Analyzes dominant colors in images.
5
+ """
6
+
7
+ from typing import Dict, Any
8
+ from pathlib import Path
9
+ from collections import Counter
10
+ import numpy as np
11
+ from PIL import Image
12
+ from loguru import logger
13
+
14
+ from plugins.base import BasePlugin, PluginMetadata
15
+
16
+
17
+ class ColorAnalyzerPlugin(BasePlugin):
18
+ """
19
+ Analyze dominant colors in an image.
20
+
21
+ Extracts the most prominent colors and provides
22
+ color information including RGB values and names.
23
+ """
24
+
25
+ def __init__(self):
26
+ """Initialize ColorAnalyzerPlugin."""
27
+ super().__init__()
28
+ self.num_colors = 5
29
+ self.color_names = self._load_color_names()
30
+
31
+ @property
32
+ def metadata(self) -> PluginMetadata:
33
+ """Return plugin metadata."""
34
+ return PluginMetadata(
35
+ name="color_analyzer",
36
+ version="0.1.0",
37
+ description="Analyzes dominant colors in images",
38
+ author="AI Dev Collective",
39
+ category="analysis",
40
+ priority=30,
41
+ )
42
+
43
+ def initialize(self) -> None:
44
+ """Initialize the plugin."""
45
+ logger.info("ColorAnalyzerPlugin initialized")
46
+ self._initialized = True
47
+
48
+ def _load_color_names(self) -> Dict[str, tuple]:
49
+ """
50
+ Load basic color names and their RGB values.
51
+
52
+ Returns:
53
+ Dictionary mapping color names to RGB tuples
54
+ """
55
+ return {
56
+ "red": (255, 0, 0),
57
+ "green": (0, 255, 0),
58
+ "blue": (0, 0, 255),
59
+ "yellow": (255, 255, 0),
60
+ "cyan": (0, 255, 255),
61
+ "magenta": (255, 0, 255),
62
+ "white": (255, 255, 255),
63
+ "black": (0, 0, 0),
64
+ "gray": (128, 128, 128),
65
+ "orange": (255, 165, 0),
66
+ "purple": (128, 0, 128),
67
+ "pink": (255, 192, 203),
68
+ "brown": (165, 42, 42),
69
+ "navy": (0, 0, 128),
70
+ "teal": (0, 128, 128),
71
+ }
72
+
73
+ def _get_color_name(self, rgb: tuple) -> str:
74
+ """
75
+ Get the closest color name for an RGB value.
76
+
77
+ Args:
78
+ rgb: RGB tuple (r, g, b)
79
+
80
+ Returns:
81
+ Color name string
82
+ """
83
+ min_distance = float('inf')
84
+ closest_name = "unknown"
85
+
86
+ r, g, b = rgb
87
+
88
+ for name, (cr, cg, cb) in self.color_names.items():
89
+ # Calculate Euclidean distance
90
+ distance = np.sqrt(
91
+ (r - cr) ** 2 + (g - cg) ** 2 + (b - cb) ** 2
92
+ )
93
+
94
+ if distance < min_distance:
95
+ min_distance = distance
96
+ closest_name = name
97
+
98
+ return closest_name
99
+
100
+ def _extract_dominant_colors(
101
+ self,
102
+ image: Image.Image,
103
+ num_colors: int = 5
104
+ ) -> list:
105
+ """
106
+ Extract dominant colors from image.
107
+
108
+ Args:
109
+ image: PIL Image
110
+ num_colors: Number of dominant colors to extract
111
+
112
+ Returns:
113
+ List of color information dictionaries
114
+ """
115
+ # Resize image for faster processing
116
+ img = image.copy()
117
+ img.thumbnail((150, 150))
118
+
119
+ # Convert to RGB if necessary
120
+ if img.mode != 'RGB':
121
+ img = img.convert('RGB')
122
+
123
+ # Get all pixels
124
+ pixels = np.array(img).reshape(-1, 3)
125
+
126
+ # Count color occurrences
127
+ pixel_tuples = [tuple(pixel) for pixel in pixels]
128
+ color_counts = Counter(pixel_tuples)
129
+
130
+ # Get most common colors
131
+ most_common = color_counts.most_common(num_colors)
132
+
133
+ total_pixels = len(pixel_tuples)
134
+
135
+ colors = []
136
+ for rgb, count in most_common:
137
+ percentage = (count / total_pixels) * 100
138
+
139
+ color_info = {
140
+ "rgb": [int(rgb[0]), int(rgb[1]), int(rgb[2])],
141
+ "hex": f"#{rgb[0]:02x}{rgb[1]:02x}{rgb[2]:02x}",
142
+ "name": self._get_color_name(rgb),
143
+ "percentage": float(round(percentage, 2)),
144
+ "count": int(count),
145
+ }
146
+ colors.append(color_info)
147
+
148
+         return colors
+ 
+     def _calculate_brightness(self, rgb: tuple) -> float:
+         """
+         Calculate brightness of a color.
+ 
+         Args:
+             rgb: RGB tuple
+ 
+         Returns:
+             Brightness value (0-255)
+         """
+         r, g, b = rgb
+         # Perceived brightness formula
+         return (0.299 * r + 0.587 * g + 0.114 * b)
+ 
+     def _calculate_saturation(self, rgb: tuple) -> float:
+         """
+         Calculate saturation of a color.
+ 
+         Args:
+             rgb: RGB tuple
+ 
+         Returns:
+             Saturation value (0-1)
+         """
+         r, g, b = [x / 255.0 for x in rgb]
+         max_val = max(r, g, b)
+         min_val = min(r, g, b)
+ 
+         if max_val == 0:
+             return 0
+ 
+         return (max_val - min_val) / max_val
+ 
+     def analyze(
+         self,
+         media: Any,
+         media_path: Path
+     ) -> Dict[str, Any]:
+         """
+         Analyze colors in the image.
+ 
+         Args:
+             media: PIL Image or numpy array
+             media_path: Path to image file
+ 
+         Returns:
+             Dictionary with color analysis results
+         """
+         try:
+             # Validate input
+             if not self.validate_input(media):
+                 return {"error": "Invalid input type"}
+ 
+             # Convert to PIL Image if numpy array
+             if isinstance(media, np.ndarray):
+                 image = Image.fromarray(
+                     (media * 255).astype(np.uint8) if media.max() <= 1
+                     else media.astype(np.uint8)
+                 )
+             else:
+                 image = media
+ 
+             # Extract dominant colors
+             dominant_colors = self._extract_dominant_colors(
+                 image,
+                 num_colors=self.num_colors
+             )
+ 
+             # Calculate average brightness and saturation
+             avg_brightness = np.mean([
+                 self._calculate_brightness(tuple(c["rgb"]))
+                 for c in dominant_colors
+             ])
+ 
+             avg_saturation = np.mean([
+                 self._calculate_saturation(tuple(c["rgb"]))
+                 for c in dominant_colors
+             ])
+ 
+             # Determine overall color scheme
+             color_scheme = self._determine_color_scheme(dominant_colors)
+ 
+             result = {
+                 "dominant_colors": dominant_colors,
+                 "total_colors_analyzed": int(len(dominant_colors)),
+                 "average_brightness": float(round(avg_brightness, 2)),
+                 "average_saturation": float(round(avg_saturation, 2)),
+                 "color_scheme": color_scheme,
+                 "status": "success",
+             }
+ 
+             logger.debug(
+                 f"Color analysis complete: {len(dominant_colors)} colors found"
+             )
+ 
+             return result
+ 
+         except Exception as e:
+             logger.error(f"Color analysis failed: {e}")
+             return {
+                 "error": str(e),
+                 "status": "failed"
+             }
+ 
+     def _determine_color_scheme(self, colors: list) -> str:
+         """
+         Determine the overall color scheme.
+ 
+         Args:
+             colors: List of color dictionaries
+ 
+         Returns:
+             Color scheme description
+         """
+         if not colors:
+             return "unknown"
+ 
+         # Get color names
+         color_names = [c["name"] for c in colors]
+ 
+         # Check for monochrome (mostly gray/white/black)
+         grayscale = ["gray", "white", "black"]
+         if all(name in grayscale for name in color_names[:3]):
+             return "monochrome"
+ 
+         # Check for warm colors
+         warm = ["red", "orange", "yellow", "pink", "brown"]
+         warm_count = sum(1 for name in color_names if name in warm)
+         if warm_count >= len(color_names) * 0.6:
+             return "warm"
+ 
+         # Check for cool colors
+         cool = ["blue", "green", "cyan", "purple", "teal", "navy"]
+         cool_count = sum(1 for name in color_names if name in cool)
+         if cool_count >= len(color_names) * 0.6:
+             return "cool"
+ 
+         return "mixed"
+ 
+     def cleanup(self) -> None:
+         """Clean up resources."""
+         logger.info("ColorAnalyzerPlugin cleanup complete")
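The brightness and saturation formulas above are plain arithmetic and can be checked in isolation. A minimal standalone sketch (hypothetical helper names, mirroring `_calculate_brightness` and `_calculate_saturation` without the class):

```python
# Standalone versions of the perceived-brightness and saturation
# formulas used by ColorAnalyzerPlugin.
def brightness(rgb):
    r, g, b = rgb
    # Weighted sum of channels on a 0-255 scale
    return 0.299 * r + 0.587 * g + 0.114 * b

def saturation(rgb):
    r, g, b = [x / 255.0 for x in rgb]
    max_val, min_val = max(r, g, b), min(r, g, b)
    if max_val == 0:
        return 0.0  # pure black has no saturation
    return (max_val - min_val) / max_val

print(round(brightness((255, 255, 255))))  # 255 (white is maximally bright)
print(saturation((255, 0, 0)))             # 1.0 (pure red is fully saturated)
```

Note that the green channel dominates the brightness weighting, matching how the eye perceives luminance.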
plugins/loader.py ADDED
@@ -0,0 +1,318 @@
+ """
+ Plugin Loader
+ 
+ Dynamically loads and manages plugins.
+ """
+ 
+ import importlib
+ import inspect
+ from pathlib import Path
+ from typing import Dict, List, Type, Optional, Any
+ from loguru import logger
+ 
+ from plugins.base import BasePlugin, PluginMetadata
+ from core.exceptions import PluginError, PluginLoadError
+ 
+ 
+ class PluginLoader:
+     """
+     Load and manage plugins dynamically.
+ 
+     Discovers, loads, and manages plugin lifecycle.
+     """
+ 
+     def __init__(self, plugin_dir: Optional[Path] = None):
+         """
+         Initialize PluginLoader.
+ 
+         Args:
+             plugin_dir: Directory containing plugin modules
+         """
+         if plugin_dir is None:
+             plugin_dir = Path(__file__).parent
+ 
+         self.plugin_dir = Path(plugin_dir)
+         self.plugins: Dict[str, BasePlugin] = {}
+         self.plugin_classes: Dict[str, Type[BasePlugin]] = {}
+ 
+         logger.info(f"PluginLoader initialized with directory: {plugin_dir}")
+ 
+     def discover_plugins(self) -> List[str]:
+         """
+         Discover available plugins in the plugin directory.
+ 
+         Returns:
+             List of discovered plugin module names
+         """
+         discovered = []
+ 
+         # Look for Python files in the plugin directory
+         for file_path in self.plugin_dir.glob("*.py"):
+             # Skip __init__.py, base.py, loader.py
+             if file_path.stem in ["__init__", "base", "loader"]:
+                 continue
+ 
+             module_name = file_path.stem
+             discovered.append(module_name)
+             logger.debug(f"Discovered plugin module: {module_name}")
+ 
+         logger.info(f"Discovered {len(discovered)} plugin modules")
+         return discovered
+ 
+     def load_plugin_class(self, module_name: str) -> Optional[Type[BasePlugin]]:
+         """
+         Load a plugin class from a module.
+ 
+         Args:
+             module_name: Name of the module to load
+ 
+         Returns:
+             Plugin class or None if not found
+         """
+         try:
+             # Import the module
+             module = importlib.import_module(f"plugins.{module_name}")
+ 
+             # Find all classes that inherit from BasePlugin
+             for name, obj in inspect.getmembers(module, inspect.isclass):
+                 if (issubclass(obj, BasePlugin) and
+                         obj is not BasePlugin and
+                         obj.__module__ == module.__name__):
+                     logger.info(f"Loaded plugin class: {name} from {module_name}")
+                     return obj
+ 
+             logger.warning(f"No plugin class found in module: {module_name}")
+             return None
+ 
+         except Exception as e:
+             logger.error(f"Failed to load plugin module {module_name}: {e}")
+             raise PluginLoadError(
+                 f"Cannot load plugin module {module_name}: {str(e)}",
+                 {"module": module_name, "error": str(e)}
+             )
+ 
+     def load_plugin(
+         self,
+         plugin_name: str,
+         auto_initialize: bool = True
+     ) -> BasePlugin:
+         """
+         Load and optionally initialize a plugin.
+ 
+         Args:
+             plugin_name: Name of the plugin module
+             auto_initialize: Whether to automatically initialize the plugin
+ 
+         Returns:
+             Loaded plugin instance
+         """
+         try:
+             # Check if already loaded
+             if plugin_name in self.plugins:
+                 logger.info(f"Plugin {plugin_name} already loaded")
+                 return self.plugins[plugin_name]
+ 
+             # Load plugin class
+             plugin_class = self.load_plugin_class(plugin_name)
+ 
+             if plugin_class is None:
+                 raise PluginLoadError(
+                     f"No plugin class found in {plugin_name}",
+                     {"plugin": plugin_name}
+                 )
+ 
+             # Store plugin class
+             self.plugin_classes[plugin_name] = plugin_class
+ 
+             # Create instance
+             plugin_instance = plugin_class()
+ 
+             # Initialize if requested
+             if auto_initialize:
+                 plugin_instance.initialize()
+                 plugin_instance._initialized = True
+ 
+             # Store instance
+             self.plugins[plugin_instance.metadata.name] = plugin_instance
+ 
+             logger.info(
+                 f"Plugin loaded: {plugin_instance.metadata.name} "
+                 f"v{plugin_instance.metadata.version}"
+             )
+ 
+             return plugin_instance
+ 
+         except Exception as e:
+             logger.error(f"Failed to load plugin {plugin_name}: {e}")
+             raise PluginLoadError(
+                 f"Cannot load plugin {plugin_name}: {str(e)}",
+                 {"plugin": plugin_name, "error": str(e)}
+             )
+ 
+     def load_all_plugins(self, auto_initialize: bool = True) -> Dict[str, BasePlugin]:
+         """
+         Discover and load all available plugins.
+ 
+         Args:
+             auto_initialize: Whether to automatically initialize plugins
+ 
+         Returns:
+             Dictionary of loaded plugins
+         """
+         discovered = self.discover_plugins()
+ 
+         for module_name in discovered:
+             try:
+                 self.load_plugin(module_name, auto_initialize=auto_initialize)
+             except Exception as e:
+                 logger.error(f"Failed to load plugin {module_name}: {e}")
+                 # Continue loading other plugins
+                 continue
+ 
+         logger.info(f"Loaded {len(self.plugins)} plugins")
+         return self.plugins
+ 
+     def unload_plugin(self, plugin_name: str) -> None:
+         """
+         Unload a plugin and clean up resources.
+ 
+         Args:
+             plugin_name: Name of the plugin to unload
+         """
+         if plugin_name not in self.plugins:
+             logger.warning(f"Plugin {plugin_name} not loaded")
+             return
+ 
+         plugin = self.plugins[plugin_name]
+ 
+         # Clean up resources
+         try:
+             plugin.cleanup()
+         except Exception as e:
+             logger.error(f"Error during plugin cleanup: {e}")
+ 
+         # Remove from loaded plugins
+         del self.plugins[plugin_name]
+ 
+         logger.info(f"Plugin unloaded: {plugin_name}")
+ 
+     def unload_all_plugins(self) -> None:
+         """Unload all plugins."""
+         plugin_names = list(self.plugins.keys())
+ 
+         for plugin_name in plugin_names:
+             self.unload_plugin(plugin_name)
+ 
+         logger.info("All plugins unloaded")
+ 
+     def get_plugin(self, plugin_name: str) -> Optional[BasePlugin]:
+         """
+         Get a loaded plugin by name.
+ 
+         Args:
+             plugin_name: Name of the plugin
+ 
+         Returns:
+             Plugin instance or None
+         """
+         return self.plugins.get(plugin_name)
+ 
+     def list_plugins(self) -> List[PluginMetadata]:
+         """
+         List all loaded plugins.
+ 
+         Returns:
+             List of plugin metadata
+         """
+         return [plugin.metadata for plugin in self.plugins.values()]
+ 
+     def reload_plugin(self, plugin_name: str) -> BasePlugin:
+         """
+         Reload a plugin (unload then load).
+ 
+         Args:
+             plugin_name: Name of the plugin to reload
+ 
+         Returns:
+             Reloaded plugin instance
+         """
+         logger.info(f"Reloading plugin: {plugin_name}")
+ 
+         # Find the module name
+         module_name = None
+         for name, plugin in self.plugins.items():
+             if name == plugin_name:
+                 module_name = plugin.__class__.__module__.split(".")[-1]
+                 break
+ 
+         if module_name is None:
+             raise PluginError(
+                 f"Plugin {plugin_name} not found",
+                 {"plugin": plugin_name}
+             )
+ 
+         # Unload
+         self.unload_plugin(plugin_name)
+ 
+         # Reload module
+         importlib.reload(
+             importlib.import_module(f"plugins.{module_name}")
+         )
+ 
+         # Load again
+         return self.load_plugin(module_name)
+ 
+     def get_plugins_by_category(self, category: str) -> List[BasePlugin]:
+         """
+         Get all plugins in a specific category.
+ 
+         Args:
+             category: Plugin category
+ 
+         Returns:
+             List of plugins in the category
+         """
+         return [
+             plugin for plugin in self.plugins.values()
+             if plugin.metadata.category == category
+         ]
+ 
+     def get_enabled_plugins(self) -> List[BasePlugin]:
+         """
+         Get all enabled plugins.
+ 
+         Returns:
+             List of enabled plugins
+         """
+         return [
+             plugin for plugin in self.plugins.values()
+             if plugin.is_enabled()
+         ]
+ 
+     def get_plugin_info(self) -> Dict[str, Dict[str, Any]]:
+         """
+         Get information about all loaded plugins.
+ 
+         Returns:
+             Dictionary with plugin information
+         """
+         info = {}
+ 
+         for name, plugin in self.plugins.items():
+             info[name] = {
+                 "name": plugin.metadata.name,
+                 "version": plugin.metadata.version,
+                 "description": plugin.metadata.description,
+                 "author": plugin.metadata.author,
+                 "category": plugin.metadata.category,
+                 "enabled": plugin.is_enabled(),
+                 "initialized": plugin.is_initialized(),
+                 "priority": plugin.metadata.priority,
+             }
+ 
+         return info
+ 
+     def __repr__(self) -> str:
+         """String representation."""
+         return f"PluginLoader(plugins={len(self.plugins)})"
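The subclass filter in `load_plugin_class` can be exercised without the real plugin package. A toy sketch (stub `Base` class and a synthetic module, standing in for `BasePlugin` and an imported plugin module):

```python
# Toy reproduction of PluginLoader's class-discovery filter: find classes
# in a module that subclass a base, excluding the base itself and any
# names merely imported from elsewhere.
import inspect
import types

class Base:
    pass

class MyPlugin(Base):
    pass

# Build a synthetic module the way importlib would expose a real one.
mod = types.ModuleType("toy_plugin")
MyPlugin.__module__ = "toy_plugin"
mod.MyPlugin = MyPlugin
mod.Base = Base  # imported base class; the filter must skip it

found = [
    obj for _, obj in inspect.getmembers(mod, inspect.isclass)
    if issubclass(obj, Base) and obj is not Base
    and obj.__module__ == mod.__name__
]
print(found)  # [<class 'toy_plugin.MyPlugin'>]
```

The `obj.__module__ == module.__name__` check is what prevents the loader from picking up `BasePlugin` itself (or plugin classes re-imported from another module) as a discovered plugin.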
plugins/object_detector.py ADDED
@@ -0,0 +1,258 @@
+ """
+ Object Detector Plugin
+ 
+ Detects objects in images using CLIP model.
+ """
+ 
+ from typing import Dict, Any, List
+ from pathlib import Path
+ import numpy as np
+ from PIL import Image
+ from loguru import logger
+ 
+ from plugins.base import BasePlugin, PluginMetadata
+ 
+ 
+ class ObjectDetectorPlugin(BasePlugin):
+     """
+     Detect objects in images using CLIP.
+ 
+     Uses zero-shot classification to identify objects
+     without requiring training data.
+     """
+ 
+     def __init__(self):
+         """Initialize ObjectDetectorPlugin."""
+         super().__init__()
+         self.model = None
+         self.processor = None
+         self.candidate_labels = [
+             "person", "people", "man", "woman", "child", "baby",
+             "dog", "cat", "bird", "animal",
+             "car", "vehicle", "bicycle", "motorcycle",
+             "building", "house", "tree", "plant", "flower",
+             "food", "plate", "cup", "bottle",
+             "computer", "phone", "keyboard", "screen",
+             "furniture", "chair", "table", "bed",
+             "nature", "landscape", "mountain", "ocean", "beach",
+             "sky", "cloud", "sunset", "sunrise",
+             "indoor", "outdoor", "room", "street",
+         ]
+ 
+     @property
+     def metadata(self) -> PluginMetadata:
+         """Return plugin metadata."""
+         return PluginMetadata(
+             name="object_detector",
+             version="0.1.0",
+             description="Detects objects using CLIP zero-shot classification",
+             author="AI Dev Collective",
+             requires=["transformers", "torch"],
+             category="detection",
+             priority=10,
+         )
+ 
+     def initialize(self) -> None:
+         """Initialize the plugin and load CLIP model."""
+         try:
+             # Import here to avoid loading if plugin is not used
+             from transformers import CLIPProcessor, CLIPModel
+             import torch
+ 
+             logger.info("Loading CLIP model...")
+ 
+             model_name = "openai/clip-vit-base-patch32"
+ 
+             # Load model and processor
+             self.model = CLIPModel.from_pretrained(model_name)
+             self.processor = CLIPProcessor.from_pretrained(model_name)
+ 
+             # Set to eval mode
+             self.model.eval()
+ 
+             # Move to CPU (GPU support can be added later)
+             device = "cpu"
+             self.model.to(device)
+ 
+             self._initialized = True
+ 
+             logger.info(f"CLIP model loaded successfully on {device}")
+ 
+         except Exception as e:
+             logger.error(f"Failed to initialize ObjectDetectorPlugin: {e}")
+             raise
+ 
+     def _detect_objects(
+         self,
+         image: Image.Image,
+         labels: List[str],
+         threshold: float = 0.3
+     ) -> List[Dict[str, Any]]:
+         """
+         Detect objects in image using CLIP.
+ 
+         Args:
+             image: PIL Image
+             labels: List of candidate labels
+             threshold: Confidence threshold
+ 
+         Returns:
+             List of detected objects
+         """
+         import torch
+ 
+         # Prepare inputs
+         inputs = self.processor(
+             text=labels,
+             images=image,
+             return_tensors="pt",
+             padding=True
+         )
+ 
+         # Get predictions
+         with torch.no_grad():
+             outputs = self.model(**inputs)
+             logits_per_image = outputs.logits_per_image
+             probs = logits_per_image.softmax(dim=1)[0]
+ 
+         # Filter by threshold and sort
+         detected = []
+         for idx, (label, prob) in enumerate(zip(labels, probs)):
+             confidence = float(prob)
+             if confidence >= threshold:
+                 detected.append({
+                     "name": label,
+                     "confidence": round(confidence, 4),
+                     "index": idx,
+                 })
+ 
+         # Sort by confidence
+         detected.sort(key=lambda x: x["confidence"], reverse=True)
+ 
+         return detected
+ 
+     def analyze(
+         self,
+         media: Any,
+         media_path: Path
+     ) -> Dict[str, Any]:
+         """
+         Detect objects in the image.
+ 
+         Args:
+             media: PIL Image or numpy array
+             media_path: Path to image file
+ 
+         Returns:
+             Dictionary with detected objects
+         """
+         try:
+             # Check if initialized
+             if not self._initialized:
+                 self.initialize()
+ 
+             # Validate input
+             if not self.validate_input(media):
+                 return {"error": "Invalid input type"}
+ 
+             # Convert to PIL Image if numpy array
+             if isinstance(media, np.ndarray):
+                 image = Image.fromarray(
+                     (media * 255).astype(np.uint8) if media.max() <= 1
+                     else media.astype(np.uint8)
+                 )
+             else:
+                 image = media
+ 
+             # Detect objects
+             objects = self._detect_objects(
+                 image,
+                 self.candidate_labels,
+                 threshold=0.15
+             )
+ 
+             # Get top objects
+             top_objects = objects[:10]
+ 
+             # Categorize objects
+             categories = self._categorize_objects(top_objects)
+ 
+             result = {
+                 "objects": top_objects,
+                 "total_detected": len(objects),
+                 "categories": categories,
+                 "candidate_labels_count": len(self.candidate_labels),
+                 "status": "success",
+             }
+ 
+             logger.debug(
+                 f"Object detection complete: {len(top_objects)} objects found"
+             )
+ 
+             return result
+ 
+         except Exception as e:
+             logger.error(f"Object detection failed: {e}")
+             return {
+                 "error": str(e),
+                 "status": "failed"
+             }
+ 
+     def _categorize_objects(
+         self,
+         objects: List[Dict[str, Any]]
+     ) -> Dict[str, List[str]]:
+         """
+         Categorize detected objects.
+ 
+         Args:
+             objects: List of detected objects
+ 
+         Returns:
+             Dictionary of categories
+         """
+         categories = {
+             "people": [],
+             "animals": [],
+             "vehicles": [],
+             "nature": [],
+             "objects": [],
+             "places": [],
+         }
+ 
+         for obj in objects:
+             name = obj["name"]
+ 
+             if name in ["person", "people", "man", "woman", "child", "baby"]:
+                 categories["people"].append(name)
+             elif name in ["dog", "cat", "bird", "animal"]:
+                 categories["animals"].append(name)
+             elif name in ["car", "vehicle", "bicycle", "motorcycle"]:
+                 categories["vehicles"].append(name)
+             elif name in ["tree", "plant", "flower", "nature", "landscape",
+                           "mountain", "ocean", "beach"]:
+                 categories["nature"].append(name)
+             elif name in ["indoor", "outdoor", "room", "street", "building",
+                           "house"]:
+                 categories["places"].append(name)
+             else:
+                 categories["objects"].append(name)
+ 
+         # Remove empty categories
+         categories = {k: v for k, v in categories.items() if v}
+ 
+         return categories
+ 
+     def cleanup(self) -> None:
+         """Clean up model resources."""
+         if self.model is not None:
+             del self.model
+             self.model = None
+ 
+         if self.processor is not None:
+             del self.processor
+             self.processor = None
+ 
+         logger.info("ObjectDetectorPlugin cleanup complete")
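The threshold-and-sort step in `_detect_objects` is independent of the model itself. A sketch with toy softmax scores (no CLIP call; label names and probabilities are illustrative):

```python
# Filter candidate labels by a confidence threshold, then sort descending
# by confidence — the same post-processing _detect_objects applies to
# CLIP's softmax output.
labels = ["dog", "cat", "car"]
probs = [0.08, 0.62, 0.30]   # toy scores, as if from softmax
threshold = 0.15

detected = [
    {"name": label, "confidence": round(p, 4), "index": idx}
    for idx, (label, p) in enumerate(zip(labels, probs))
    if p >= threshold
]
detected.sort(key=lambda x: x["confidence"], reverse=True)

print([d["name"] for d in detected])  # ['cat', 'car'] — "dog" falls below 0.15
```

Because CLIP's softmax spreads probability mass over all candidate labels, the plugin lowers the threshold to 0.15 at the call site in `analyze` rather than using the 0.3 default.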
requirements.txt ADDED
@@ -0,0 +1,11 @@
+ gradio==4.44.0
+ numpy>=1.24.0
+ pillow>=10.0.0
+ opencv-python>=4.8.0
+ loguru>=0.7.0
+ 
+ # Optional: For ML plugins (Object Detector & Caption Generator)
+ # Uncomment these lines to enable heavy ML features
+ # transformers>=4.30.0
+ # torch>=2.0.0
+ # torchvision>=0.15.0