---
# Space metadata for Hugging Face
# This tells the Space which SDK and entry file to run
# Safe to keep at top of README; ignored by GitHub rendering
# (Hugging Face parses the YAML front-matter)
title: CompI - Final Dashboard
emoji: 🎨
colorFrom: indigo
colorTo: purple
sdk: streamlit
app_file: src/ui/compi_phase3_final_dashboard.py
pinned: false
---
# CompI - Compositional Intelligence Project
A multi-modal AI system that generates creative content by combining text, images, audio, and emotional context.
Note: All documentation has been consolidated under docs/. See docs/README.md for an index of guides.
## Project Overview
CompI (Compositional Intelligence) is designed to create rich, contextually-aware content by:
- Processing text prompts with emotional analysis
- Generating images using Stable Diffusion
- Creating audio compositions
- Combining multiple modalities for enhanced creative output
## Project Structure
```
Project CompI/
├── src/                    # Source code
│   ├── generators/         # Image generation modules
│   ├── models/             # Model implementations
│   ├── utils/              # Utility functions
│   ├── data/               # Data processing
│   ├── ui/                 # User interface components
│   └── setup_env.py        # Environment setup script
├── notebooks/              # Jupyter notebooks for experimentation
├── data/                   # Dataset storage
├── outputs/                # Generated content
├── tests/                  # Unit tests
├── run_*.py                # Convenience scripts for generators
├── requirements.txt        # Python dependencies
└── README.md               # This file
```
## 🛠️ Setup Instructions
### 1. Create Virtual Environment
```bash
# Using conda (recommended for ML projects)
conda create -n compi-env python=3.10 -y
conda activate compi-env
# OR using venv
python -m venv compi-env
# Windows
compi-env\Scripts\activate
# Linux/Mac
source compi-env/bin/activate
```
### 2. Install Dependencies
**For GPU users (recommended for faster generation):**
```bash
# First, check your CUDA version
nvidia-smi
# Install PyTorch with CUDA support first (replace cu121 with your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Then install remaining requirements
pip install -r requirements.txt
```
**For CPU-only users:**
```bash
pip install -r requirements.txt
```
### 3. Test Installation
```bash
python src/test_setup.py
```
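If the test script fails, it can help to see which packages are importable and whether a GPU is visible. The contents of `src/test_setup.py` are not shown in this README, so the following is only a minimal sketch of such a check; the function names `check_environment` and `pick_device` and the package list are assumptions for illustration.

```python
import importlib.util


def check_environment(packages=("torch", "diffusers", "transformers", "streamlit")):
    """Report which packages can be found, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


def pick_device():
    """Return "cuda" when PyTorch is installed and sees a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch  # imported lazily so the check also runs before installation

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'missing'}")
    print("device:", pick_device())
```

A result of `cpu` on a machine with an NVIDIA GPU usually means the CPU-only PyTorch wheel was installed; reinstalling with the CUDA index URL shown above is the likely fix.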
## Quick Start
### Phase 1: Text-to-Image Generation
```bash
# Basic text-to-image generation
python run_basic_generation.py "A magical forest, digital art"
# Advanced generation with style conditioning
python run_advanced_styling.py "dragon in a crystal cave" --style "oil painting" --mood "dramatic"
# Interactive style selection
python run_styled_generation.py
# Quality evaluation and analysis
python run_evaluation.py
# Personal style training with LoRA
python run_lora_training.py --dataset-dir datasets/my_style
# Generate with personal style
python run_style_generation.py --lora-path lora_models/my_style/checkpoint-1000 "artwork in my_style"
```
### Phase 2.A: Audio-to-Image Generation 🎵
```bash
# Install audio processing dependencies
pip install openai-whisper
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2a_streamlit_ui.py
# Command line generation
python run_phase2a_audio_to_image.py --prompt "mystical forest" --audio "music.mp3"
# Interactive mode
python run_phase2a_audio_to_image.py --interactive
# Test installation
python src/test_phase2a.py
# Run examples
python examples/phase2a_audio_examples.py --example all
```
### Phase 2.B: Data/Logic-to-Image Generation
```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2b_streamlit_ui.py
# Command line generation with CSV data
python run_phase2b_data_to_image.py --prompt "data visualization" --csv "data.csv"
# Mathematical formula generation
python run_phase2b_data_to_image.py --prompt "mathematical harmony" --formula "np.sin(np.linspace(0, 4*np.pi, 100))"
# Batch processing
python run_phase2b_data_to_image.py --batch-csv "data_folder/" --prompt "scientific patterns"
# Interactive mode
python run_phase2b_data_to_image.py --interactive
```
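To give a feel for how a CSV column could influence a prompt, here is a small standard-library sketch. It is not the Phase 2.B implementation: the helper names (`read_column`, `describe_series`, `data_prompt`), the trend thresholds, and the wording attached to each trend are all assumptions chosen for illustration.

```python
import csv
import io
import statistics


def read_column(csv_text, column):
    """Parse one numeric column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader]


def describe_series(values):
    """Summarize a numeric series as mean, spread, and a coarse trend label."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    n = len(values)
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    # Least-squares slope of the values against their index 0..n-1.
    x_mean = (n - 1) / 2
    slope = sum((i - x_mean) * (v - mean) for i, v in enumerate(values)) / sum(
        (i - x_mean) ** 2 for i in range(n)
    )
    # Compare the total drift with the spread; the 0.5 factor is arbitrary.
    drift = slope * (n - 1)
    if spread == 0 or abs(drift) < 0.5 * spread:
        trend = "steady"
    elif drift > 0:
        trend = "rising"
    else:
        trend = "falling"
    return {"mean": mean, "spread": spread, "trend": trend}


# Hypothetical mapping from trend label to prompt language.
TREND_WORDS = {
    "rising": "ascending, energetic lines",
    "falling": "descending, fading tones",
    "steady": "calm, balanced composition",
}


def data_prompt(base_prompt, values):
    """Append trend-derived wording to the base prompt."""
    return f"{base_prompt}, {TREND_WORDS[describe_series(values)['trend']]}"
```

For example, `data_prompt("data visualization", read_column(text, "v"))` would append the "rising" wording when the `v` column climbs steadily.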
### Phase 2.C: Emotional/Contextual Input to Image Generation
```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2c_streamlit_ui.py
# Command line generation with preset emotion
python run_phase2c_emotion_to_image.py --prompt "mystical forest" --emotion "mysterious"
# Custom emotion generation
python run_phase2c_emotion_to_image.py --prompt "urban landscape" --emotion "🤩" --type custom
# Descriptive emotion generation
python run_phase2c_emotion_to_image.py --prompt "mountain vista" --emotion "I feel a sense of wonder" --type text
# Batch emotion processing
python run_phase2c_emotion_to_image.py --batch-emotions "joyful,sad,mysterious" --prompt "abstract art"
# Interactive mode
python run_phase2c_emotion_to_image.py --interactive
```
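The idea behind emotion conditioning can be sketched as a lookup from an emotion to lighting and palette phrases that are appended to the prompt. The table and function names below are hypothetical and only illustrate the concept; the real Phase 2.C code may use different presets and a proper sentiment model.

```python
# Hypothetical presets: emotion -> (lighting phrase, color palette phrase).
PRESETS = {
    "joyful": ("warm golden light", "vibrant yellows and oranges"),
    "sad": ("soft overcast light", "muted blues and greys"),
    "mysterious": ("low-key lighting, drifting fog", "deep purples and teals"),
}


def emotion_modifiers(emotion):
    """Return (lighting, palette) for a preset name or free text mentioning one."""
    key = emotion.strip().lower()
    if key in PRESETS:
        return PRESETS[key]
    # Free-text fallback: use the first preset whose name appears in the text.
    for name, modifiers in PRESETS.items():
        if name in key:
            return modifiers
    return None


def build_prompt(prompt, emotion):
    """Append emotion-derived phrases; leave the prompt unchanged if none match."""
    modifiers = emotion_modifiers(emotion)
    if modifiers is None:
        return prompt
    lighting, palette = modifiers
    return f"{prompt}, {lighting}, {palette}"


def batch_prompts(prompt, emotions_csv):
    """Build one prompt per emotion from a comma-separated list."""
    return [build_prompt(prompt, e) for e in emotions_csv.split(",") if e.strip()]
```

Calling `batch_prompts("abstract art", "joyful,sad,mysterious")` yields three prompts, mirroring the `--batch-emotions` example above.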
### Phase 2.D: Real-Time Data Feeds to Image Generation
```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2d_streamlit_ui.py
# Command line generation with weather data
python run_phase2d_realtime_to_image.py --prompt "cityscape" --weather --city "Tokyo"
# News-driven generation
python run_phase2d_realtime_to_image.py --prompt "abstract art" --news --category "technology"
# Multi-source generation
python run_phase2d_realtime_to_image.py --prompt "world state" --weather --news --financial
# Temporal series generation
python run_phase2d_realtime_to_image.py --prompt "evolving world" --weather --temporal "0,30,60"
# Interactive mode
python run_phase2d_realtime_to_image.py --interactive
```
### Phase 2.E: Style Reference/Example Image to AI Art 🖼️
```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2e_streamlit_ui.py
# Command line generation with reference image
python run_phase2e_refimg_to_image.py --prompt "magical forest" --reference "path/to/image.jpg" --strength 0.6
# Web URL reference
python run_phase2e_refimg_to_image.py --prompt "cyberpunk city" --reference "https://example.com/artwork.jpg"
# Batch generation with multiple variations
python run_phase2e_refimg_to_image.py --prompt "fantasy landscape" --reference "image.png" --num-images 3
# Style analysis only
python run_phase2e_refimg_to_image.py --analyze-only --reference "artwork.jpg"
# Interactive mode
python run_phase2e_refimg_to_image.py --interactive
```
## 🧪 NEW: Ultimate Multimodal Dashboard (True Fusion)
**Revolutionary upgrade with REAL processing of each input type!**
```bash
# Launch the upgraded dashboard with true multimodal fusion
python run_ultimate_multimodal_dashboard.py
# Or run directly
streamlit run src/ui/compi_ultimate_multimodal_dashboard.py --server.port 8503
```
**Key Improvements:**
- ✅ **Real Audio Analysis**: Whisper transcription + librosa features
- ✅ **Actual Data Processing**: CSV analysis + formula evaluation
- ✅ **True Emotion Analysis**: TextBlob sentiment classification
- ✅ **Live Real-time Data**: Weather/news API integration
- ✅ **Advanced References**: img2img + ControlNet processing
- ✅ **Intelligent Fusion**: Actual content processing (not static keywords)
**Access at:** `http://localhost:8503`
**See:** `ULTIMATE_MULTIMODAL_DASHBOARD_README.md` for detailed documentation.
## 🖼️ NEW: Phase 3.C Advanced Reference Integration
**Professional multi-reference control with hybrid generation modes!**
**Key Features:**
- ✅ **Role-Based Reference Assignment**: Select images for style vs structure
- ✅ **Live ControlNet Previews**: Real-time Canny/Depth preprocessing
- ✅ **Hybrid Generation Modes**: ControlNet (CN) + img2img simultaneous processing
- ✅ **Professional Controls**: Independent strength tuning for style/structure
- ✅ **Seamless Integration**: Works with all CompI multimodal phases
**See:** `PHASE3C_ADVANCED_REFERENCE_INTEGRATION.md` for complete documentation.
## NEW: Phase 3.D Professional Workflow Manager
**Complete creative workflow platform with unified logging, presets, and export bundles!**
**Key Features:**
- ✅ **Unified Run Logging**: Auto-ingests from all CompI phases
- ✅ **Professional Gallery**: Advanced filtering and search
- ✅ **Preset System**: Save/load complete generation configs
- ✅ **Export Bundles**: ZIP packages with metadata and reproducibility
- ✅ **Annotation System**: Ratings, tags, and notes for workflow management
**Launch:** `python run_phase3d_workflow_manager.py` | **Access:** `http://localhost:8504`
**See:** `docs/PHASE3D_WORKFLOW_MANAGER_GUIDE.md` for complete documentation.
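As an illustration of what an export bundle could contain, the sketch below packs images and a JSON manifest into a ZIP archive in memory. The layout (`images/` plus `manifest.json`), the SHA-256 checksums, and the function names are assumptions; the actual bundle format is described in the guide above.

```python
import hashlib
import io
import json
import zipfile


def build_bundle(images, metadata):
    """Pack images (filename -> bytes) and run metadata into ZIP bytes."""
    manifest = {
        "metadata": metadata,
        # Checksums let a recipient verify that files were not altered.
        "files": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(images.items())
        },
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in sorted(images.items()):
            archive.writestr(f"images/{name}", data)
        archive.writestr("manifest.json", json.dumps(manifest, indent=2, sort_keys=True))
    return buffer.getvalue()


def read_manifest(bundle_bytes):
    """Load the manifest back from a bundle produced by build_bundle."""
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as archive:
        return json.loads(archive.read("manifest.json"))
```

Storing the prompt, seed, and model name in `metadata` is what would make a run reproducible from the bundle alone.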
## ⚙️ NEW: Phase 3.E Performance, Model Management & Reliability
**Production-grade performance optimization, model switching, and intelligent reliability!**
**Key Features:**
- ✅ **Model Manager**: Dynamic SD 1.5 ↔ SDXL switching with auto-availability checking
- ✅ **LoRA Integration**: Universal LoRA loading with scale control across all models
- ✅ **Performance Controls**: xFormers, attention slicing, VAE optimizations, precision control
- ✅ **VRAM Monitoring**: Real-time GPU memory usage tracking and alerts
- ✅ **Reliability Engine**: OOM-safe auto-retry with intelligent fallbacks
- ✅ **Batch Processing**: Seed-controlled batch generation with memory management
- ✅ **Upscaler Integration**: Optional 2x latent upscaling for enhanced quality
**Launch:** `python run_phase3e_performance_manager.py` | **Access:** `http://localhost:8505`
**See:** `docs/PHASE3E_PERFORMANCE_GUIDE.md` for complete documentation.
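The "OOM-safe auto-retry" idea can be sketched as a loop that lowers the batch size first and the resolution second whenever generation runs out of memory. This is a simplified illustration, not the Phase 3.E code: `generate_with_fallback` and its parameters are hypothetical, and it catches Python's built-in `MemoryError` where a real pipeline would catch its framework's own out-of-memory exception.

```python
def generate_with_fallback(generate, width, height, batch_size,
                           max_attempts=4, min_side=256):
    """Call generate(width=..., height=..., batch_size=...) with OOM fallbacks.

    Returns (result, settings_used). Halves the batch size first, then the
    resolution, and re-raises once nothing more can be reduced.
    """
    settings = {"width": width, "height": height, "batch_size": batch_size}
    for _ in range(max_attempts):
        try:
            return generate(**settings), dict(settings)
        except MemoryError:
            if settings["batch_size"] > 1:
                settings["batch_size"] = max(1, settings["batch_size"] // 2)
            elif min(settings["width"], settings["height"]) // 2 >= min_side:
                # Assumes sizes stay valid for the model after halving.
                settings["width"] //= 2
                settings["height"] //= 2
            else:
                raise
    raise MemoryError("out of memory after all fallback attempts")


if __name__ == "__main__":
    def fake_generate(width, height, batch_size):
        # Stand-in for a diffusion pipeline with a fixed memory budget.
        if width * height * batch_size > 512 * 512:
            raise MemoryError
        return f"{batch_size} image(s) at {width}x{height}"

    print(generate_with_fallback(fake_generate, 1024, 1024, 4))
```

Reducing the batch size before the resolution keeps image quality intact for as long as possible, which is one reasonable ordering among several.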
## 🧪 ULTIMATE: Phase 3 Final Dashboard - Complete Integration!
**The ultimate CompI interface that integrates ALL Phase 3 components into one unified creative environment!**
**Complete Feature Integration:**
- ✅ **🧩 Multimodal Fusion (3.A/3.B)**: Real audio, data, emotion, real-time processing
- ✅ **🖼️ Advanced References (3.C)**: Role assignment, ControlNet, live previews
- ✅ **⚙️ Performance Management (3.E)**: Model switching, LoRA, VRAM monitoring
- ✅ **Intelligent Generation**: Hybrid modes with automatic fallback strategies
- ✅ **🖼️ Professional Gallery (3.D)**: Filtering, rating, annotation system
- ✅ **💾 Preset Management (3.D)**: Save/load complete configurations
- ✅ **📦 Export System (3.D)**: Complete bundles with metadata and reproducibility
**Professional Workflow:**
1. **Configure multimodal inputs** (text, audio, data, emotion, real-time)
2. **Upload and assign references** (style vs structure roles)
3. **Choose model and optimize performance** (SD 1.5/SDXL, LoRA, optimizations)
4. **Generate with intelligent fusion** (automatic mode selection)
5. **Review and annotate results** (gallery with rating/tagging)
6. **Save presets and export bundles** (complete reproducibility)
**Launch:** `python run_phase3_final_dashboard.py` | **Access:** `http://localhost:8506`
**See:** `docs/PHASE3_FINAL_DASHBOARD_GUIDE.md` for complete documentation.
---
## 🎯 **CompI Project Status: COMPLETE** ✅
**All planned phases are complete: CompI is a production-ready multimodal AI art generation platform.**
### **✅ All Phases Complete:**
- **✅ Phase 1**: Foundation (text-to-image, styling, evaluation, LoRA training)
- **✅ Phase 2**: Multimodal integration (audio, data, emotion, real-time, references)
- **✅ Phase 3**: Advanced features (fusion dashboard, advanced references, workflow management, performance optimization)
### **What CompI Offers:**
- **Complete Creative Platform**: From generation to professional workflow management
- **Production-Grade Reliability**: Robust error handling and performance optimization
- **Professional Tools**: Gallery, presets, annotations, and export bundles for serious creative and commercial work
- **Broad Hardware Support**: Runs on CUDA GPUs (recommended) or CPU-only machines
- **Extensible Foundation**: Ready for future enhancements and integrations

**CompI is ready for professional creative work!** 🎨✨
## 🎯 Core Features
- **Text Analysis**: Emotion detection and sentiment analysis
- **Image Generation**: Stable Diffusion integration with advanced conditioning
- **Audio Processing**: Music and sound analysis with Whisper integration
- **Data Processing**: CSV analysis and mathematical formula evaluation
- **Emotion Processing**: Preset emotions, custom emotions, emoji, and contextual analysis
- **Real-Time Integration**: Live weather, news, and financial data feeds
- **Style Reference**: Upload/URL image guidance with AI-powered style analysis
- **Multi-modal Fusion**: Combining text, audio, data, emotions, real-time feeds, and visual references
- **Pattern Recognition**: Automatic detection of trends, correlations, and seasonality
- **Poetic Interpretation**: Converting data patterns and emotions into artistic language
- **Color Psychology**: Emotion-based color palette generation and conditioning
- **Temporal Awareness**: Time-sensitive data processing and evolution tracking
## 🔧 Tech Stack
- **Deep Learning**: PyTorch, Transformers, Diffusers
- **Audio**: librosa, soundfile
- **UI**: Streamlit/Gradio
- **Data**: pandas, numpy
- **Visualization**: matplotlib, seaborn
## Usage
Coming soon - basic usage examples and API documentation.
## 🤝 Contributing
This is a development project. Feel free to experiment and extend functionality.
## License
MIT License - see LICENSE file for details.