---
# Space metadata for Hugging Face
# This tells the Space which SDK and entry file to run
# Safe to keep at top of README; ignored by GitHub rendering
# (Hugging Face parses the YAML front‑matter)

title: CompI — Final Dashboard
emoji: 🎨
colorFrom: indigo
colorTo: purple
sdk: streamlit
app_file: src/ui/compi_phase3_final_dashboard.py
pinned: false
---

# CompI - Compositional Intelligence Project

A multi-modal AI system that generates creative content by combining text, images, audio, and emotional context.

Note: All documentation has been consolidated under docs/. See docs/README.md for an index of guides.

## 🚀 Project Overview

CompI (Compositional Intelligence) is designed to create rich, context-aware content by:

- Processing text prompts with emotional analysis
- Generating images using Stable Diffusion
- Creating audio compositions
- Combining multiple modalities for enhanced creative output

## 📁 Project Structure

```
Project CompI/
├── src/                    # Source code
│   ├── generators/        # Image generation modules
│   ├── models/            # Model implementations
│   ├── utils/             # Utility functions
│   ├── data/              # Data processing
│   ├── ui/                # User interface components
│   └── setup_env.py       # Environment setup script
├── notebooks/             # Jupyter notebooks for experimentation
├── data/                  # Dataset storage
├── outputs/               # Generated content
├── tests/                 # Unit tests
├── run_*.py               # Convenience scripts for generators
├── requirements.txt       # Python dependencies
└── README.md             # This file
```

## 🛠️ Setup Instructions

### 1. Create Virtual Environment

```bash
# Using conda (recommended for ML projects)
conda create -n compi-env python=3.10 -y
conda activate compi-env

# OR using venv
python -m venv compi-env
# Windows
compi-env\Scripts\activate
# Linux/Mac
source compi-env/bin/activate
```

### 2. Install Dependencies

**For GPU users (recommended for faster generation):**

```bash
# First, check your CUDA version
nvidia-smi

# Install PyTorch with CUDA support first (replace cu121 with your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Then install remaining requirements
pip install -r requirements.txt
```

**For CPU-only users:**

```bash
pip install -r requirements.txt
```
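
To confirm which device PyTorch will actually use after either install path, a small check like the one below can help. This is a minimal sketch and not part of the CompI scripts; `describe_torch_device` is a hypothetical helper name, and it only relies on `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
import importlib.util


def describe_torch_device() -> str:
    """Report which device PyTorch would use, without failing if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch  # imported lazily so the check itself never raises ImportError

    if torch.cuda.is_available():
        # Name of the first visible GPU
        return f"cuda ({torch.cuda.get_device_name(0)})"
    return "cpu"


if __name__ == "__main__":
    print(describe_torch_device())
```

If this reports `cpu` on a machine with an NVIDIA GPU, the CUDA build of PyTorch was probably not installed, and the `--index-url` step above is worth repeating.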

### 3. Test Installation

```bash
python src/test_setup.py
```

## 🚀 Quick Start

### Phase 1: Text-to-Image Generation

```bash
# Basic text-to-image generation
python run_basic_generation.py "A magical forest, digital art"

# Advanced generation with style conditioning
python run_advanced_styling.py "dragon in a crystal cave" --style "oil painting" --mood "dramatic"

# Interactive style selection
python run_styled_generation.py

# Quality evaluation and analysis
python run_evaluation.py

# Personal style training with LoRA
python run_lora_training.py --dataset-dir datasets/my_style

# Generate with personal style
python run_style_generation.py --lora-path lora_models/my_style/checkpoint-1000 "artwork in my_style"
```
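
The `--style` and `--mood` flags above suggest that the subject, style, and mood end up in one prompt. How `run_advanced_styling.py` combines them is not documented here, so the following is only an illustrative sketch with a hypothetical `build_prompt` helper that joins the non-empty parts with commas.

```python
def build_prompt(subject: str, style: str = "", mood: str = "") -> str:
    """Join the non-empty parts into one comma-separated prompt string.

    Hypothetical helper: the real script may weight or order the parts differently.
    """
    parts = [subject.strip(), style.strip(), mood.strip()]
    return ", ".join(p for p in parts if p)


print(build_prompt("dragon in a crystal cave", style="oil painting", mood="dramatic"))
# dragon in a crystal cave, oil painting, dramatic
```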

### Phase 2.A: Audio-to-Image Generation 🎵

```bash
# Install audio processing dependencies
pip install openai-whisper

# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2a_streamlit_ui.py

# Command line generation
python run_phase2a_audio_to_image.py --prompt "mystical forest" --audio "music.mp3"

# Interactive mode
python run_phase2a_audio_to_image.py --interactive

# Test installation
python src/test_phase2a.py

# Run examples
python examples/phase2a_audio_examples.py --example all
```

### Phase 2.B: Data/Logic-to-Image Generation 📊

```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2b_streamlit_ui.py

# Command line generation with CSV data
python run_phase2b_data_to_image.py --prompt "data visualization" --csv "data.csv"

# Mathematical formula generation
python run_phase2b_data_to_image.py --prompt "mathematical harmony" --formula "np.sin(np.linspace(0, 4*np.pi, 100))"

# Batch processing
python run_phase2b_data_to_image.py --batch-csv "data_folder/" --prompt "scientific patterns"

# Interactive mode
python run_phase2b_data_to_image.py --interactive
```
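
As a rough illustration of how numeric data could steer a prompt, the sketch below reads one CSV column and turns it into trend and texture words. It uses only the standard library; the function names and the 5% / 50% thresholds are assumptions made for this example, not the logic of `run_phase2b_data_to_image.py`.

```python
import csv
import io
import statistics


def read_column(csv_text: str, column: str) -> list[float]:
    """Parse one numeric column out of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader]


def describe_series(values: list[float]) -> str:
    """Summarize a series as trend and texture words for a prompt."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    half = len(values) // 2
    first = statistics.fmean(values[:half])
    second = statistics.fmean(values[half:])
    scale = max(abs(statistics.fmean(values)), 1e-9)
    # Hypothetical thresholds: a 5% shift counts as a trend, a 50% spread as turbulent.
    if abs(second - first) <= 0.05 * scale:
        trend = "steady"
    else:
        trend = "rising" if second > first else "falling"
    texture = "turbulent" if statistics.pstdev(values) > 0.5 * scale else "smooth"
    return f"{trend} and {texture}"


csv_text = "day,value\n1,10\n2,11\n3,12\n4,13\n5,14\n6,15\n"
values = read_column(csv_text, "value")
print(f"data visualization, {describe_series(values)} patterns")
# data visualization, rising and smooth patterns
```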

### Phase 2.C: Emotional/Contextual Input to Image Generation 🌀

```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2c_streamlit_ui.py

# Command line generation with preset emotion
python run_phase2c_emotion_to_image.py --prompt "mystical forest" --emotion "mysterious"

# Custom emotion generation
python run_phase2c_emotion_to_image.py --prompt "urban landscape" --emotion "🤩" --type custom

# Descriptive emotion generation
python run_phase2c_emotion_to_image.py --prompt "mountain vista" --emotion "I feel a sense of wonder" --type text

# Batch emotion processing
python run_phase2c_emotion_to_image.py --batch-emotions "joyful,sad,mysterious" --prompt "abstract art"

# Interactive mode
python run_phase2c_emotion_to_image.py --interactive
```
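
The batch form above takes a comma-separated emotion list and one prompt. A minimal sketch of that expansion follows, assuming each preset emotion maps to a small color palette; the palette table and both function names are invented for illustration and do not reflect CompI's actual presets.

```python
# Hypothetical emotion-to-palette table; CompI's real presets may differ.
EMOTION_PALETTES = {
    "joyful": ["warm yellow", "bright orange"],
    "sad": ["muted blue", "grey"],
    "mysterious": ["deep violet", "midnight blue"],
}


def condition_prompt(prompt: str, emotion: str) -> str:
    """Append mood and palette hints for a known emotion; pass unknown ones through."""
    key = emotion.strip().lower()
    colors = EMOTION_PALETTES.get(key)
    if not colors:
        return prompt
    return f"{prompt}, {key} mood, palette of {' and '.join(colors)}"


def batch_prompts(prompt: str, emotions_csv: str) -> list[str]:
    """Expand 'joyful,sad,...' into one conditioned prompt per emotion."""
    return [condition_prompt(prompt, e) for e in emotions_csv.split(",") if e.strip()]


for p in batch_prompts("abstract art", "joyful,sad,mysterious"):
    print(p)
```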

### Phase 2.D: Real-Time Data Feeds to Image Generation 🌎

```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2d_streamlit_ui.py

# Command line generation with weather data
python run_phase2d_realtime_to_image.py --prompt "cityscape" --weather --city "Tokyo"

# News-driven generation
python run_phase2d_realtime_to_image.py --prompt "abstract art" --news --category "technology"

# Multi-source generation
python run_phase2d_realtime_to_image.py --prompt "world state" --weather --news --financial

# Temporal series generation
python run_phase2d_realtime_to_image.py --prompt "evolving world" --weather --temporal "0,30,60"

# Interactive mode
python run_phase2d_realtime_to_image.py --interactive
```

### Phase 2.E: Style Reference/Example Image to AI Art 🖼️

```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2e_streamlit_ui.py

# Command line generation with reference image
python run_phase2e_refimg_to_image.py --prompt "magical forest" --reference "path/to/image.jpg" --strength 0.6

# Web URL reference
python run_phase2e_refimg_to_image.py --prompt "cyberpunk city" --reference "https://example.com/artwork.jpg"

# Batch generation with multiple variations
python run_phase2e_refimg_to_image.py --prompt "fantasy landscape" --reference "image.png" --num-images 3

# Style analysis only
python run_phase2e_refimg_to_image.py --analyze-only --reference "artwork.jpg"

# Interactive mode
python run_phase2e_refimg_to_image.py --interactive
```
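
If `--strength` is passed through to a Diffusers img2img pipeline (an assumption; the script's internals are not shown here), it controls how much of the denoising schedule is actually run on the reference image. Diffusers computes the number of executed steps as `min(int(num_inference_steps * strength), num_inference_steps)`, which the sketch below reproduces with a hypothetical `effective_steps` helper.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps an img2img run executes for a given strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


# With 50 scheduled steps, strength 0.6 runs 30 of them,
# so the output stays closer to the reference than at strength 1.0.
for strength in (0.0, 0.5, 0.6, 1.0):
    print(strength, effective_steps(50, strength))
```

Lower values keep more of the reference image; values near 1.0 behave almost like plain text-to-image generation.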

## 🧪 NEW: Ultimate Multimodal Dashboard (True Fusion) 🚀

**An upgraded dashboard that processes the actual content of each input type instead of mapping it to static keywords.**

```bash
# Launch the upgraded dashboard with true multimodal fusion
python run_ultimate_multimodal_dashboard.py

# Or run directly
streamlit run src/ui/compi_ultimate_multimodal_dashboard.py --server.port 8503
```

**Key Improvements:**

- ✅ **Real Audio Analysis**: Whisper transcription + librosa features
- ✅ **Actual Data Processing**: CSV analysis + formula evaluation
- ✅ **True Emotion Analysis**: TextBlob sentiment classification
- ✅ **Live Real-time Data**: Weather/news API integration
- ✅ **Advanced References**: img2img + ControlNet processing
- ✅ **Intelligent Fusion**: Actual content processing (not static keywords)

**Access at:** `http://localhost:8503`

**See:** `ULTIMATE_MULTIMODAL_DASHBOARD_README.md` for detailed documentation.

## 🖼️ NEW: Phase 3.C Advanced Reference Integration 🚀

**Professional multi-reference control with hybrid generation modes!**

**Key Features:**

- ✅ **Role-Based Reference Assignment**: Select images for style vs structure
- ✅ **Live ControlNet Previews**: Real-time Canny/Depth preprocessing
- ✅ **Hybrid Generation Modes**: ControlNet + img2img simultaneous processing
- ✅ **Professional Controls**: Independent strength tuning for style/structure
- ✅ **Seamless Integration**: Works with all CompI multimodal phases

**See:** `PHASE3C_ADVANCED_REFERENCE_INTEGRATION.md` for complete documentation.

## 🗂️ NEW: Phase 3.D Professional Workflow Manager 🚀

**Complete creative workflow platform with unified logging, presets, and export bundles!**

**Key Features:**

- ✅ **Unified Run Logging**: Auto-ingests from all CompI phases
- ✅ **Professional Gallery**: Advanced filtering and search
- ✅ **Preset System**: Save/load complete generation configs
- ✅ **Export Bundles**: ZIP packages with metadata and reproducibility
- ✅ **Annotation System**: Ratings, tags, and notes for workflow management
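
To make the export-bundle idea concrete, here is a minimal sketch of a ZIP that stores an image next to the metadata needed to reproduce it. The file names (`image.png`, `metadata.json`), the metadata keys, and the `build_bundle` helper are assumptions for illustration; the Workflow Manager's real bundle layout is described in its guide.

```python
import io
import json
import zipfile


def build_bundle(image_bytes: bytes, metadata: dict) -> bytes:
    """Pack an image and its generation metadata into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("image.png", image_bytes)
        # Sorted keys keep the metadata file stable across exports.
        zf.writestr("metadata.json", json.dumps(metadata, indent=2, sort_keys=True))
    return buffer.getvalue()


bundle = build_bundle(
    b"placeholder image bytes",  # stand-in for real PNG data
    {"prompt": "magical forest", "seed": 42, "model": "SD 1.5"},
)
with zipfile.ZipFile(io.BytesIO(bundle)) as zf:
    print(sorted(zf.namelist()))
```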

**Launch:** `python run_phase3d_workflow_manager.py` | **Access:** `http://localhost:8504`

**See:** `docs/PHASE3D_WORKFLOW_MANAGER_GUIDE.md` for complete documentation.

## ⚙️ NEW: Phase 3.E Performance, Model Management & Reliability 🚀

**Production-grade performance optimization, model switching, and intelligent reliability!**

**Key Features:**

- ✅ **Model Manager**: Dynamic SD 1.5 ↔ SDXL switching with auto-availability checking
- ✅ **LoRA Integration**: Universal LoRA loading with scale control across all models
- ✅ **Performance Controls**: xFormers, attention slicing, VAE optimizations, precision control
- ✅ **VRAM Monitoring**: Real-time GPU memory usage tracking and alerts
- ✅ **Reliability Engine**: OOM-safe auto-retry with intelligent fallbacks
- ✅ **Batch Processing**: Seed-controlled batch generation with memory management
- ✅ **Upscaler Integration**: Optional 2x latent upscaling for enhanced quality
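
The OOM-safe retry idea can be sketched as a loop over progressively smaller settings. The example below is a simplified stand-in, not the Reliability Engine itself: `generate_with_fallback` and `fake_generate` are hypothetical, and Python's built-in `MemoryError` stands in for the GPU out-of-memory error (with PyTorch you would more likely catch `torch.cuda.OutOfMemoryError`).

```python
from typing import Callable, Sequence, Tuple


def generate_with_fallback(
    generate: Callable[[int, int], str],
    sizes: Sequence[Tuple[int, int]],
) -> Tuple[str, Tuple[int, int]]:
    """Try each (width, height) in order and return the first result that fits."""
    last_error = None
    for width, height in sizes:
        try:
            return generate(width, height), (width, height)
        except MemoryError as err:  # stand-in for a GPU out-of-memory error
            last_error = err
    raise RuntimeError("all fallback sizes ran out of memory") from last_error


def fake_generate(width: int, height: int) -> str:
    """Pretend generator that only has enough memory for 512x512 or smaller."""
    if width * height > 512 * 512:
        raise MemoryError(f"{width}x{height} does not fit")
    return f"image {width}x{height}"


result, size = generate_with_fallback(
    fake_generate, [(1024, 1024), (768, 768), (512, 512)]
)
print(result, size)
# image 512x512 (512, 512)
```

Ordering the sizes from largest to smallest means the best quality that fits in memory is the one returned.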

**Launch:** `python run_phase3e_performance_manager.py` | **Access:** `http://localhost:8505`

**See:** `docs/PHASE3E_PERFORMANCE_GUIDE.md` for complete documentation.

## 🧪 ULTIMATE: Phase 3 Final Dashboard - Complete Integration! 🎉

**The ultimate CompI interface that integrates ALL Phase 3 components into one unified creative environment!**

**Complete Feature Integration:**

- ✅ **🧩 Multimodal Fusion (3.A/3.B)**: Real audio, data, emotion, real-time processing
- ✅ **🖼️ Advanced References (3.C)**: Role assignment, ControlNet, live previews
- ✅ **⚙️ Performance Management (3.E)**: Model switching, LoRA, VRAM monitoring
- ✅ **🎛️ Intelligent Generation**: Hybrid modes with automatic fallback strategies
- ✅ **🖼️ Professional Gallery (3.D)**: Filtering, rating, annotation system
- ✅ **💾 Preset Management (3.D)**: Save/load complete configurations
- ✅ **📦 Export System (3.D)**: Complete bundles with metadata and reproducibility

**Professional Workflow:**

1. **Configure multimodal inputs** (text, audio, data, emotion, real-time)
2. **Upload and assign references** (style vs structure roles)
3. **Choose model and optimize performance** (SD 1.5/SDXL, LoRA, optimizations)
4. **Generate with intelligent fusion** (automatic mode selection)
5. **Review and annotate results** (gallery with rating/tagging)
6. **Save presets and export bundles** (complete reproducibility)

**Launch:** `python run_phase3_final_dashboard.py` | **Access:** `http://localhost:8506`

**See:** `docs/PHASE3_FINAL_DASHBOARD_GUIDE.md` for complete documentation.

---

## 🎯 **CompI Project Status: COMPLETE** ✅

**All planned phases are implemented: CompI covers multimodal AI art generation from input processing through workflow management and performance tuning.**

### **✅ All Phases Complete:**

- **✅ Phase 1**: Foundation (text-to-image, styling, evaluation, LoRA training)
- **✅ Phase 2**: Multimodal integration (audio, data, emotion, real-time, references)
- **✅ Phase 3**: Advanced features (fusion dashboard, advanced references, workflow management, performance optimization)

### **🚀 What CompI Offers:**

- **Complete Creative Platform**: From generation to professional workflow management
- **Production-Grade Reliability**: Robust error handling and performance optimization
- **Professional Tools**: Industry-standard features for serious creative and commercial work
- **Hardware Flexibility**: Runs on CUDA GPUs or CPU-only setups (see Setup Instructions)
- **Extensible Foundation**: Ready for future enhancements and integrations

**CompI now provides a complete multimodal AI art generation platform, ready for professional creative work.** 🎨✨

## 🎯 Core Features

- **Text Analysis**: Emotion detection and sentiment analysis
- **Image Generation**: Stable Diffusion integration with advanced conditioning
- **Audio Processing**: Music and sound analysis with Whisper integration
- **Data Processing**: CSV analysis and mathematical formula evaluation
- **Emotion Processing**: Preset emotions, custom emotions, emoji, and contextual analysis
- **Real-Time Integration**: Live weather, news, and financial data feeds
- **Style Reference**: Upload/URL image guidance with AI-powered style analysis
- **Multi-modal Fusion**: Combining text, audio, data, emotions, real-time feeds, and visual references
- **Pattern Recognition**: Automatic detection of trends, correlations, and seasonality
- **Poetic Interpretation**: Converting data patterns and emotions into artistic language
- **Color Psychology**: Emotion-based color palette generation and conditioning
- **Temporal Awareness**: Time-sensitive data processing and evolution tracking

## 🔧 Tech Stack

- **Deep Learning**: PyTorch, Transformers, Diffusers
- **Audio**: librosa, soundfile
- **UI**: Streamlit/Gradio
- **Data**: pandas, numpy
- **Visualization**: matplotlib, seaborn

## 📝 Usage

See Quick Start above for command-line examples. API documentation is coming soon.

## 🤝 Contributing

This is a development project. Feel free to experiment and extend functionality.

## 📄 License

MIT License - see LICENSE file for details.
