---
title: CompI - Final Dashboard
emoji: 🎨
colorFrom: indigo
colorTo: purple
sdk: streamlit
app_file: src/ui/compi_phase3_final_dashboard.py
pinned: false
---

# CompI - Compositional Intelligence Project

A multi-modal AI system that generates creative content by combining text, images, audio, and emotional context.

Note: All documentation has been consolidated under docs/. See docs/README.md for an index of guides.

## 🌟 Project Overview

CompI (Compositional Intelligence) is designed to create rich, contextually-aware content by:

- Processing text prompts with emotional analysis
- Generating images using Stable Diffusion
- Creating audio compositions
- Combining multiple modalities for enhanced creative output

## 📁 Project Structure

```
Project CompI/
├── src/                  # Source code
│   ├── generators/       # Image generation modules
│   ├── models/           # Model implementations
│   ├── utils/            # Utility functions
│   ├── data/             # Data processing
│   ├── ui/               # User interface components
│   └── setup_env.py      # Environment setup script
├── notebooks/            # Jupyter notebooks for experimentation
├── data/                 # Dataset storage
├── outputs/              # Generated content
├── tests/                # Unit tests
├── run_*.py              # Convenience scripts for generators
├── requirements.txt      # Python dependencies
└── README.md             # This file
```

## 🛠️ Setup Instructions

### 1. Create Virtual Environment

```bash
# Using conda (recommended for ML projects)
conda create -n compi-env python=3.10 -y
conda activate compi-env

# OR using venv
python -m venv compi-env
# Windows
compi-env\Scripts\activate
# Linux/Mac
source compi-env/bin/activate
```

### 2. Install Dependencies

**For GPU users (recommended for faster generation):**

```bash
# First, check your CUDA version
nvidia-smi

# Install PyTorch with CUDA support first (replace cu121 with your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Then install the remaining requirements
pip install -r requirements.txt
```

**For CPU-only users:**

```bash
pip install -r requirements.txt
```

### 3. Test Installation

```bash
python src/test_setup.py
```

## 🚀 Quick Start

### Phase 1: Text-to-Image Generation

```bash
# Basic text-to-image generation
python run_basic_generation.py "A magical forest, digital art"

# Advanced generation with style conditioning
python run_advanced_styling.py "dragon in a crystal cave" --style "oil painting" --mood "dramatic"

# Interactive style selection
python run_styled_generation.py

# Quality evaluation and analysis
python run_evaluation.py

# Personal style training with LoRA
python run_lora_training.py --dataset-dir datasets/my_style

# Generate with personal style
python run_style_generation.py --lora-path lora_models/my_style/checkpoint-1000 "artwork in my_style"
```

### Phase 2.A: Audio-to-Image Generation 🎵

```bash
# Install audio processing dependencies
pip install openai-whisper

# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2a_streamlit_ui.py

# Command-line generation
python run_phase2a_audio_to_image.py --prompt "mystical forest" --audio "music.mp3"

# Interactive mode
python run_phase2a_audio_to_image.py --interactive

# Test installation
python src/test_phase2a.py

# Run examples
python examples/phase2a_audio_examples.py --example all
```

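
Conceptually, Phase 2.A turns audio features (tempo, energy, plus a Whisper transcript) into prompt modifiers. The sketch below is illustrative only: the thresholds and the tag table are hypothetical, not CompI's actual values.

```python
def audio_to_prompt_tags(tempo_bpm: float, rms_energy: float) -> list[str]:
    """Map basic audio features to descriptive prompt tags.

    Thresholds and tag wording are illustrative, not CompI's actual values.
    """
    tags = []
    if tempo_bpm >= 140:
        tags.append("frenetic, high-energy")
    elif tempo_bpm >= 90:
        tags.append("rhythmic, lively")
    else:
        tags.append("slow, contemplative")
    # Louder audio (higher RMS energy, roughly 0..1) suggests bolder imagery
    tags.append("bold, saturated colors" if rms_energy > 0.5 else "soft, muted palette")
    return tags

print(audio_to_prompt_tags(150.0, 0.7))
```

In the real pipeline, the tempo and energy values would come from librosa analysis and the transcript from Whisper; the tags would then be appended to the text prompt.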
### Phase 2.B: Data/Logic-to-Image Generation 📊

```bash
# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2b_streamlit_ui.py

# Command-line generation with CSV data
python run_phase2b_data_to_image.py --prompt "data visualization" --csv "data.csv"

# Mathematical formula generation
python run_phase2b_data_to_image.py --prompt "mathematical harmony" --formula "np.sin(np.linspace(0, 4*np.pi, 100))"

# Batch processing
python run_phase2b_data_to_image.py --batch-csv "data_folder/" --prompt "scientific patterns"

# Interactive mode
python run_phase2b_data_to_image.py --interactive
```

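
The `--formula` flag takes a NumPy expression as a string. One minimal way such a string could be evaluated into a data series is shown below; this is a sketch, and CompI's actual evaluation and sandboxing may differ.

```python
import numpy as np

def evaluate_formula(expr: str) -> np.ndarray:
    """Evaluate a NumPy expression string with only `np` in scope.

    Stripping builtins reduces (but does not eliminate) the risks of eval;
    formula input should still be treated as trusted.
    """
    return eval(expr, {"__builtins__": {}}, {"np": np})

series = evaluate_formula("np.sin(np.linspace(0, 4*np.pi, 100))")
print(series.shape)  # (100,)
```

The resulting series can then be summarized (range, trend, periodicity) and folded into the prompt.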
### Phase 2.C: Emotional/Contextual Input to Image Generation 💖

```bash
# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2c_streamlit_ui.py

# Command-line generation with preset emotion
python run_phase2c_emotion_to_image.py --prompt "mystical forest" --emotion "mysterious"

# Custom emotion generation
python run_phase2c_emotion_to_image.py --prompt "urban landscape" --emotion "🤩" --type custom

# Descriptive emotion generation
python run_phase2c_emotion_to_image.py --prompt "mountain vista" --emotion "I feel a sense of wonder" --type text

# Batch emotion processing
python run_phase2c_emotion_to_image.py --batch-emotions "joyful,sad,mysterious" --prompt "abstract art"

# Interactive mode
python run_phase2c_emotion_to_image.py --interactive
```

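
Each emotion ultimately conditions the prompt, including through color psychology. A hypothetical sketch of that mapping (the palette table below is invented for illustration and is not CompI's preset data):

```python
# Illustrative emotion-to-palette table; CompI's actual presets may differ.
EMOTION_PALETTES = {
    "joyful": ["warm yellow", "coral", "bright orange"],
    "sad": ["slate blue", "grey", "muted teal"],
    "mysterious": ["deep purple", "midnight blue", "silver"],
}

def condition_prompt(prompt: str, emotion: str) -> str:
    """Weave an emotion and its color palette into a text prompt."""
    palette = EMOTION_PALETTES.get(emotion.lower(), ["balanced natural tones"])
    return f"{prompt}, {emotion} mood, palette of {', '.join(palette)}"

print(condition_prompt("mystical forest", "mysterious"))
```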
### Phase 2.D: Real-Time Data Feeds to Image Generation 🌍

```bash
# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2d_streamlit_ui.py

# Command-line generation with weather data
python run_phase2d_realtime_to_image.py --prompt "cityscape" --weather --city "Tokyo"

# News-driven generation
python run_phase2d_realtime_to_image.py --prompt "abstract art" --news --category "technology"

# Multi-source generation
python run_phase2d_realtime_to_image.py --prompt "world state" --weather --news --financial

# Temporal series generation
python run_phase2d_realtime_to_image.py --prompt "evolving world" --weather --temporal "0,30,60"

# Interactive mode
python run_phase2d_realtime_to_image.py --interactive
```

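
The `--temporal "0,30,60"` option describes snapshots at minute offsets, so the same prompt can be rendered as the live data evolves. Parsing that spec might look like this (hypothetical helper, standard library only):

```python
from datetime import datetime, timedelta

def parse_temporal_offsets(spec: str, start: datetime) -> list[datetime]:
    """Turn a comma-separated list of minute offsets into concrete timestamps."""
    return [start + timedelta(minutes=int(m.strip())) for m in spec.split(",")]

start = datetime(2024, 1, 1, 12, 0)
for ts in parse_temporal_offsets("0,30,60", start):
    print(ts.isoformat())
```

Each timestamp would then trigger a fresh data fetch and a new generation, producing a time-evolving series.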
### Phase 2.E: Style Reference/Example Image to AI Art 🖼️

```bash
# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2e_streamlit_ui.py

# Command-line generation with reference image
python run_phase2e_refimg_to_image.py --prompt "magical forest" --reference "path/to/image.jpg" --strength 0.6

# Web URL reference
python run_phase2e_refimg_to_image.py --prompt "cyberpunk city" --reference "https://example.com/artwork.jpg"

# Batch generation with multiple variations
python run_phase2e_refimg_to_image.py --prompt "fantasy landscape" --reference "image.png" --num-images 3

# Style analysis only
python run_phase2e_refimg_to_image.py --analyze-only --reference "artwork.jpg"

# Interactive mode
python run_phase2e_refimg_to_image.py --interactive
```

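
The `--strength` flag follows the common img2img convention: it sets how much noise is layered over the reference, which in practice determines how many denoising steps actually repaint it. The sketch below mirrors typical diffusers behavior and is not CompI-specific code:

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate img2img scheduling: strength scales the denoising work.

    Strength near 0.0 keeps the reference almost unchanged;
    strength near 1.0 repaints it almost from scratch.
    """
    return min(int(num_inference_steps * strength), num_inference_steps)

print(effective_steps(50, 0.6))
```

So `--strength 0.6` with 50 scheduled steps spends roughly 30 of them re-denoising the reference image.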
## 🧪 NEW: Ultimate Multimodal Dashboard (True Fusion) 🚀

**Revolutionary upgrade with REAL processing of each input type!**

```bash
# Launch the upgraded dashboard with true multimodal fusion
python run_ultimate_multimodal_dashboard.py

# Or run directly
streamlit run src/ui/compi_ultimate_multimodal_dashboard.py --server.port 8503
```

**Key Improvements:**

- ✅ **Real Audio Analysis**: Whisper transcription + librosa features
- ✅ **Actual Data Processing**: CSV analysis + formula evaluation
- ✅ **True Emotion Analysis**: TextBlob sentiment classification
- ✅ **Live Real-Time Data**: Weather/news API integration
- ✅ **Advanced References**: img2img + ControlNet processing
- ✅ **Intelligent Fusion**: Actual content processing (not static keywords)

**Access at:** `http://localhost:8503`

**See:** `ULTIMATE_MULTIMODAL_DASHBOARD_README.md` for detailed documentation.

## 🖼️ NEW: Phase 3.C Advanced Reference Integration 🚀

**Professional multi-reference control with hybrid generation modes!**

**Key Features:**

- ✅ **Role-Based Reference Assignment**: Select images for style vs structure
- ✅ **Live ControlNet Previews**: Real-time Canny/Depth preprocessing
- ✅ **Hybrid Generation Modes**: Simultaneous ControlNet + img2img processing
- ✅ **Professional Controls**: Independent strength tuning for style/structure
- ✅ **Seamless Integration**: Works with all CompI multimodal phases

**See:** `PHASE3C_ADVANCED_REFERENCE_INTEGRATION.md` for complete documentation.

## 🎛️ NEW: Phase 3.D Professional Workflow Manager 🚀

**Complete creative workflow platform with unified logging, presets, and export bundles!**

**Key Features:**

- ✅ **Unified Run Logging**: Auto-ingests runs from all CompI phases
- ✅ **Professional Gallery**: Advanced filtering and search
- ✅ **Preset System**: Save/load complete generation configs
- ✅ **Export Bundles**: ZIP packages with metadata for reproducibility
- ✅ **Annotation System**: Ratings, tags, and notes for workflow management

**Launch:** `python run_phase3d_workflow_manager.py` | **Access:** `http://localhost:8504`

**See:** `docs/PHASE3D_WORKFLOW_MANAGER_GUIDE.md` for complete documentation.

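
Presets boil down to serialized generation configs. A minimal sketch of the save/load round trip (hypothetical schema and filename, standard library only):

```python
import json
import tempfile
from pathlib import Path

def save_preset(path: Path, config: dict) -> None:
    """Persist a generation config as pretty-printed JSON."""
    path.write_text(json.dumps(config, indent=2, sort_keys=True))

def load_preset(path: Path) -> dict:
    return json.loads(path.read_text())

# Hypothetical preset fields, for illustration only
preset = {"prompt": "abstract art", "model": "sd15", "steps": 30, "seed": 42}

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "my_preset.json"
    save_preset(p, preset)
    print(load_preset(p) == preset)  # True
```

An export bundle would package the same config alongside the generated images and run metadata in a ZIP.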
## ⚙️ NEW: Phase 3.E Performance, Model Management & Reliability 🚀

**Production-grade performance optimization, model switching, and intelligent reliability!**

**Key Features:**

- ✅ **Model Manager**: Dynamic SD 1.5 ↔ SDXL switching with automatic availability checking
- ✅ **LoRA Integration**: Universal LoRA loading with scale control across all models
- ✅ **Performance Controls**: xFormers, attention slicing, VAE optimizations, precision control
- ✅ **VRAM Monitoring**: Real-time GPU memory usage tracking and alerts
- ✅ **Reliability Engine**: OOM-safe auto-retry with intelligent fallbacks
- ✅ **Batch Processing**: Seed-controlled batch generation with memory management
- ✅ **Upscaler Integration**: Optional 2x latent upscaling for enhanced quality

**Launch:** `python run_phase3e_performance_manager.py` | **Access:** `http://localhost:8505`

**See:** `docs/PHASE3E_PERFORMANCE_GUIDE.md` for complete documentation.

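
The reliability engine's OOM-safe retry can be sketched generically: catch the out-of-memory error and step down a resolution ladder. This is illustrative logic under an assumed fallback ladder, not CompI's actual implementation:

```python
FALLBACK_WIDTHS = [768, 640, 512]  # hypothetical fallback ladder

def generate_with_fallback(generate, width: int):
    """Call generate(width); on OOM, retry at progressively smaller widths."""
    for w in [width] + [r for r in FALLBACK_WIDTHS if r < width]:
        try:
            return generate(w)
        except RuntimeError as e:
            if "out of memory" not in str(e).lower():
                raise  # only OOM errors trigger the fallback
    raise RuntimeError("all fallback resolutions exhausted")

# Simulated backend that only fits in memory at 640px or below
def fake_generate(w):
    if w > 640:
        raise RuntimeError("CUDA out of memory")
    return f"image@{w}"

print(generate_with_fallback(fake_generate, 1024))  # image@640
```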
## 🧪 ULTIMATE: Phase 3 Final Dashboard - Complete Integration! 🚀

**The ultimate CompI interface that integrates ALL Phase 3 components into one unified creative environment!**

**Complete Feature Integration:**

- ✅ **🧩 Multimodal Fusion (3.A/3.B)**: Real audio, data, emotion, real-time processing
- ✅ **🖼️ Advanced References (3.C)**: Role assignment, ControlNet, live previews
- ✅ **⚙️ Performance Management (3.E)**: Model switching, LoRA, VRAM monitoring
- ✅ **🎛️ Intelligent Generation**: Hybrid modes with automatic fallback strategies
- ✅ **🖼️ Professional Gallery (3.D)**: Filtering, rating, annotation system
- ✅ **💾 Preset Management (3.D)**: Save/load complete configurations
- ✅ **📦 Export System (3.D)**: Complete bundles with metadata and reproducibility

**Professional Workflow:**

1. **Configure multimodal inputs** (text, audio, data, emotion, real-time)
2. **Upload and assign references** (style vs structure roles)
3. **Choose model and optimize performance** (SD 1.5/SDXL, LoRA, optimizations)
4. **Generate with intelligent fusion** (automatic mode selection)
5. **Review and annotate results** (gallery with rating/tagging)
6. **Save presets and export bundles** (complete reproducibility)

**Launch:** `python run_phase3_final_dashboard.py` | **Access:** `http://localhost:8506`

**See:** `docs/PHASE3_FINAL_DASHBOARD_GUIDE.md` for complete documentation.

---

## 🎯 **CompI Project Status: COMPLETE** ✅

**CompI has achieved its vision: a comprehensive, production-ready multimodal AI art generation platform!**

### **✅ All Phases Complete:**

- **✅ Phase 1**: Foundation (text-to-image, styling, evaluation, LoRA training)
- **✅ Phase 2**: Multimodal integration (audio, data, emotion, real-time, references)
- **✅ Phase 3**: Advanced features (fusion dashboard, advanced references, workflow management, performance optimization)

### **🚀 What CompI Offers:**

- **Complete Creative Platform**: From generation to professional workflow management
- **Production-Grade Reliability**: Robust error handling and performance optimization
- **Professional Tools**: Industry-standard features for serious creative and commercial work
- **Universal Compatibility**: Works across different hardware configurations
- **Extensible Foundation**: Ready for future enhancements and integrations

**CompI is now a complete multimodal AI art generation platform, ready for professional creative work!** 🎨✨

## 🎯 Core Features

- **Text Analysis**: Emotion detection and sentiment analysis
- **Image Generation**: Stable Diffusion integration with advanced conditioning
- **Audio Processing**: Music and sound analysis with Whisper integration
- **Data Processing**: CSV analysis and mathematical formula evaluation
- **Emotion Processing**: Preset emotions, custom emotions, emoji, and contextual analysis
- **Real-Time Integration**: Live weather, news, and financial data feeds
- **Style Reference**: Upload/URL image guidance with AI-powered style analysis
- **Multi-modal Fusion**: Combining text, audio, data, emotions, real-time feeds, and visual references
- **Pattern Recognition**: Automatic detection of trends, correlations, and seasonality
- **Poetic Interpretation**: Converting data patterns and emotions into artistic language
- **Color Psychology**: Emotion-based color palette generation and conditioning
- **Temporal Awareness**: Time-sensitive data processing and evolution tracking

## 🔧 Tech Stack

- **Deep Learning**: PyTorch, Transformers, Diffusers
- **Audio**: librosa, soundfile
- **UI**: Streamlit/Gradio
- **Data**: pandas, numpy
- **Visualization**: matplotlib, seaborn

## 📝 Usage

Coming soon - basic usage examples and API documentation.

## 🤝 Contributing

This is a development project. Feel free to experiment and extend functionality.

## 📄 License

MIT License - see LICENSE file for details.
|