---
title: CompI - Final Dashboard
emoji: 🎨
colorFrom: indigo
colorTo: purple
sdk: streamlit
app_file: src/ui/compi_phase3_final_dashboard.py
pinned: false
---

# CompI - Compositional Intelligence Project

A multi-modal AI system that generates creative content by combining text, images, audio, and emotional context.

Note: All documentation has been consolidated under docs/. See docs/README.md for an index of guides.

## 🌟 Project Overview

CompI (Compositional Intelligence) is designed to create rich, contextually-aware content by:

- Processing text prompts with emotional analysis
- Generating images using Stable Diffusion
- Creating audio compositions
- Combining multiple modalities for enhanced creative output

## 📁 Project Structure

```
Project CompI/
├── src/                  # Source code
│   ├── generators/       # Image generation modules
│   ├── models/           # Model implementations
│   ├── utils/            # Utility functions
│   ├── data/             # Data processing
│   ├── ui/               # User interface components
│   └── setup_env.py      # Environment setup script
├── notebooks/            # Jupyter notebooks for experimentation
├── data/                 # Dataset storage
├── outputs/              # Generated content
├── tests/                # Unit tests
├── run_*.py              # Convenience scripts for generators
├── requirements.txt      # Python dependencies
└── README.md             # This file
```

## 🛠️ Setup Instructions

### 1. Create Virtual Environment

```bash
# Using conda (recommended for ML projects)
conda create -n compi-env python=3.10 -y
conda activate compi-env

# OR using venv
python -m venv compi-env
# Windows
compi-env\Scripts\activate
# Linux/Mac
source compi-env/bin/activate
```

### 2. Install Dependencies

**For GPU users (recommended for faster generation):**

```bash
# First, check your CUDA version
nvidia-smi

# Install PyTorch with CUDA support first (replace cu121 with your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Then install the remaining requirements
pip install -r requirements.txt
```

**For CPU-only users:**

```bash
pip install -r requirements.txt
```

### 3. Test Installation

```bash
python src/test_setup.py
```

## 🚀 Quick Start

### Phase 1: Text-to-Image Generation

```bash
# Basic text-to-image generation
python run_basic_generation.py "A magical forest, digital art"

# Advanced generation with style conditioning
python run_advanced_styling.py "dragon in a crystal cave" --style "oil painting" --mood "dramatic"

# Interactive style selection
python run_styled_generation.py

# Quality evaluation and analysis
python run_evaluation.py

# Personal style training with LoRA
python run_lora_training.py --dataset-dir datasets/my_style

# Generate with personal style
python run_style_generation.py --lora-path lora_models/my_style/checkpoint-1000 "artwork in my_style"
```

### Phase 2.A: Audio-to-Image Generation 🎵

```bash
# Install audio processing dependencies
pip install openai-whisper

# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2a_streamlit_ui.py

# Command-line generation
python run_phase2a_audio_to_image.py --prompt "mystical forest" --audio "music.mp3"

# Interactive mode
python run_phase2a_audio_to_image.py --interactive

# Test installation
python src/test_phase2a.py

# Run examples
python examples/phase2a_audio_examples.py --example all
```

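
Conceptually, Phase 2.A turns audio features (tempo, energy, plus a Whisper transcript) into prompt modifiers. The sketch below is illustrative only: the thresholds and the tag table are hypothetical, not CompI's actual values.

```python
def audio_to_prompt_tags(tempo_bpm: float, rms_energy: float) -> list[str]:
    """Map basic audio features to descriptive prompt tags.

    Thresholds and tag wording are illustrative, not CompI's actual values.
    """
    tags = []
    if tempo_bpm >= 140:
        tags.append("frenetic, high-energy")
    elif tempo_bpm >= 90:
        tags.append("rhythmic, lively")
    else:
        tags.append("slow, contemplative")
    # Louder audio (higher RMS energy, roughly 0..1) suggests bolder imagery
    tags.append("bold, saturated colors" if rms_energy > 0.5 else "soft, muted palette")
    return tags

print(audio_to_prompt_tags(150.0, 0.7))
```

In the real pipeline, the tempo and energy values would come from librosa analysis and the transcript from Whisper; the tags would then be appended to the text prompt.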
### Phase 2.B: Data/Logic-to-Image Generation 📊

```bash
# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2b_streamlit_ui.py

# Command-line generation with CSV data
python run_phase2b_data_to_image.py --prompt "data visualization" --csv "data.csv"

# Mathematical formula generation
python run_phase2b_data_to_image.py --prompt "mathematical harmony" --formula "np.sin(np.linspace(0, 4*np.pi, 100))"

# Batch processing
python run_phase2b_data_to_image.py --batch-csv "data_folder/" --prompt "scientific patterns"

# Interactive mode
python run_phase2b_data_to_image.py --interactive
```

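
The `--formula` flag takes a NumPy expression as a string. One minimal way such a string could be evaluated into a data series is shown below; this is a sketch, and CompI's actual evaluation and sandboxing may differ.

```python
import numpy as np

def evaluate_formula(expr: str) -> np.ndarray:
    """Evaluate a NumPy expression string with only `np` in scope.

    Stripping builtins reduces (but does not eliminate) the risks of eval;
    formula input should still be treated as trusted.
    """
    return eval(expr, {"__builtins__": {}}, {"np": np})

series = evaluate_formula("np.sin(np.linspace(0, 4*np.pi, 100))")
print(series.shape)  # (100,)
```

The resulting series can then be summarized (range, trend, periodicity) and folded into the prompt.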
### Phase 2.C: Emotional/Contextual Input to Image Generation 💖

```bash
# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2c_streamlit_ui.py

# Command-line generation with preset emotion
python run_phase2c_emotion_to_image.py --prompt "mystical forest" --emotion "mysterious"

# Custom emotion generation
python run_phase2c_emotion_to_image.py --prompt "urban landscape" --emotion "🤩" --type custom

# Descriptive emotion generation
python run_phase2c_emotion_to_image.py --prompt "mountain vista" --emotion "I feel a sense of wonder" --type text

# Batch emotion processing
python run_phase2c_emotion_to_image.py --batch-emotions "joyful,sad,mysterious" --prompt "abstract art"

# Interactive mode
python run_phase2c_emotion_to_image.py --interactive
```

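
Each emotion ultimately conditions the prompt, including through color psychology. A hypothetical sketch of that mapping (the palette table below is invented for illustration and is not CompI's preset data):

```python
# Illustrative emotion-to-palette table; CompI's actual presets may differ.
EMOTION_PALETTES = {
    "joyful": ["warm yellow", "coral", "bright orange"],
    "sad": ["slate blue", "grey", "muted teal"],
    "mysterious": ["deep purple", "midnight blue", "silver"],
}

def condition_prompt(prompt: str, emotion: str) -> str:
    """Weave an emotion and its color palette into a text prompt."""
    palette = EMOTION_PALETTES.get(emotion.lower(), ["balanced natural tones"])
    return f"{prompt}, {emotion} mood, palette of {', '.join(palette)}"

print(condition_prompt("mystical forest", "mysterious"))
```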
### Phase 2.D: Real-Time Data Feeds to Image Generation 🌍

```bash
# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2d_streamlit_ui.py

# Command-line generation with weather data
python run_phase2d_realtime_to_image.py --prompt "cityscape" --weather --city "Tokyo"

# News-driven generation
python run_phase2d_realtime_to_image.py --prompt "abstract art" --news --category "technology"

# Multi-source generation
python run_phase2d_realtime_to_image.py --prompt "world state" --weather --news --financial

# Temporal series generation
python run_phase2d_realtime_to_image.py --prompt "evolving world" --weather --temporal "0,30,60"

# Interactive mode
python run_phase2d_realtime_to_image.py --interactive
```

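
The `--temporal "0,30,60"` option describes snapshots at minute offsets, so the same prompt can be rendered as the live data evolves. Parsing that spec might look like this (hypothetical helper, standard library only):

```python
from datetime import datetime, timedelta

def parse_temporal_offsets(spec: str, start: datetime) -> list[datetime]:
    """Turn a comma-separated list of minute offsets into concrete timestamps."""
    return [start + timedelta(minutes=int(m.strip())) for m in spec.split(",")]

start = datetime(2024, 1, 1, 12, 0)
for ts in parse_temporal_offsets("0,30,60", start):
    print(ts.isoformat())
```

Each timestamp would then trigger a fresh data fetch and a new generation, producing a time-evolving series.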
### Phase 2.E: Style Reference/Example Image to AI Art 🖼️

```bash
# Streamlit UI (recommended)
streamlit run src/ui/compi_phase2e_streamlit_ui.py

# Command-line generation with reference image
python run_phase2e_refimg_to_image.py --prompt "magical forest" --reference "path/to/image.jpg" --strength 0.6

# Web URL reference
python run_phase2e_refimg_to_image.py --prompt "cyberpunk city" --reference "https://example.com/artwork.jpg"

# Batch generation with multiple variations
python run_phase2e_refimg_to_image.py --prompt "fantasy landscape" --reference "image.png" --num-images 3

# Style analysis only
python run_phase2e_refimg_to_image.py --analyze-only --reference "artwork.jpg"

# Interactive mode
python run_phase2e_refimg_to_image.py --interactive
```

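
The `--strength` flag follows the common img2img convention: it sets how much noise is layered over the reference, which in practice determines how many denoising steps actually repaint it. The sketch below mirrors typical diffusers behavior and is not CompI-specific code:

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate img2img scheduling: strength scales the denoising work.

    Strength near 0.0 keeps the reference almost unchanged;
    strength near 1.0 repaints it almost from scratch.
    """
    return min(int(num_inference_steps * strength), num_inference_steps)

print(effective_steps(50, 0.6))
```

So `--strength 0.6` with 50 scheduled steps spends roughly 30 of them re-denoising the reference image.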
## 🧪 NEW: Ultimate Multimodal Dashboard (True Fusion) 🚀

**Revolutionary upgrade with REAL processing of each input type!**

```bash
# Launch the upgraded dashboard with true multimodal fusion
python run_ultimate_multimodal_dashboard.py

# Or run directly
streamlit run src/ui/compi_ultimate_multimodal_dashboard.py --server.port 8503
```

**Key Improvements:**

- ✅ **Real Audio Analysis**: Whisper transcription + librosa features
- ✅ **Actual Data Processing**: CSV analysis + formula evaluation
- ✅ **True Emotion Analysis**: TextBlob sentiment classification
- ✅ **Live Real-Time Data**: Weather/news API integration
- ✅ **Advanced References**: img2img + ControlNet processing
- ✅ **Intelligent Fusion**: Actual content processing (not static keywords)

**Access at:** `http://localhost:8503`

**See:** `ULTIMATE_MULTIMODAL_DASHBOARD_README.md` for detailed documentation.

## 🖼️ NEW: Phase 3.C Advanced Reference Integration 🚀

**Professional multi-reference control with hybrid generation modes!**

**Key Features:**

- ✅ **Role-Based Reference Assignment**: Select images for style vs structure
- ✅ **Live ControlNet Previews**: Real-time Canny/Depth preprocessing
- ✅ **Hybrid Generation Modes**: Simultaneous ControlNet + img2img processing
- ✅ **Professional Controls**: Independent strength tuning for style/structure
- ✅ **Seamless Integration**: Works with all CompI multimodal phases

**See:** `PHASE3C_ADVANCED_REFERENCE_INTEGRATION.md` for complete documentation.

## 🎛️ NEW: Phase 3.D Professional Workflow Manager 🚀

**Complete creative workflow platform with unified logging, presets, and export bundles!**

**Key Features:**

- ✅ **Unified Run Logging**: Auto-ingests runs from all CompI phases
- ✅ **Professional Gallery**: Advanced filtering and search
- ✅ **Preset System**: Save/load complete generation configs
- ✅ **Export Bundles**: ZIP packages with metadata for reproducibility
- ✅ **Annotation System**: Ratings, tags, and notes for workflow management

**Launch:** `python run_phase3d_workflow_manager.py` | **Access:** `http://localhost:8504`

**See:** `docs/PHASE3D_WORKFLOW_MANAGER_GUIDE.md` for complete documentation.

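
Presets boil down to serialized generation configs. A minimal sketch of the save/load round trip (hypothetical schema and filename, standard library only):

```python
import json
import tempfile
from pathlib import Path

def save_preset(path: Path, config: dict) -> None:
    """Persist a generation config as pretty-printed JSON."""
    path.write_text(json.dumps(config, indent=2, sort_keys=True))

def load_preset(path: Path) -> dict:
    return json.loads(path.read_text())

# Hypothetical preset fields, for illustration only
preset = {"prompt": "abstract art", "model": "sd15", "steps": 30, "seed": 42}

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "my_preset.json"
    save_preset(p, preset)
    print(load_preset(p) == preset)  # True
```

An export bundle would package the same config alongside the generated images and run metadata in a ZIP.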
## ⚙️ NEW: Phase 3.E Performance, Model Management & Reliability 🚀

**Production-grade performance optimization, model switching, and intelligent reliability!**

**Key Features:**

- ✅ **Model Manager**: Dynamic SD 1.5 ↔ SDXL switching with automatic availability checking
- ✅ **LoRA Integration**: Universal LoRA loading with scale control across all models
- ✅ **Performance Controls**: xFormers, attention slicing, VAE optimizations, precision control
- ✅ **VRAM Monitoring**: Real-time GPU memory usage tracking and alerts
- ✅ **Reliability Engine**: OOM-safe auto-retry with intelligent fallbacks
- ✅ **Batch Processing**: Seed-controlled batch generation with memory management
- ✅ **Upscaler Integration**: Optional 2x latent upscaling for enhanced quality

**Launch:** `python run_phase3e_performance_manager.py` | **Access:** `http://localhost:8505`

**See:** `docs/PHASE3E_PERFORMANCE_GUIDE.md` for complete documentation.

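
The reliability engine's OOM-safe retry can be sketched generically: catch the out-of-memory error and step down a resolution ladder. This is illustrative logic under an assumed fallback ladder, not CompI's actual implementation:

```python
FALLBACK_WIDTHS = [768, 640, 512]  # hypothetical fallback ladder

def generate_with_fallback(generate, width: int):
    """Call generate(width); on OOM, retry at progressively smaller widths."""
    for w in [width] + [r for r in FALLBACK_WIDTHS if r < width]:
        try:
            return generate(w)
        except RuntimeError as e:
            if "out of memory" not in str(e).lower():
                raise  # only OOM errors trigger the fallback
    raise RuntimeError("all fallback resolutions exhausted")

# Simulated backend that only fits in memory at 640px or below
def fake_generate(w):
    if w > 640:
        raise RuntimeError("CUDA out of memory")
    return f"image@{w}"

print(generate_with_fallback(fake_generate, 1024))  # image@640
```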
## 🧪 ULTIMATE: Phase 3 Final Dashboard - Complete Integration! 🚀

**The ultimate CompI interface that integrates ALL Phase 3 components into one unified creative environment!**

**Complete Feature Integration:**

- ✅ **🧩 Multimodal Fusion (3.A/3.B)**: Real audio, data, emotion, real-time processing
- ✅ **🖼️ Advanced References (3.C)**: Role assignment, ControlNet, live previews
- ✅ **⚙️ Performance Management (3.E)**: Model switching, LoRA, VRAM monitoring
- ✅ **🎛️ Intelligent Generation**: Hybrid modes with automatic fallback strategies
- ✅ **🖼️ Professional Gallery (3.D)**: Filtering, rating, annotation system
- ✅ **💾 Preset Management (3.D)**: Save/load complete configurations
- ✅ **📦 Export System (3.D)**: Complete bundles with metadata and reproducibility

**Professional Workflow:**

1. **Configure multimodal inputs** (text, audio, data, emotion, real-time)
2. **Upload and assign references** (style vs structure roles)
3. **Choose model and optimize performance** (SD 1.5/SDXL, LoRA, optimizations)
4. **Generate with intelligent fusion** (automatic mode selection)
5. **Review and annotate results** (gallery with rating/tagging)
6. **Save presets and export bundles** (complete reproducibility)

**Launch:** `python run_phase3_final_dashboard.py` | **Access:** `http://localhost:8506`

**See:** `docs/PHASE3_FINAL_DASHBOARD_GUIDE.md` for complete documentation.

---

## 🎯 **CompI Project Status: COMPLETE** ✅

**CompI has achieved its vision: a comprehensive, production-ready multimodal AI art generation platform!**

### **✅ All Phases Complete:**

- **✅ Phase 1**: Foundation (text-to-image, styling, evaluation, LoRA training)
- **✅ Phase 2**: Multimodal integration (audio, data, emotion, real-time, references)
- **✅ Phase 3**: Advanced features (fusion dashboard, advanced references, workflow management, performance optimization)

### **🚀 What CompI Offers:**

- **Complete Creative Platform**: From generation to professional workflow management
- **Production-Grade Reliability**: Robust error handling and performance optimization
- **Professional Tools**: Industry-standard features for serious creative and commercial work
- **Universal Compatibility**: Works across different hardware configurations
- **Extensible Foundation**: Ready for future enhancements and integrations

**CompI is now a complete multimodal AI art generation platform, ready for professional creative work!** 🎨✨

## 🎯 Core Features

- **Text Analysis**: Emotion detection and sentiment analysis
- **Image Generation**: Stable Diffusion integration with advanced conditioning
- **Audio Processing**: Music and sound analysis with Whisper integration
- **Data Processing**: CSV analysis and mathematical formula evaluation
- **Emotion Processing**: Preset emotions, custom emotions, emoji, and contextual analysis
- **Real-Time Integration**: Live weather, news, and financial data feeds
- **Style Reference**: Upload/URL image guidance with AI-powered style analysis
- **Multi-modal Fusion**: Combining text, audio, data, emotions, real-time feeds, and visual references
- **Pattern Recognition**: Automatic detection of trends, correlations, and seasonality
- **Poetic Interpretation**: Converting data patterns and emotions into artistic language
- **Color Psychology**: Emotion-based color palette generation and conditioning
- **Temporal Awareness**: Time-sensitive data processing and evolution tracking

## 🔧 Tech Stack

- **Deep Learning**: PyTorch, Transformers, Diffusers
- **Audio**: librosa, soundfile
- **UI**: Streamlit/Gradio
- **Data**: pandas, numpy
- **Visualization**: matplotlib, seaborn

## 📝 Usage

Coming soon - basic usage examples and API documentation.

## 🤝 Contributing

This is a development project. Feel free to experiment and extend functionality.

## 📄 License

MIT License - see LICENSE file for details.
|