---
# Space metadata for Hugging Face
# This tells the Space which SDK and entry file to run
# Safe to keep at top of README; ignored by GitHub rendering
# (Hugging Face parses the YAML front‑matter)

title: CompI - Final Dashboard
emoji: 🎨
colorFrom: indigo
colorTo: purple
sdk: streamlit
app_file: src/ui/compi_phase3_final_dashboard.py
pinned: false
---

# CompI - Compositional Intelligence Project

A multi-modal AI system that generates creative content by combining text, images, audio, and emotional context.

Note: All documentation has been consolidated under docs/. See docs/README.md for an index of guides.

## 🚀 Project Overview

CompI (Compositional Intelligence) is designed to create rich, contextually-aware content by:

- Processing text prompts with emotional analysis
- Generating images using Stable Diffusion
- Creating audio compositions
- Combining multiple modalities for enhanced creative output

## 📁 Project Structure

```
Project CompI/
├── src/                    # Source code
│   ├── generators/        # Image generation modules
│   ├── models/            # Model implementations
│   ├── utils/             # Utility functions
│   ├── data/              # Data processing
│   ├── ui/                # User interface components
│   └── setup_env.py       # Environment setup script
├── notebooks/             # Jupyter notebooks for experimentation
├── data/                  # Dataset storage
├── outputs/               # Generated content
├── tests/                 # Unit tests
├── run_*.py               # Convenience scripts for generators
├── requirements.txt       # Python dependencies
└── README.md             # This file
```

## 🛠️ Setup Instructions

### 1. Create Virtual Environment

```bash
# Using conda (recommended for ML projects)
conda create -n compi-env python=3.10 -y
conda activate compi-env

# OR using venv
python -m venv compi-env
# Windows
compi-env\Scripts\activate
# Linux/Mac
source compi-env/bin/activate
```

### 2. Install Dependencies

**For GPU users (recommended for faster generation):**

```bash
# First, check your CUDA version
nvidia-smi

# Install PyTorch with CUDA support first (replace cu121 with your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Then install remaining requirements
pip install -r requirements.txt
```

**For CPU-only users:**

```bash
pip install -r requirements.txt
```

### 3. Test Installation

```bash
python src/test_setup.py
```
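
The contents of `src/test_setup.py` are not shown in this README. As a rough, hypothetical stand-in, the sketch below checks whether the packages named in the Tech Stack section can be found in the active environment. The package list and both function names are assumptions; adjust them to match `requirements.txt`.

```python
from importlib.util import find_spec

# Import names taken from the Tech Stack section of this README (an assumption;
# the authoritative list is requirements.txt).
PACKAGES = [
    "torch", "transformers", "diffusers", "librosa", "soundfile",
    "streamlit", "pandas", "numpy", "matplotlib", "seaborn",
]


def check_packages(names):
    """Return {name: bool}, True when the top-level package can be located."""
    return {name: find_spec(name) is not None for name in names}


def missing_packages(names):
    """Return the sorted names of packages that could not be located."""
    return sorted(name for name, found in check_packages(names).items() if not found)


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

This only confirms that the packages are installed; it does not load any model or verify GPU support.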

## 🚀 Quick Start

### Phase 1: Text-to-Image Generation

```bash
# Basic text-to-image generation
python run_basic_generation.py "A magical forest, digital art"

# Advanced generation with style conditioning
python run_advanced_styling.py "dragon in a crystal cave" --style "oil painting" --mood "dramatic"

# Interactive style selection
python run_styled_generation.py

# Quality evaluation and analysis
python run_evaluation.py

# Personal style training with LoRA
python run_lora_training.py --dataset-dir datasets/my_style

# Generate with personal style
python run_style_generation.py --lora-path lora_models/my_style/checkpoint-1000 "artwork in my_style"
```

### Phase 2.A: Audio-to-Image Generation 🎵

```bash
# Install audio processing dependencies
pip install openai-whisper

# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2a_streamlit_ui.py

# Command line generation
python run_phase2a_audio_to_image.py --prompt "mystical forest" --audio "music.mp3"

# Interactive mode
python run_phase2a_audio_to_image.py --interactive

# Test installation
python src/test_phase2a.py

# Run examples
python examples/phase2a_audio_examples.py --example all
```

### Phase 2.B: Data/Logic-to-Image Generation 📊

```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2b_streamlit_ui.py

# Command line generation with CSV data
python run_phase2b_data_to_image.py --prompt "data visualization" --csv "data.csv"

# Mathematical formula generation
python run_phase2b_data_to_image.py --prompt "mathematical harmony" --formula "np.sin(np.linspace(0, 4*np.pi, 100))"

# Batch processing
python run_phase2b_data_to_image.py --batch-csv "data_folder/" --prompt "scientific patterns"

# Interactive mode
python run_phase2b_data_to_image.py --interactive
```
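
How `run_phase2b_data_to_image.py` evaluates the `--formula` string is not documented here. One cautious approach, sketched below under that caveat, is to parse the expression, reject any syntax outside a small whitelist, and only then evaluate it against an explicit namespace. The function name `evaluate_formula` and the whitelist are hypothetical, and this reduces risk rather than providing a true sandbox.

```python
import ast
import math

# Syntax permitted in a formula: names, attribute access, calls,
# numeric constants, and basic arithmetic. Everything else is rejected.
ALLOWED_NODES = (
    ast.Expression, ast.Call, ast.Attribute, ast.Name, ast.Load,
    ast.Constant, ast.BinOp, ast.UnaryOp,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.Mod,
    ast.USub, ast.UAdd,
)


def evaluate_formula(formula, namespace):
    """Evaluate an arithmetic expression using only the names in `namespace`."""
    tree = ast.parse(formula, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Name) and node.id not in namespace:
            raise ValueError(f"unknown name: {node.id}")
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            raise ValueError("private attributes are not allowed")
    # Built-ins are removed so the expression can only reach the given names.
    return eval(compile(tree, "<formula>", "eval"), {"__builtins__": {}}, dict(namespace))


if __name__ == "__main__":
    print(evaluate_formula("math.sin(math.pi / 2) * 2", {"math": math}))
```

With NumPy installed, passing `{"np": numpy}` as the namespace would let the example formula above, `np.sin(np.linspace(0, 4*np.pi, 100))`, pass the same checks.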

### Phase 2.C: Emotional/Contextual Input to Image Generation 🌀

```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2c_streamlit_ui.py

# Command line generation with preset emotion
python run_phase2c_emotion_to_image.py --prompt "mystical forest" --emotion "mysterious"

# Custom emotion generation
python run_phase2c_emotion_to_image.py --prompt "urban landscape" --emotion "🤩" --type custom

# Descriptive emotion generation
python run_phase2c_emotion_to_image.py --prompt "mountain vista" --emotion "I feel a sense of wonder" --type text

# Batch emotion processing
python run_phase2c_emotion_to_image.py --batch-emotions "joyful,sad,mysterious" --prompt "abstract art"

# Interactive mode
python run_phase2c_emotion_to_image.py --interactive
```
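
As a minimal illustration of emotion conditioning, the sketch below splits a `--batch-emotions` style string and appends mood descriptors to a base prompt. It is not the project's implementation: the `EMOTION_STYLES` table, its descriptors and hex colors, and both function names are hypothetical.

```python
# Hypothetical emotion table; descriptors and palettes are illustrative only.
EMOTION_STYLES = {
    "joyful": {"descriptors": ["bright", "uplifting"], "palette": ["#FFD93D", "#FF8C42"]},
    "sad": {"descriptors": ["muted", "melancholic"], "palette": ["#4A6FA5", "#2E4057"]},
    "mysterious": {"descriptors": ["shadowy", "enigmatic"], "palette": ["#3D2C8D", "#1C0C5B"]},
}


def parse_batch_emotions(arg):
    """Split a comma-separated emotion string, dropping blanks and normalising case."""
    return [part.strip().lower() for part in arg.split(",") if part.strip()]


def build_prompt(prompt, emotion):
    """Append mood descriptors for a known emotion, or the raw emotion text otherwise."""
    style = EMOTION_STYLES.get(emotion)
    if style is None:
        return f"{prompt}, {emotion} mood"
    return f"{prompt}, {', '.join(style['descriptors'])} mood"


if __name__ == "__main__":
    for emotion in parse_batch_emotions("joyful,sad,mysterious"):
        print(build_prompt("abstract art", emotion))
```

The palette entries are unused by `build_prompt`; they indicate where emotion-based color conditioning could attach to the same lookup.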

### Phase 2.D: Real-Time Data Feeds to Image Generation 🌎

```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2d_streamlit_ui.py

# Command line generation with weather data
python run_phase2d_realtime_to_image.py --prompt "cityscape" --weather --city "Tokyo"

# News-driven generation
python run_phase2d_realtime_to_image.py --prompt "abstract art" --news --category "technology"

# Multi-source generation
python run_phase2d_realtime_to_image.py --prompt "world state" --weather --news --financial

# Temporal series generation
python run_phase2d_realtime_to_image.py --prompt "evolving world" --weather --temporal "0,30,60"

# Interactive mode
python run_phase2d_realtime_to_image.py --interactive
```

### Phase 2.E: Style Reference/Example Image to AI Art 🖼️

```bash
# Streamlit UI (Recommended)
streamlit run src/ui/compi_phase2e_streamlit_ui.py

# Command line generation with reference image
python run_phase2e_refimg_to_image.py --prompt "magical forest" --reference "path/to/image.jpg" --strength 0.6

# Web URL reference
python run_phase2e_refimg_to_image.py --prompt "cyberpunk city" --reference "https://example.com/artwork.jpg"

# Batch generation with multiple variations
python run_phase2e_refimg_to_image.py --prompt "fantasy landscape" --reference "image.png" --num-images 3

# Style analysis only
python run_phase2e_refimg_to_image.py --analyze-only --reference "artwork.jpg"

# Interactive mode
python run_phase2e_refimg_to_image.py --interactive
```

## 🧪 NEW: Ultimate Multimodal Dashboard (True Fusion) 🚀

**Revolutionary upgrade with REAL processing of each input type!**

```bash
# Launch the upgraded dashboard with true multimodal fusion
python run_ultimate_multimodal_dashboard.py

# Or run directly
streamlit run src/ui/compi_ultimate_multimodal_dashboard.py --server.port 8503
```

**Key Improvements:**

- ✅ **Real Audio Analysis**: Whisper transcription + librosa features
- ✅ **Actual Data Processing**: CSV analysis + formula evaluation
- ✅ **True Emotion Analysis**: TextBlob sentiment classification
- ✅ **Live Real-time Data**: Weather/news API integration
- ✅ **Advanced References**: img2img + ControlNet processing
- ✅ **Intelligent Fusion**: Actual content processing (not static keywords)

**Access at:** `http://localhost:8503`

**See:** `ULTIMATE_MULTIMODAL_DASHBOARD_README.md` for detailed documentation.

## 🖼️ NEW: Phase 3.C Advanced Reference Integration 🚀

**Professional multi-reference control with hybrid generation modes!**

**Key Features:**

- ✅ **Role-Based Reference Assignment**: Select images for style vs structure
- ✅ **Live ControlNet Previews**: Real-time Canny/Depth preprocessing
- ✅ **Hybrid Generation Modes**: CN + IMG2IMG simultaneous processing
- ✅ **Professional Controls**: Independent strength tuning for style/structure
- ✅ **Seamless Integration**: Works with all CompI multimodal phases

**See:** `PHASE3C_ADVANCED_REFERENCE_INTEGRATION.md` for complete documentation.

## 🗂️ NEW: Phase 3.D Professional Workflow Manager 🚀

**Complete creative workflow platform with unified logging, presets, and export bundles!**

**Key Features:**

- ✅ **Unified Run Logging**: Auto-ingests from all CompI phases
- ✅ **Professional Gallery**: Advanced filtering and search
- ✅ **Preset System**: Save/load complete generation configs
- ✅ **Export Bundles**: ZIP packages with metadata and reproducibility
- ✅ **Annotation System**: Ratings, tags, and notes for workflow management
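
To make the export-bundle idea concrete, here is a minimal sketch of a ZIP package that stores images alongside a metadata manifest. The layout (`metadata.json`, a per-file SHA-256 map) and the function name `export_bundle` are assumptions for illustration, not the format the workflow manager actually writes.

```python
import hashlib
import io
import json
import zipfile


def export_bundle(target, images, metadata):
    """Write images plus a metadata.json manifest into a ZIP archive.

    target   -- a file path or a writable binary file object
    images   -- dict mapping archive file names to raw image bytes
    metadata -- JSON-serialisable dict of generation settings (prompt, seed, ...)
    """
    manifest = dict(metadata)
    # Record a SHA-256 digest per file so the bundle contents can be verified later.
    manifest["files"] = {
        name: hashlib.sha256(data).hexdigest() for name, data in images.items()
    }
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for name, data in images.items():
            bundle.writestr(name, data)
        bundle.writestr("metadata.json", json.dumps(manifest, indent=2, sort_keys=True))


if __name__ == "__main__":
    buffer = io.BytesIO()
    export_bundle(
        buffer,
        {"image_0.png": b"placeholder bytes, not a real PNG"},
        {"prompt": "magical forest", "seed": 42},
    )
    print(len(buffer.getvalue()), "bytes written")
```

Keeping the prompt, seed, and file digests in one manifest is what would let a later run reproduce or verify the bundle.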

**Launch:** `python run_phase3d_workflow_manager.py` | **Access:** `http://localhost:8504`

**See:** `docs/PHASE3D_WORKFLOW_MANAGER_GUIDE.md` for complete documentation.

## ⚙️ NEW: Phase 3.E Performance, Model Management & Reliability 🚀

**Production-grade performance optimization, model switching, and intelligent reliability!**

**Key Features:**

- ✅ **Model Manager**: Dynamic SD 1.5 ↔ SDXL switching with auto-availability checking
- ✅ **LoRA Integration**: Universal LoRA loading with scale control across all models
- ✅ **Performance Controls**: xFormers, attention slicing, VAE optimizations, precision control
- ✅ **VRAM Monitoring**: Real-time GPU memory usage tracking and alerts
- ✅ **Reliability Engine**: OOM-safe auto-retry with intelligent fallbacks
- ✅ **Batch Processing**: Seed-controlled batch generation with memory management
- ✅ **Upscaler Integration**: Optional 2x latent upscaling for enhanced quality
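
The OOM-safe retry idea can be sketched as a loop that halves the requested resolution whenever generation runs out of memory. This is an assumption about the general technique, not the Reliability Engine's actual logic: `generate_with_fallback`, the `generate` callable, and the halving policy are all hypothetical, and `MemoryError` stands in for a GPU out-of-memory exception.

```python
def generate_with_fallback(generate, width, height, min_side=256, oom_errors=(MemoryError,)):
    """Call generate(width, height), halving the size after each out-of-memory error.

    Returns (result, attempts), where attempts lists every size that was tried.
    Re-raises the last error once the next size would fall below min_side.
    """
    attempts = []
    while True:
        attempts.append((width, height))
        try:
            return generate(width, height), attempts
        except oom_errors:
            if min(width, height) // 2 < min_side:
                raise  # nothing smaller left to try
            width, height = width // 2, height // 2


if __name__ == "__main__":
    def fake_generate(width, height):
        # Stand-in for a diffusion pipeline call: pretend large sizes exhaust memory.
        if width > 512:
            raise MemoryError("simulated out-of-memory")
        return f"image {width}x{height}"

    print(generate_with_fallback(fake_generate, 1024, 1024))
```

With PyTorch, one would presumably pass the CUDA out-of-memory exception type via `oom_errors` and free cached GPU memory before each retry; other fallbacks, such as enabling attention slicing or lowering precision, could be tried before reducing resolution.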

**Launch:** `python run_phase3e_performance_manager.py` | **Access:** `http://localhost:8505`

**See:** `docs/PHASE3E_PERFORMANCE_GUIDE.md` for complete documentation.

## 🧪 ULTIMATE: Phase 3 Final Dashboard - Complete Integration! 🎉

**The ultimate CompI interface that integrates ALL Phase 3 components into one unified creative environment!**

**Complete Feature Integration:**

- ✅ **🧩 Multimodal Fusion (3.A/3.B)**: Real audio, data, emotion, real-time processing
- ✅ **🖼️ Advanced References (3.C)**: Role assignment, ControlNet, live previews
- ✅ **⚙️ Performance Management (3.E)**: Model switching, LoRA, VRAM monitoring
- ✅ **🎛️ Intelligent Generation**: Hybrid modes with automatic fallback strategies
- ✅ **🖼️ Professional Gallery (3.D)**: Filtering, rating, annotation system
- ✅ **💾 Preset Management (3.D)**: Save/load complete configurations
- ✅ **📦 Export System (3.D)**: Complete bundles with metadata and reproducibility

**Professional Workflow:**

1. **Configure multimodal inputs** (text, audio, data, emotion, real-time)
2. **Upload and assign references** (style vs structure roles)
3. **Choose model and optimize performance** (SD 1.5/SDXL, LoRA, optimizations)
4. **Generate with intelligent fusion** (automatic mode selection)
5. **Review and annotate results** (gallery with rating/tagging)
6. **Save presets and export bundles** (complete reproducibility)

**Launch:** `python run_phase3_final_dashboard.py` | **Access:** `http://localhost:8506`

**See:** `docs/PHASE3_FINAL_DASHBOARD_GUIDE.md` for complete documentation.

---

## 🎯 **CompI Project Status: COMPLETE** ✅

**CompI has achieved its ultimate vision: the world's most comprehensive and production-ready multimodal AI art generation platform!**

### **✅ All Phases Complete:**

- **✅ Phase 1**: Foundation (text-to-image, styling, evaluation, LoRA training)
- **✅ Phase 2**: Multimodal integration (audio, data, emotion, real-time, references)
- **✅ Phase 3**: Advanced features (fusion dashboard, advanced references, workflow management, performance optimization)

### **🚀 What CompI Offers:**

- **Complete Creative Platform**: From generation to professional workflow management
- **Production-Grade Reliability**: Robust error handling and performance optimization
- **Professional Tools**: Industry-standard features for serious creative and commercial work
- **Universal Compatibility**: Works across different hardware configurations
- **Extensible Foundation**: Ready for future enhancements and integrations

**CompI is now the ultimate multimodal AI art generation platform - ready for professional creative work!** 🎨✨

## 🎯 Core Features

- **Text Analysis**: Emotion detection and sentiment analysis
- **Image Generation**: Stable Diffusion integration with advanced conditioning
- **Audio Processing**: Music and sound analysis with Whisper integration
- **Data Processing**: CSV analysis and mathematical formula evaluation
- **Emotion Processing**: Preset emotions, custom emotions, emoji, and contextual analysis
- **Real-Time Integration**: Live weather, news, and financial data feeds
- **Style Reference**: Upload/URL image guidance with AI-powered style analysis
- **Multi-modal Fusion**: Combining text, audio, data, emotions, real-time feeds, and visual references
- **Pattern Recognition**: Automatic detection of trends, correlations, and seasonality
- **Poetic Interpretation**: Converting data patterns and emotions into artistic language
- **Color Psychology**: Emotion-based color palette generation and conditioning
- **Temporal Awareness**: Time-sensitive data processing and evolution tracking

## 🔧 Tech Stack

- **Deep Learning**: PyTorch, Transformers, Diffusers
- **Audio**: librosa, soundfile
- **UI**: Streamlit/Gradio
- **Data**: pandas, numpy
- **Visualization**: matplotlib, seaborn

## 📝 Usage

See the Quick Start section above for command-line usage. API documentation is coming soon.

## 🀝 Contributing

This is a development project. Feel free to experiment and extend functionality.

## 📄 License

MIT License - see LICENSE file for details.
