# Architecture Overview

## System Design Philosophy

Cora is built on three core principles:

1. **Graceful Degradation**: Never fail completely; always serve a visual result
2. **RAG over Fine-Tuning**: Use museum archives to provide context without costly training
3. **Hybrid Intelligence**: Combine AI generation with curated historical data

---

## Component Architecture

### Layer 1: Interface

- **UI (Gradio)**: `ui.py` - Testing/demo interface
- **Etymology API (FastAPI)**: `etymology_api.py` - Production integration endpoint

### Layer 2: Generation Pipeline

```
CoraCurator → CoraEngine → CoraVision → CoraMemory
   (LLM)        (SDXL)       (CLIP)      (ChromaDB)
```

### Layer 3: Data Sources

- **Primary**: Hugging Face Inference API (SDXL-Lightning)
- **Fallback**: Museum Archives (Smithsonian + Met)

---

## Data Flow

### Generation Request Flow

```
1. User Request
   ↓
2. Curator: Refine prompt with LLM
   ↓
3. Engine: Attempt SDXL generation
   ├─ Success → Continue to step 4
   └─ 402 Error → RAG Fallback
                  ↓
                  Search Memory by embedding
                  ↓
                  Return museum artifact
   ↓
4. Vision: Generate embedding + tags
   ↓
5. Memory: Archive for future retrieval
   ↓
6. Response: Image URL + metadata
```

### Ingestion Flow (Museums)

```
1. Loader (smithsonian_loader.py or met_loader.py)
   ↓
2. API Query → Download images
   ↓
3. Vision: Generate embedding + detect tags
   ↓
4. Memory: Index with metadata
   ↓
5. Persistent storage in ChromaDB
```

---

## Search Strategy

### Hybrid Search Algorithm

**Input:** Query text (e.g., "roman armor")

**Process:**

1. **Text → Vector**: CLIP text encoder
2. **Keyword Detection**: Extract cultural markers ("roman", "greek", etc.)
3. **Over-Retrieve**: Fetch 3x candidates via semantic search
4. **Filter**: Apply tag constraints (must contain "roman")
5. **Rank**: Return top-k filtered results

**Advantage:** Prevents irrelevant matches (e.g., "roman" in "Roman Catholic art")

---

## Model Details

### CoraCurator (LLM)

- **Model**: `meta-llama/Llama-3.2-3B-Instruct`
- **Purpose**: Prompt refinement
- **System Instruction**: Guide toward "Daily Life" or "Epic Dimension" scenes
- **Context**: Etymology → Visual description

### CoraEngine (Image Gen)

- **Primary Model**: `ByteDance/SDXL-Lightning`
- **Params**: `guidance_scale=0.0`, `steps=4`
- **Style**: Historical Illustration / Strategy Game Art
- **Fallback**: RAG → Museum artifacts

### CoraVision (Embeddings)

- **CLIP Model**: `sentence-transformers/clip-ViT-L-14`
- **Output**: 768-dimensional vectors
- **YOLO**: `yolov8n.pt` for object detection/tagging

### CoraMemory (Vector DB)

- **Database**: ChromaDB (persistent, local)
- **Storage**: `./archive_db`
- **Metadata Schema**:
  - `path`: Local file path
  - `prompt`: Original search query
  - `tags`: Comma-separated (e.g., "roman,armor,met_museum_open_access")
  - `timestamp`: ISO format

---

## API Design

### Etymology API Endpoints

#### POST `/api/v1/generate_illustration`

**Purpose**: Single endpoint for full pipeline

**Design Decisions**:

- Returns both `image_url` and `image_base64` (flexibility)
- Includes `source` field ("generated" vs "archive")
- Auto-archives all results for future retrieval
- CORS-enabled for cross-origin integration

#### GET `/api/v1/search_archive`

**Purpose**: Direct access to historical artifacts

**Use Case**: Browse mode in etymology app

#### GET `/health`

**Purpose**: Monitor component status

**Returns**:

```json
{
  "status": "healthy",
  "components": {
    "engine": true,
    "curator": true,
    "vision": true,
    "memory": true
  }
}
```

---

## Scaling Considerations

### Current Constraints

- **Single Instance**: No load balancing
- **Local Storage**: ChromaDB in-process
- **API Limits**: HF free tier (402 errors common)

### Future Optimizations

1. **Archive Curator (Priority)**: Intelligent system to manage and curate the museum archive
   - **Auto-Tagging**: Enhance metadata with historical period, culture, object type
   - **Quality Scoring**: Rate artifact relevance for different etymology contexts
   - **Deduplication**: Detect and merge similar artifacts
   - **Smart Indexing**: Organize by historical timeline, geography, theme
   - **Active Curation**: Suggest best artifacts for specific words/contexts
   - **Gap Analysis**: Identify missing periods/cultures and trigger targeted ingestion
2. **Caching**: Hash etymology text → serve cached images
3. **Queue System**: Celery for async generation
4. **CDN**: Serve `archive_images/` via CloudFront/similar
5. **Model Hosting**: Self-host SDXL on GPU server to avoid 402 errors

---

## Security Notes

### API Keys

- Stored in `.env` (gitignored)
- Never exposed in responses or logs

### CORS

- Currently set to `allow_origins=["*"]` for development
- **Production**: Restrict to etymology app domain

### Static Files

- `archive_images/` served directly via FastAPI
- No authentication (museum artifacts are public domain)
- Consider rate limiting for public deployments
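
One way to handle the production CORS restriction noted in this section is to read the allowed origins from configuration instead of hard-coding them. The following is a minimal sketch, not the project's actual code: the environment variable name `CORA_ALLOWED_ORIGINS` and the helper `cors_settings` are hypothetical, and the method and header lists are illustrative assumptions. The returned keys match keyword arguments that FastAPI's `CORSMiddleware` accepts.

```python
import os
from typing import Mapping

# Hypothetical variable name; the original document does not specify one.
ORIGINS_ENV_VAR = "CORA_ALLOWED_ORIGINS"


def cors_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for FastAPI's CORSMiddleware.

    Reads a comma-separated list of origins from the environment.
    If the variable is unset or empty, falls back to the permissive
    development default of ["*"].
    """
    raw = env.get(ORIGINS_ENV_VAR, "")
    # Drop blank entries and surrounding whitespace.
    origins = [origin.strip() for origin in raw.split(",") if origin.strip()]
    return {
        "allow_origins": origins or ["*"],
        "allow_methods": ["GET", "POST"],  # assumed: the API exposes only these
        "allow_headers": ["*"],
    }


# Assumed wiring inside etymology_api.py (requires FastAPI):
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(CORSMiddleware, **cors_settings())
```

With this approach, a development machine needs no extra setup, while a production deployment sets the variable to the etymology app's origin (for example `https://etymology.example`, a placeholder domain) so that the wildcard is never used there.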