# Architecture Overview
## System Design Philosophy
Cora is built on three core principles:
1. **Graceful Degradation**: Never fail completely; always serve a visual result
2. **RAG over Fine-Tuning**: Use museum archives to provide context without costly training
3. **Hybrid Intelligence**: Combine AI generation with curated historical data
---
## Component Architecture
### Layer 1: Interface
- **UI (Gradio)**: `ui.py` - Testing/demo interface
- **Etymology API (FastAPI)**: `etymology_api.py` - Production integration endpoint
### Layer 2: Generation Pipeline
```
CoraCurator → CoraEngine → CoraVision → CoraMemory
   (LLM)        (SDXL)       (CLIP)     (ChromaDB)
```
### Layer 3: Data Sources
- **Primary**: Hugging Face Inference API (SDXL-Lightning)
- **Fallback**: Museum Archives (Smithsonian + Met)
---
## Data Flow
### Generation Request Flow
```
1. User Request
   ↓
2. Curator: Refine prompt with LLM
   ↓
3. Engine: Attempt SDXL generation
   ├─ Success → Continue to step 4
   └─ 402 Error → RAG Fallback
                  ↓
                  Search Memory by embedding
                  ↓
                  Return museum artifact
   ↓
4. Vision: Generate embedding + tags
   ↓
5. Memory: Archive for future retrieval
   ↓
6. Response: Image URL + metadata
```
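
The request flow above can be sketched as one function. This is a minimal illustration under stated assumptions: `QuotaExceeded`, `Illustration`, `generate_illustration`, and the callables passed in are hypothetical stand-ins for the Curator, Engine, Vision, and Memory components, not the project's actual interfaces. Re-indexing archive hits is also an assumption, based on the API note that all results are auto-archived.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class QuotaExceeded(Exception):
    """Hypothetical error raised when the inference API answers HTTP 402."""


@dataclass
class Illustration:
    image_path: str
    source: str  # "generated" or "archive"


def generate_illustration(
    request: str,
    refine: Callable[[str], str],                    # Curator (step 2)
    generate: Callable[[str], str],                  # Engine (step 3), may raise QuotaExceeded
    search_archive: Callable[[str], Sequence[str]],  # Memory lookup used by the fallback
    archive: Callable[[str, str], None],             # Vision + Memory (steps 4-5)
) -> Illustration:
    prompt = refine(request)
    try:
        result = Illustration(generate(prompt), source="generated")
    except QuotaExceeded:
        # RAG fallback: serve the closest museum artifact instead of failing.
        hits = search_archive(prompt)
        if not hits:
            raise LookupError("no archived artifact matches the prompt")
        result = Illustration(hits[0], source="archive")
    # Assumption: every result is embedded, tagged, and indexed for later retrieval.
    archive(result.image_path, prompt)
    return result
```

Passing the components in as callables keeps the fallback logic testable without network access or model weights.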
### Ingestion Flow (Museums)
```
1. Loader (smithsonian_loader.py or met_loader.py)
   ↓
2. API Query → Download images
   ↓
3. Vision: Generate embedding + detect tags
   ↓
4. Memory: Index with metadata
   ↓
5. Persistent storage in ChromaDB
```
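
A rough sketch of steps 3 to 5 of the ingestion flow, assuming the loader has already downloaded the images. The function name `ingest`, the record keys (`path`, `query`, `source`), and the callable signatures are hypothetical; the real loaders may be organized differently.

```python
from typing import Callable, Iterable, Sequence


def ingest(
    records: Iterable[dict],
    embed: Callable[[str], Sequence[float]],
    detect_tags: Callable[[str], Iterable[str]],
    index: Callable[[str, Sequence[float], dict], None],
) -> int:
    """Embed, tag, and index each downloaded image; return how many were indexed."""
    count = 0
    for rec in records:
        path = rec["path"]
        # Combine detected tags with a source tag such as "met_museum_open_access".
        tags = set(detect_tags(path)) | {rec["source"]}
        metadata = {"path": path, "prompt": rec["query"], "tags": ",".join(sorted(tags))}
        index(path, embed(path), metadata)
        count += 1
    return count
```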
---
## Search Strategy
### Hybrid Search Algorithm
**Input:** Query text (e.g., "roman armor")
**Process:**
1. **Text → Vector**: CLIP text encoder
2. **Keyword Detection**: Extract cultural markers ("roman", "greek", etc.)
3. **Over-Retrieve**: Fetch 3x candidates via semantic search
4. **Filter**: Apply tag constraints (must contain "roman")
5. **Rank**: Return top-k filtered results
**Advantage:** Prevents semantically similar but culturally irrelevant matches (e.g., a "roman armor" query matching on "roman" in "Roman Catholic art")
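
The process above can be illustrated in plain Python. This is a sketch, not the project's implementation: `ArchiveItem`, `hybrid_search`, and the `CULTURAL_MARKERS` set are hypothetical, and a brute-force cosine ranking stands in for the ChromaDB query.

```python
import math
from dataclasses import dataclass

# Illustrative marker list; the real vocabulary is not given in this document.
CULTURAL_MARKERS = {"roman", "greek", "egyptian"}


@dataclass
class ArchiveItem:
    path: str
    embedding: list[float]
    tags: set[str]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_text, query_vec, items, k=3, over_retrieve=3):
    # Step 2: detect cultural markers in the query text.
    required = {w for w in query_text.lower().split() if w in CULTURAL_MARKERS}
    # Step 3: over-retrieve k * over_retrieve candidates by similarity.
    ranked = sorted(items, key=lambda it: cosine(query_vec, it.embedding), reverse=True)
    candidates = ranked[: k * over_retrieve]
    # Step 4: keep only candidates that carry every required tag.
    filtered = [it for it in candidates if required <= it.tags]
    # Step 5: return the top-k survivors, still in similarity order.
    return filtered[:k]
```

Over-retrieving before filtering matters because the filter can discard most of the nearest neighbors; fetching only `k` candidates would often leave fewer than `k` results.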
---
## Model Details
### CoraCurator (LLM)
- **Model**: `meta-llama/Llama-3.2-3B-Instruct`
- **Purpose**: Prompt refinement
- **System Instruction**: Guide toward "Daily Life" or "Epic Dimension" scenes
- **Context**: Etymology → Visual description
### CoraEngine (Image Gen)
- **Primary Model**: `ByteDance/SDXL-Lightning`
- **Params**: `guidance_scale=0.0`, `steps=4`
- **Style**: Historical Illustration / Strategy Game Art
- **Fallback**: RAG → Museum artifacts
### CoraVision (Embeddings)
- **CLIP Model**: `sentence-transformers/clip-ViT-L-14`
- **Output**: 768-dimensional vectors
- **YOLO**: `yolov8n.pt` for object detection/tagging
### CoraMemory (Vector DB)
- **Database**: ChromaDB (persistent, local)
- **Storage**: `./archive_db`
- **Metadata Schema**:
  - `path`: Local file path
  - `prompt`: Original search query
  - `tags`: Comma-separated (e.g., "roman,armor,met_museum_open_access")
  - `timestamp`: ISO format
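
A small sketch of how a record matching this schema might be built and read back. The helpers `build_metadata` and `parse_tags` are hypothetical, and lowercasing, deduplicating, and sorting the tags are choices made for this example rather than documented behavior. Keeping every value a plain string fits a flat metadata store, which is likely why tags are comma-separated.

```python
from datetime import datetime, timezone


def build_metadata(path: str, prompt: str, tags: list[str]) -> dict[str, str]:
    """Build a flat, string-only record with the four schema fields."""
    cleaned = sorted({t.strip().lower() for t in tags if t.strip()})
    return {
        "path": path,
        "prompt": prompt,
        "tags": ",".join(cleaned),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def parse_tags(metadata: dict[str, str]) -> set[str]:
    """Split the comma-separated tag string back into a set."""
    return {t for t in metadata.get("tags", "").split(",") if t}
```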
---
## API Design
### Etymology API Endpoints
#### POST `/api/v1/generate_illustration`
**Purpose**: Single endpoint for full pipeline
**Design Decisions**:
- Returns both `image_url` and `image_base64` (flexibility)
- Includes `source` field ("generated" vs "archive")
- Auto-archives all results for future retrieval
- CORS-enabled for cross-origin integration
#### GET `/api/v1/search_archive`
**Purpose**: Direct access to historical artifacts
**Use Case**: Browse mode in etymology app
#### GET `/health`
**Purpose**: Monitor component status
**Returns**:
```json
{
  "status": "healthy",
  "components": {
    "engine": true,
    "curator": true,
    "vision": true,
    "memory": true
  }
}
```
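
One way such a payload could be assembled from per-component flags is sketched below. The function `health_payload` is hypothetical, and the `"degraded"` status is an assumption; this document only shows the `"healthy"` case.

```python
def health_payload(components: dict[str, bool]) -> dict:
    """Summarize component flags in the response shape shown above."""
    all_up = all(components.values())
    return {
        # "degraded" is an assumed label for the case where any component is down.
        "status": "healthy" if all_up else "degraded",
        "components": dict(components),
    }
```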
---
## Scaling Considerations
### Current Constraints
- **Single Instance**: No load balancing
- **Local Storage**: ChromaDB in-process
- **API Limits**: Hugging Face free tier; HTTP 402 (Payment Required) errors are common
### Future Optimizations
1. **Archive Curator (Priority)**: Intelligent system to manage and curate the museum archive
   - **Auto-Tagging**: Enhance metadata with historical period, culture, object type
   - **Quality Scoring**: Rate artifact relevance for different etymology contexts
   - **Deduplication**: Detect and merge similar artifacts
   - **Smart Indexing**: Organize by historical timeline, geography, theme
   - **Active Curation**: Suggest best artifacts for specific words/contexts
   - **Gap Analysis**: Identify missing periods/cultures and trigger targeted ingestion
2. **Caching**: Hash etymology text → serve cached images
3. **Queue System**: Celery for async generation
4. **CDN**: Serve `archive_images/` via CloudFront/similar
5. **Model Hosting**: Self-host SDXL on GPU server to avoid 402 errors
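
The caching idea (item 2) could look roughly like the following. `cache_key` and `ImageCache` are hypothetical names, the in-memory dictionary stands in for whatever store would actually be used, and normalizing case and whitespace before hashing is an assumption.

```python
import hashlib
from typing import Callable


def cache_key(etymology_text: str) -> str:
    """Hash normalized etymology text so trivial variations share one entry."""
    normalized = " ".join(etymology_text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ImageCache:
    def __init__(self) -> None:
        self._paths: dict[str, str] = {}

    def get_or_create(self, etymology_text: str, generate: Callable[[str], str]) -> str:
        key = cache_key(etymology_text)
        if key not in self._paths:
            # Cache miss: run the (expensive) generation once and remember the path.
            self._paths[key] = generate(etymology_text)
        return self._paths[key]
```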
---
## Security Notes
### API Keys
- Stored in `.env` (gitignored)
- Never exposed in responses or logs
### CORS
- Currently set to `allow_origins=["*"]` for development
- **Production**: Restrict to etymology app domain
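
One way to make the origin list configurable is to read it from the environment and pass the result as `allow_origins` to the CORS middleware. This is only a sketch: the variable name `CORA_ALLOWED_ORIGINS` and the helper `allowed_origins` are assumptions, not part of the project.

```python
import os
from typing import Mapping


def allowed_origins(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Parse a comma-separated origin list; fall back to "*" for development."""
    raw = environ.get("CORA_ALLOWED_ORIGINS", "")  # hypothetical variable name
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    return origins or ["*"]
```

With the variable unset, behavior matches the current development setting; in production it would hold the etymology app's domain.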
### Static Files
- `archive_images/` served directly via FastAPI
- No authentication (museum artifacts are public domain)
- Consider rate limiting for public deployments