# Architecture: Cora

## System Overview
```mermaid
graph TD
    User([User]) --> UI[Gradio UI]
    User --> API[FastAPI Persistence]
    UI --> Engine[CoraEngine]
    API --> Engine
    Engine --> Curator[CoraCurator - LLM]
    Engine --> Generator[HF Inference API - SDXL]
    Engine --> Vision[CoraVision - CLIP/YOLO]
    Engine --> Memory[CoraMemory - ChromaDB]
    Memory --> DB[(ChromaDB SQLite)]
    Vision --> Embed[SentenceTransformers]
```
## Core Components

- CoraEngine (`cora_engine.py`): The central orchestrator. Handles generation logic, model fallbacks, and the RAG (Retrieval-Augmented Generation) fallback path.
- CoraCurator (`cora_curator.py`): Uses a Large Language Model to refine simple user prompts into rich, historical descriptions.
- CoraVision (`cora_vision.py`): Handles visual processing, including generating CLIP embeddings for images/text and performing object detection with YOLO for automated tagging.
- CoraMemory (`cora_memory.py`): Persistent storage built on ChromaDB. Stores image paths, metadata, and embeddings for semantic search.
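The semantic search that CoraMemory provides can be illustrated with a minimal pure-Python sketch. The `nearest` helper, the store layout, and the tiny hand-written vectors below are hypothetical stand-ins for a real ChromaDB collection and CLIP/SentenceTransformers embeddings, which have hundreds of dimensions:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query_embedding, store):
    """Return the stored image path whose embedding is closest to the query.

    `store` maps an image path to its embedding; a hypothetical stand-in
    for querying a ChromaDB collection by embedding.
    """
    return max(store, key=lambda path: cosine_similarity(query_embedding, store[path]))

# Hypothetical 3-dimensional embeddings for illustration only.
store = {
    "archive/castle.png": [0.9, 0.1, 0.0],
    "archive/harbor.png": [0.1, 0.8, 0.2],
}
print(nearest([0.85, 0.2, 0.05], store))  # → archive/castle.png
```

In the real system, ChromaDB handles the indexing and nearest-neighbor search; this sketch only shows the similarity measure the lookup is based on.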
## Data Flow (Generation)

- User enters a prompt.
- `CoraCurator` refines the prompt.
- `CoraEngine` calls the Hugging Face Inference API.
- If the primary model fails (e.g., 401, 503, or invalid data), it retries with a fallback model.
- If both fail, it performs a RAG search in `CoraMemory` for the closest existing match.
- The result is returned to the user and archived.
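The fallback chain above can be sketched as follows. `call_model`, `rag_search`, and the model names are hypothetical hooks standing in for CoraEngine's real Hugging Face Inference API client and CoraMemory's semantic search:

```python
class GenerationError(Exception):
    """Raised when a model call fails (e.g., HTTP 401/503 or invalid data)."""

def generate_with_fallback(prompt, call_model, rag_search,
                           primary="primary-sdxl", fallback="fallback-sdxl"):
    """Try the primary model, then a fallback model, then a RAG lookup.

    `call_model(model, prompt)` and `rag_search(prompt)` are hypothetical
    stand-ins for the HF Inference API call and the CoraMemory search.
    Returns (result, source) so callers can tell which path succeeded.
    """
    for model in (primary, fallback):
        try:
            return call_model(model, prompt), model
        except GenerationError:
            continue  # e.g., 401, 503, or invalid response data
    # Both models failed: fall back to the closest archived match.
    return rag_search(prompt), "rag"

# Stub hook simulating an outage of the primary model only.
def flaky_call(model, prompt):
    if model == "primary-sdxl":
        raise GenerationError("503 Service Unavailable")
    return f"image-for:{prompt}"

result, source = generate_with_fallback("a medieval harbor", flaky_call,
                                        lambda p: "archive/harbor.png")
print(source)  # → fallback-sdxl
```

The same loop degrades to the RAG path when both calls raise, which mirrors the "closest existing match" step in the flow above.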