
Architecture: Cora

System Overview

graph TD
    User([User]) --> UI[Gradio UI]
    User --> API[FastAPI Persistence]
    UI --> Engine[CoraEngine]
    API --> Engine
    Engine --> Curator[CoraCurator - LLM]
    Engine --> Generator[HF Inference API - SDXL]
    Engine --> Vision[CoraVision - CLIP/YOLO]
    Engine --> Memory[CoraMemory - ChromaDB]
    Memory --> DB[(ChromaDB SQL)]
    Vision --> Embed[SentenceTransformers]

Core Components

  • CoraEngine (cora_engine.py): The central orchestrator. Handles image generation, falls back to a secondary model when the primary one fails, and uses RAG (Retrieval-Augmented Generation) against CoraMemory as a last resort.
  • CoraCurator (cora_curator.py): Uses a Large Language Model to refine simple user prompts into rich, historical descriptions.
  • CoraVision (cora_vision.py): Handles visual processing, including generating CLIP embeddings for images/text and performing object detection with YOLO for automated tagging.
  • CoraMemory (cora_memory.py): Persistent storage using ChromaDB. Stores image paths, metadata, and embeddings for semantic search.
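
To make the semantic-search role of CoraMemory concrete, here is a minimal in-memory sketch of the idea: store a path, an embedding, and metadata per image, then return the record whose embedding is closest to a query by cosine similarity. This is a stand-in for illustration only. The class name `TinyMemory` and its methods are hypothetical, and it does not use the ChromaDB API or reflect how cora_memory.py is actually written.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyMemory:
    """Hypothetical stand-in for a vector store (not the ChromaDB API)."""

    def __init__(self) -> None:
        # Each record is (image path, embedding, metadata).
        self._records: List[Tuple[str, List[float], Dict[str, str]]] = []

    def add(self, path: str, embedding: Sequence[float], metadata: Dict[str, str]) -> None:
        self._records.append((path, list(embedding), dict(metadata)))

    def closest(self, query: Sequence[float]) -> Optional[Tuple[str, Dict[str, str]]]:
        """Return (path, metadata) of the most similar record, or None if empty."""
        if not self._records:
            return None
        best = max(self._records, key=lambda r: cosine_similarity(query, r[1]))
        return best[0], best[2]


# Toy 2-D embeddings; real CLIP embeddings have hundreds of dimensions.
memory = TinyMemory()
memory.add("images/castle.png", [1.0, 0.0], {"tags": "castle"})
memory.add("images/ship.png", [0.0, 1.0], {"tags": "ship"})
match = memory.closest([0.9, 0.1])
```

A real vector database replaces the linear scan with an index, but the contract is the same: embeddings in, nearest stored item out.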

Data Flow (Generation)

  1. User enters prompt.
  2. CoraCurator refines the prompt.
  3. CoraEngine calls Hugging Face Inference API.
  4. If the primary model fails (e.g., HTTP 401, HTTP 503, or invalid data), the engine retries with a fallback model.
  5. If both models fail, the engine performs a RAG search in CoraMemory and uses the closest existing match.
  6. The result is returned to the user and archived.
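
The generation steps above can be sketched as a small fallback chain. This is a hedged illustration, not the code in cora_engine.py: the function `generate_with_fallback`, the `GenerationError` exception, the `Result` type, and all callables passed in are hypothetical names. It also assumes that only newly generated images are archived, since a retrieved match is already stored; the source does not say which way this goes.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


class GenerationError(Exception):
    """Hypothetical error a backend raises on failure (e.g. HTTP 401/503, invalid data)."""


@dataclass
class Result:
    image: bytes
    source: str  # name of the model that produced the image, or "memory"


def generate_with_fallback(
    prompt: str,
    refine: Callable[[str], str],
    backends: Sequence[Tuple[str, Callable[[str], bytes]]],
    search_memory: Callable[[str], Optional[bytes]],
    archive: Callable[[str, bytes], None],
) -> Result:
    # Step 2: the curator refines the raw prompt.
    refined = refine(prompt)

    # Steps 3-4: try the primary model, then the fallback, in order.
    for name, backend in backends:
        try:
            image = backend(refined)
        except GenerationError:
            continue  # this model failed; move on to the next one
        archive(refined, image)  # Step 6 (assumption: archive fresh images only)
        return Result(image, name)

    # Step 5: every model failed, so retrieve the closest archived match.
    match = search_memory(refined)
    if match is None:
        raise GenerationError("all models failed and memory has no match")
    return Result(match, "memory")
```

Passing the backends as an ordered list keeps the retry logic independent of how many models are configured; the two-model case described above is just a list of length two.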