# Architecture: Cora

## System Overview

```mermaid
graph TD
    User([User]) --> UI[Gradio UI]
    User --> API[FastAPI Persistence]
    UI --> Engine[CoraEngine]
    API --> Engine
    Engine --> Curator[CoraCurator - LLM]
    Engine --> Generator[HF Inference API - SDXL]
    Engine --> Vision[CoraVision - CLIP/YOLO]
    Engine --> Memory[CoraMemory - ChromaDB]
    Memory --> DB[(ChromaDB SQL)]
    Vision --> Embed[SentenceTransformers]
```

## Core Components

- **CoraEngine (`cora_engine.py`)**: The central orchestrator. Handles generation logic, falls back to a secondary model when the primary one fails, and uses RAG (Retrieval-Augmented Generation) over stored images as a last resort.

- **CoraCurator (`cora_curator.py`)**: Uses a Large Language Model to refine simple user prompts into rich, historical descriptions.
- **CoraVision (`cora_vision.py`)**: Handles visual processing, including generating CLIP embeddings for images/text and performing object detection with YOLO for automated tagging.

- **CoraMemory (`cora_memory.py`)**: Persistent storage using ChromaDB. Stores image paths, metadata, and embeddings for semantic search.
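
The semantic search that `CoraMemory` provides can be illustrated with a small in-memory stand-in. This is a minimal sketch and not the ChromaDB API or the project's actual code: `MemoryRecord`, `InMemoryStore`, and `cosine_similarity` are hypothetical names, the embeddings are plain lists of floats, and ranking by cosine similarity is an assumption about how "closest match" is defined.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MemoryRecord:
    """One archived image: its path, its embedding, and free-form metadata."""
    image_path: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryStore:
    """Hypothetical stand-in for a vector store; keeps everything in a list."""

    def __init__(self):
        self._records = []

    def add(self, image_path, embedding, metadata=None):
        # Copy the inputs so later mutation by the caller does not affect the store.
        self._records.append(
            MemoryRecord(image_path, list(embedding), dict(metadata or {}))
        )

    def search(self, query_embedding, top_k=1):
        # Rank all records by similarity to the query, most similar first.
        ranked = sorted(
            self._records,
            key=lambda r: cosine_similarity(query_embedding, r.embedding),
            reverse=True,
        )
        return ranked[:top_k]
```

A linear scan like this is only reasonable for a handful of records; a real vector store indexes the embeddings so that search does not have to touch every entry.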

## Data Flow (Generation)

1. The user enters a prompt.
2. `CoraCurator` refines the prompt.
3. `CoraEngine` calls the Hugging Face Inference API with the primary model.
4. If the primary model fails (e.g., 401, 503, or invalid data), `CoraEngine` retries with a fallback model.
5. If both models fail, `CoraEngine` performs a RAG search in `CoraMemory` for the closest existing match.
6. The result is returned to the user and archived.
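
The steps above can be sketched as a single function with its collaborators passed in as callables. This is an illustration under stated assumptions, not the code in `cora_engine.py`: `generate_with_fallback`, `GenerationError`, and `GenerationResult` are hypothetical names, every model failure is assumed to surface as one exception type, and only freshly generated images are archived (the source does not say whether a match retrieved from memory is archived again).

```python
from dataclasses import dataclass


class GenerationError(Exception):
    """Hypothetical error raised by a generator callable when a model call fails."""


@dataclass
class GenerationResult:
    image: object   # whatever the generator or memory returns (bytes, path, ...)
    source: str     # name of the model that succeeded, or "memory"
    prompt: str     # the refined prompt that was actually used


def generate_with_fallback(prompt, refine, generators, search_memory, archive):
    """Refine the prompt, try each model in order, then fall back to memory.

    refine:        callable(str) -> str, stands in for CoraCurator
    generators:    list of (name, callable(str) -> image), primary model first
    search_memory: callable(str) -> image or None, stands in for CoraMemory
    archive:       callable(GenerationResult) -> None
    """
    refined = refine(prompt)

    for name, generate in generators:
        try:
            image = generate(refined)
        except GenerationError:
            continue  # this model failed; try the next one in the list
        result = GenerationResult(image, name, refined)
        archive(result)  # assumption: only new images are archived
        return result

    # Every model failed: look for the closest existing image instead.
    match = search_memory(refined)
    if match is None:
        raise GenerationError("all models failed and memory has no match")
    return GenerationResult(match, "memory", refined)
```

Passing the curator, models, and memory in as callables keeps the fallback order testable without network access; the list of generators also makes it easy to add a third model without changing the control flow.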