# Decisions: Cora
## 1. Unified Entry Point (FastAPI + Gradio)
- **Problem**: Running the API and the UI as separate processes in a Hugging Face Space is complex.
- **Solution**: Use `gr.mount_gradio_app` in `app.py` to serve the Gradio UI from the FastAPI instance.
- **Why**: A single Docker container provides both programmatic API access and a user-friendly interface.
## 2. Model Fallback Chain
- **Problem**: The free-tier Hugging Face Inference API can be flaky and may return "invalid data stream" errors under load.
- **Solution**: Implement a "Ghost" fallback chain: `SDXL-Lightning` -> `Stable Diffusion 2.1` -> `RAG Fallback` (Archive Search).
- **Why**: Maximizes uptime: if generation fails, the user still receives a relevant image retrieved from the archive.
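
The chain can be sketched as an ordered list of image sources that are tried until one succeeds. Everything below is illustrative: `generate_with_fallback` and the three stub functions are hypothetical names, and the stubs simulate failures instead of calling the real Inference API or archive search.

```python
from typing import Callable, List, Sequence, Tuple

ImageSource = Callable[[str], bytes]


def generate_with_fallback(
    prompt: str, sources: Sequence[Tuple[str, ImageSource]]
) -> Tuple[str, bytes]:
    """Try each source in order; return the first non-empty result."""
    errors: List[str] = []
    for name, source in sources:
        try:
            image = source(prompt)
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{name}: {exc}")
            continue
        if image:
            return name, image
        errors.append(f"{name}: empty result")
    raise RuntimeError("all image sources failed: " + "; ".join(errors))


# Hypothetical stubs standing in for the real model calls.
def sdxl_lightning(prompt: str) -> bytes:
    raise ConnectionError("invalid data stream")


def stable_diffusion_21(prompt: str) -> bytes:
    raise TimeoutError("model is loading")


def archive_search(prompt: str) -> bytes:
    return b"archived-image-bytes"


chain = [
    ("SDXL-Lightning", sdxl_lightning),
    ("Stable Diffusion 2.1", stable_diffusion_21),
    ("RAG Fallback", archive_search),
]
used, image = generate_with_fallback("a red fox", chain)
```

Returning the name of the source that succeeded makes it possible to tell the user, or a log, whether the image was generated or retrieved.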
## 3. ChromaDB for Semantic Search
- **Decision**: Use ChromaDB as the vector store.
- **Why**: It is lightweight, persists to disk, and integrates easily with CLIP embeddings for semantic retrieval.
## 4. Environment Variable Fallback
- **Decision**: Support both `HF_API_TOKEN` (legacy/local) and `HF_TOKEN` (standard Hugging Face secret).
- **Why**: Ensures seamless deployment on Hugging Face Spaces, where `HF_TOKEN` is the default secret name provided to the environment.
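
One way the lookup might be written is shown below. The helper name `resolve_hf_token` is hypothetical, and the precedence (`HF_TOKEN` first, then `HF_API_TOKEN`) is an assumption, since the decision only says that both names are supported.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token, or None if neither variable is set."""
    # Assumed order: the standard Spaces secret wins over the legacy name.
    for name in ("HF_TOKEN", "HF_API_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


token = resolve_hf_token()
```

Passing the environment as a parameter keeps the function easy to test, for example `resolve_hf_token({"HF_API_TOKEN": "local-token"})` returns `"local-token"`.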