Decisions: Cora
1. Unified Entry Point (FastAPI + Gradio)
- Problem: Running separate API and UI processes in Hugging Face Spaces is complex.
- Solution: Use `gr.mount_gradio_app` to serve the Gradio UI from a FastAPI instance in `app.py`.
- Why: Allows both programmatic API access and a user-friendly interface in a single Docker container.
2. Model Fallback Chain
- Problem: The free-tier Hugging Face Inference API can be flaky and may return invalid data stream errors under load.
- Solution: Implemented a "Ghost" logic chain: SDXL-Lightning -> Stable Diffusion 2.1 -> RAG Fallback (Archive Search).
- Why: Maximizes uptime and ensures the user always gets a relevant image, even if generation fails.
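The chain above can be sketched as an ordered list of backends that are tried one after another. This is a minimal illustration, not the project's actual code: `generate_with_fallback`, `CHAIN`, and the three stub functions are hypothetical names, and the stubs only simulate failures instead of calling the Inference API or the archive search.

```python
from typing import Callable, List, Sequence, Tuple

# A backend takes a prompt and returns image bytes, or raises on failure.
Backend = Callable[[str], bytes]


def generate_with_fallback(
    prompt: str, chain: Sequence[Tuple[str, Backend]]
) -> Tuple[str, bytes]:
    """Try each backend in order; return (name, image) from the first success."""
    errors: List[str] = []
    for name, backend in chain:
        try:
            return name, backend(prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next stage
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Hypothetical stand-ins for the real stages (assumptions, for illustration only).
def sdxl_lightning(prompt: str) -> bytes:
    raise ConnectionError("invalid data stream")  # simulated API failure


def stable_diffusion_2_1(prompt: str) -> bytes:
    raise TimeoutError("model overloaded")  # simulated API failure


def archive_search(prompt: str) -> bytes:
    # Stands in for retrieving an existing image from the archive.
    return b"archived-image-for:" + prompt.encode("utf-8")


CHAIN = [
    ("SDXL-Lightning", sdxl_lightning),
    ("Stable Diffusion 2.1", stable_diffusion_2_1),
    ("RAG Fallback", archive_search),
]

if __name__ == "__main__":
    name, image = generate_with_fallback("a red fox", CHAIN)
    print(f"served by {name}, {len(image)} bytes")
```

Returning the name of the stage that succeeded lets the caller log or display whether the image was freshly generated or retrieved from the archive.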
3. ChromaDB for Semantic Search
- Decision: Use ChromaDB as the vector store.
- Why: Lightweight, persistent, and easy to integrate with CLIP embeddings for semantic retrieval.
4. Environment Variable Fallback
- Decision: Support both `HF_API_TOKEN` (legacy/local) and `HF_TOKEN` (standard Hugging Face secret).
- Why: To ensure seamless deployment on Hugging Face Spaces where `HF_TOKEN` is the default secret name provided to the environment.
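One way to implement this lookup is sketched below. The helper name `resolve_hf_token` is hypothetical, and the precedence (checking `HF_TOKEN` first, then `HF_API_TOKEN`) is an assumption, since the decision above does not say which variable wins when both are set.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty token found, or None if neither variable is set."""
    env = os.environ if env is None else env
    # Assumed precedence: the Spaces secret name first, then the legacy/local name.
    for name in ("HF_TOKEN", "HF_API_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


if __name__ == "__main__":
    token = resolve_hf_token()
    print("token found" if token else "no Hugging Face token configured")
```

Accepting an optional mapping keeps the function easy to test without touching the real process environment.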