# Decisions: Cora

## 1. Unified Entry Point (FastAPI + Gradio)

- Problem: Running separate API and UI processes in a Hugging Face Space is complex.
- Solution: Use `gr.mount_gradio_app` to serve the Gradio UI from a FastAPI instance in `app.py`.
- Why: A single Docker container then provides both programmatic API access and a user-friendly interface.

## 2. Model Fallback Chain

- Problem: The free-tier Hugging Face Inference API can be flaky or return invalid data stream errors under load.
- Solution: Implemented a "Ghost" logic chain: `SDXL-Lightning` -> `Stable Diffusion 2.1` -> `RAG Fallback` (Archive Search).
- Why: Maximizes uptime and ensures the user still gets a relevant image, even if generation fails.
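
The chain can be sketched as an ordered list of backends tried in turn until one succeeds. All function names below are hypothetical stand-ins; the real backends would call the Inference API and the archive search:

```python
from typing import Callable, List, Sequence, Tuple

Backend = Callable[[str], bytes]


def generate_with_fallback(
    prompt: str, backends: Sequence[Tuple[str, Backend]]
) -> Tuple[str, bytes]:
    """Try each backend in order; return the first successful result."""
    errors: List[str] = []
    for name, backend in backends:
        try:
            return name, backend(prompt)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All backends failed: " + "; ".join(errors))


# Stand-in backends that simulate the failure modes described above.
def sdxl_lightning(prompt: str) -> bytes:
    raise TimeoutError("model is loading")


def stable_diffusion_21(prompt: str) -> bytes:
    raise ValueError("invalid data stream")


def archive_search(prompt: str) -> bytes:
    return b"archived-image-bytes"


chain = [
    ("SDXL-Lightning", sdxl_lightning),
    ("Stable Diffusion 2.1", stable_diffusion_21),
    ("RAG Fallback", archive_search),
]

used, image = generate_with_fallback("a red fox", chain)
```

Returning the backend name alongside the image lets the caller tell the user whether the result was generated or retrieved from the archive.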

## 3. ChromaDB for Semantic Search

- Decision: Use ChromaDB as the vector store.
- Why: Lightweight, persistent, and easy to integrate with CLIP embeddings for semantic retrieval.

## 4. Environment Variable Fallback

- Decision: Support both `HF_API_TOKEN` (legacy/local) and `HF_TOKEN` (standard Hugging Face secret).
- Why: Ensures seamless deployment on Hugging Face Spaces, where `HF_TOKEN` is the default secret name provided to the environment.
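
One way to express this lookup is shown below. The helper name `resolve_hf_token` is hypothetical, and the precedence (`HF_TOKEN` first, then `HF_API_TOKEN`) is an assumption, since the decision does not state which variable wins when both are set:

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token, or None if neither variable is set."""
    # Assumed order: the standard Spaces secret first, the legacy name second.
    return env.get("HF_TOKEN") or env.get("HF_API_TOKEN") or None


# Usage with explicit mappings instead of the real environment.
spaces_token = resolve_hf_token({"HF_TOKEN": "hf_spaces"})
local_token = resolve_hf_token({"HF_API_TOKEN": "hf_local"})
missing_token = resolve_hf_token({})
```

Passing the environment as a parameter keeps the lookup easy to test without mutating `os.environ`.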
