---
title: DSN
emoji: π’
colorFrom: indigo
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: DSN HACKATHON
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

This Space is configured as **`sdk: docker`**. The image builds from `Dockerfile` with CPU-only PyTorch, so CUDA wheels don't OOM the builder. During **`docker build`**, models are fetched with **`snapshot_download`** into `/models/huggingface` **without loading the full LLM into RAM**; **`SentenceTransformer`** then embeds a **stub** or Yelp-derived catalog plus **`data/task_a_reviews_embedded.jsonl`** (review RAG for Task A). See `scripts/docker_build_assets.py`.

Task **A**: persona + product → rating/review via the **Gemini API** (default) or an optional **local** Qwen model, plus **retrieved Yelp review snippets** from the baked JSONL. Task **B**: **local** sentence-transformer retrieval over businesses plus **Gemini** (or local) reranking.

**Secrets (Hugging Face Space):** **`GEMINI_API_KEY`** (or `GOOGLE_API_KEY`) is required for generation when `GENERATION_BACKEND=gemini`. **`HF_TOKEN`** is optional and used only during **Docker build** (embedder download). Never commit keys to the repo.

---

## DSN × BCT LLM Agent Challenge: API package

**Deadline:** 24 May 2026, end of day (organiser time). Submit the solution paper, repo, and container link via the official form.

Step-by-step agent narrative (for judges and your paper): **[`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md)**.

### Deliverables checklist

- [ ] Working URL or Docker image for this API (judges use the POST endpoints listed under Endpoints).
- [ ] GitHub (or equivalent) with this repo; do not commit `.env` or Yelp raw JSON.
- [ ] Solution paper PDF (4–8 pages): point to `AGENT_WORKFLOW.md` for architecture; add experiments (e.g. RAG on/off, Nigerian prompt on/off), limits, and a Nigerian English design note.
- [ ] Disclosures in paper: base HF models, Yelp-derived data / RAG index, embedding catalog build.

### Endpoints

| Method | Path |
|--------|------|
| GET | `/health`, `/` |
| POST | `/user-modeling` (aliases: `/task-1`, `/task_a`) |
| POST | `/recommendation` (aliases: `/task-2`, `/task_b`) |

### Request bodies

**Task 1:** `{"persona": "<multiline user snapshot; optional line user_id: ...>", "product": "<business facts>", "include_raw": false}`; the response includes `rag_snippets_used`.

**Task 2:** `{"persona": "...", "city": null, "state": null, "chat_history": [], "top_k_retrieval": 40, "top_n_final": 10}`

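As a minimal sketch of how a client might build and send the two request bodies above, the following uses only the Python standard library. The helper names (`task1_payload`, `task2_payload`, `post_json`), the base URL, the timeout, and the sample persona/product strings are illustrative assumptions, not part of this repo; only the JSON field names and defaults come from the request bodies documented above.

```python
import json
import urllib.request
from typing import List, Optional

# Assumption: Docker port from this README; use 8080 for a local uvicorn run.
BASE_URL = "http://localhost:7860"


def task1_payload(persona: str, product: str, include_raw: bool = False) -> dict:
    """Build a Task 1 (/user-modeling) body with the documented fields."""
    return {"persona": persona, "product": product, "include_raw": include_raw}


def task2_payload(
    persona: str,
    city: Optional[str] = None,
    state: Optional[str] = None,
    chat_history: Optional[List] = None,
    top_k_retrieval: int = 40,
    top_n_final: int = 10,
) -> dict:
    """Build a Task 2 (/recommendation) body with the documented defaults."""
    return {
        "persona": persona,
        "city": city,
        "state": state,
        "chat_history": [] if chat_history is None else list(chat_history),
        "top_k_retrieval": top_k_retrieval,
        "top_n_final": top_n_final,
    }


def post_json(path: str, payload: dict, base_url: str = BASE_URL,
              timeout: float = 120.0) -> dict:
    """POST a JSON body and decode the JSON response (hypothetical helper)."""
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Made-up sample values, shown only to illustrate the body shape.
    body = task1_payload("user_id: u123\nLikes spicy food.", "Name: Example Cafe")
    print(json.dumps(body, indent=2))
```

With the API running, `post_json("/user-modeling", body)` would send the Task 1 request and `post_json("/recommendation", task2_payload("..."))` the Task 2 request.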
### Local run (clone this repo)

From the **repository root** (this folder):

```bash
cp env.example .env
pip install -r requirements.txt
```

**Task A review index** (Yelp `review.json` + `business.json`):

```bash
python scripts/build_task_a_review_rag.py \
  --review-json path/to/yelp_academic_dataset_review.json \
  --business-json path/to/yelp_academic_dataset_business.json \
  --output data/task_a_reviews_embedded.jsonl \
  --max-rows 12000
```

Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` (or `TASK_A_EMBEDDING_MODEL`) at build time and at runtime. Omit the file only for quick tests; generation then runs without RAG.

**Generation:** set `GEMINI_API_KEY` in `.env` (see `env.example`). With `GENERATION_BACKEND=gemini` (default), Task A and Task B use **`GEMINI_MODEL`** (default `gemini-2.0-flash`). Set `GENERATION_BACKEND=local` to use on-device Qwen instead.

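The environment rules described above can be sketched as a small resolver. This is an illustration under stated assumptions, not the code in `app/`: the function name `resolve_generation_config` and the returned dictionary shape are hypothetical, and the actual application may validate these variables differently. The variable names and defaults are the ones this README documents.

```python
import os
from typing import Mapping, Optional


def resolve_generation_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve backend, model, and key presence from environment variables.

    Hypothetical helper: mirrors the README's documented defaults only.
    """
    env = os.environ if env is None else env
    # Documented default backend is "gemini".
    backend = env.get("GENERATION_BACKEND", "gemini").strip().lower()
    # Either variable may carry the key, per the Secrets note in this README.
    api_key = env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY")
    if backend == "gemini" and not api_key:
        raise RuntimeError(
            "GENERATION_BACKEND=gemini needs GEMINI_API_KEY or GOOGLE_API_KEY"
        )
    model = env.get("GEMINI_MODEL", "gemini-2.0-flash") if backend == "gemini" else None
    return {"backend": backend, "model": model, "has_api_key": bool(api_key)}
```

Passing a plain dictionary instead of reading `os.environ` keeps the check easy to exercise in a test, for example `resolve_generation_config({"GENERATION_BACKEND": "local"})`.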
**Task B** reranking uses Gemini when configured; embeddings stay local (`LOCAL_EMBEDDING_MODEL`).

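The retrieve-then-rerank shape of Task B can be illustrated with a toy example. This sketch assumes cosine similarity over precomputed vectors and a pluggable rerank callable standing in for the Gemini (or local) reranker; the row fields `business_id` and `embedding`, and all function names, are hypothetical and may not match the catalog JSONL or `app/recommendation_pipeline.py`. The parameter names `top_k_retrieval` and `top_n_final` are taken from the Task 2 request body.

```python
import math
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: Sequence[float], catalog: List[Row], top_k: int) -> List[Row]:
    """Stage 1: keep the top_k rows most similar to the query embedding."""
    ranked = sorted(
        catalog,
        key=lambda row: cosine(query_vec, row["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


def recommend(
    query_vec: Sequence[float],
    catalog: List[Row],
    rerank: Callable[[List[Row]], List[Row]],
    top_k_retrieval: int = 40,
    top_n_final: int = 10,
) -> List[Row]:
    """Stage 2: rerank the retrieved candidates and keep top_n_final."""
    candidates = retrieve(query_vec, catalog, top_k_retrieval)
    return rerank(candidates)[:top_n_final]


# Toy catalog with made-up 2-D embeddings.
CATALOG = [
    {"business_id": "a", "embedding": [1.0, 0.0]},
    {"business_id": "b", "embedding": [0.0, 1.0]},
    {"business_id": "c", "embedding": [0.9, 0.1]},
]
```

The split keeps the cheap vector search wide (`top_k_retrieval`) and the expensive reranker narrow (`top_n_final`), which is the usual reason for a two-stage design.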
**Recommendation index** (needs Yelp `business.json` on your machine, e.g. `../yelp_dataset/extracted/` from a parent workspace):

```bash
python scripts/build_business_catalog.py --max-rows 30000 --only-open
python scripts/embed_catalog.py --batch-size 64
```

Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` for `embed_catalog.py` and at API runtime.

**Start API:**

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8080
# or: PORT=8080 python -m app.main
```

### Docker

Build with a Hub token available **during build** (anonymous access works for public models but hits rate limits):

```bash
docker build -t dcn-llm-agent-challenge \
  --build-arg HF_TOKEN="$HF_TOKEN" \
  --build-arg HUGGING_FACE_HUB_TOKEN="$HUGGING_FACE_HUB_TOKEN" .
docker run --env-file .env -p 7860:7860 dcn-llm-agent-challenge
```

Or with Docker Compose:

```bash
export HF_TOKEN=hf_...   # optional; must be visible to `docker build`, not only the container
docker compose up --build -d
```

The default compose file maps **`7860:7860`**. The image bakes **`/code/data/business_catalog_embedded.jsonl`** and **`/code/data/task_a_reviews_embedded.jsonl`** at build time (or stubs if the Yelp JSON is missing). Override with a bind mount, e.g. `./data:/code/data`, if you rebuild those files locally.

The Docker image sets **`HF_HUB_OFFLINE=1`** and **`TRANSFORMERS_OFFLINE=1`** so the running container does not call the Hugging Face Hub. During **`docker build`**, **`snapshot_download`** copies model **files** into `/models/huggingface` (and the stub JSONL is embedded). Loading weights **into RAM** during build is disabled by default (**`DOCKER_BUILD_SKIP_LLM_WARM=1`**) because HF build VMs often **OOM (exit 137)** when loading Qwen, and that RAM would not persist in the final image anyway.

At **container start**, **`STARTUP_PREWARM=all`** (default) loads **one shared** embedding model and **one shared** causal LM (`app/shared_models.py`), then the Task A RAG index and the Task B catalog, so **`/task-2`** does not pay for a second full Qwen load. Expect **~1–2 minutes** on CPU after deploy while logs show `Loading shared …`; after that, both endpoints stay fast. Disable with **`SKIP_STARTUP_PREWARM=1`** (not recommended on Spaces).

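The "load once, share everywhere" idea behind the prewarm can be sketched with a cached loader. This is a hedged illustration only and does not reproduce `app/shared_models.py`: `get_shared`, `should_prewarm`, `prewarm`, and `LOAD_COUNTS` are hypothetical names, the loader returns a placeholder object instead of a real model, and treating only the value `all` as enabling prewarm is an assumption (other accepted values are not documented here).

```python
import os
from functools import lru_cache
from typing import Mapping, Optional

# Counts how many times each (placeholder) model is actually constructed.
LOAD_COUNTS = {"embedder": 0, "llm": 0}


@lru_cache(maxsize=None)
def get_shared(name: str) -> object:
    """Return one process-wide instance per name; later calls hit the cache."""
    LOAD_COUNTS[name] += 1
    return object()  # placeholder for a real embedding model or causal LM


def should_prewarm(env: Optional[Mapping[str, str]] = None) -> bool:
    """Assumed rule: skip flag wins, otherwise prewarm when the mode is 'all'."""
    env = os.environ if env is None else env
    if env.get("SKIP_STARTUP_PREWARM") == "1":
        return False
    return env.get("STARTUP_PREWARM", "all") == "all"


def prewarm(env: Optional[Mapping[str, str]] = None) -> bool:
    """Load both shared models at startup when prewarm is enabled."""
    if not should_prewarm(env):
        return False
    get_shared("embedder")
    get_shared("llm")
    return True
```

Because both task handlers would call the same cached getter, the second endpoint to be hit reuses the instance loaded at startup instead of constructing its own.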
### Smoke checks

OpenAPI: `http://localhost:7860/docs` when using Docker (port **7860**). Local `uvicorn` defaults to **8080** unless you set `PORT`.

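A scripted version of this check might look like the following sketch, which only tests that `/health` and `/docs` answer with HTTP 200. The function names and the injectable `fetch` parameter are assumptions made for illustration; the response body of `/health` is not documented here, so it is deliberately not inspected.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Sequence


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for a GET, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, etc.


def smoke_check(
    base_url: str,
    paths: Sequence[str] = ("/health", "/docs"),
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each path to True when it responds with HTTP 200."""
    root = base_url.rstrip("/")
    return {path: fetch(root + path) == 200 for path in paths}
```

For the Docker setup, `smoke_check("http://localhost:7860")` would be the call to run; for a local `uvicorn` run, swap in port 8080.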
### Layout

| Path | Role |
|------|------|
| `app/main.py` | FastAPI routes |
| [`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md) | Agent steps, reproducibility, paper hooks (Nigerian English, fallbacks) |
| `app/user_modeling.py`, `app/user_modeling_prompt.py`, `app/task_a_rag.py` | Task 1 local LLM + Yelp review RAG |
| `app/recommendation_pipeline.py` | Task 2 retrieval + rerank |
| `scripts/build_business_catalog.py` | Yelp → catalog JSONL |
| `scripts/embed_catalog.py` | Embed catalog (local sentence-transformers) |
| `scripts/build_task_a_review_rag.py` | Yelp reviews (+ businesses) → Task A embedded RAG JSONL |
| `scripts/docker_build_assets.py` | Docker build: HF prefetch + catalog + Task A RAG |
| `env.example` | Copy to `.env` |
| `NOTICES.txt` | Data / cloud disclosures |

Optional: make Yelp `review.json` and `business.json` available at build time (e.g. via a bind mount) so that Docker bakes real Task A/B indexes instead of stubs.