---
title: DSN
emoji: 🏢
colorFrom: indigo
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: DSN HACKATHON
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
This Space is configured as **`sdk: docker`**. The image builds from `Dockerfile` (CPU-only PyTorch so CUDA wheels don’t OOM the builder). During **`docker build`**, models are **`snapshot_download`**’d into `/models/huggingface` **without loading the full LLM into RAM**; **`SentenceTransformer`** embeds a **stub** or Yelp-derived catalog plus **`data/task_a_reviews_embedded.jsonl`** (review RAG for Task A). See `scripts/docker_build_assets.py`.
Task **A**: persona + product → rating/review via **Gemini API** and retrieved Yelp review snippets from the baked JSONL. Task **B**: local sentence-transformer retrieval over businesses plus **Gemini** reranking.
**Secrets (Hugging Face Space):** **`GEMINI_API_KEY`** (or `GOOGLE_API_KEY`) — required for generation when `GENERATION_BACKEND=gemini`. Optional **`HF_TOKEN`** for **Docker build** only (embedder download). Never commit keys in the repo.
---
## DSN × BCT LLM Agent Challenge — API package
**Deadline:** 24 May 2026 end of day (organiser time). Submit solution paper + repo + container link via the official form.
Step-by-step agent narrative (for judges and your paper): **[`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md)**.
### Deliverables checklist
- [ ] Working URL or Docker image for this API (judges use POST endpoints below).
- [ ] GitHub (or equivalent) with this repo; do not commit `.env` or Yelp raw JSON.
- [ ] Solution paper PDF (4–8 pages): point to `AGENT_WORKFLOW.md` for architecture; add experiments (e.g. RAG on/off, Nigerian prompt on/off), limits, Nigerian English design note.
- [ ] Disclosures in paper: base HF models, Yelp-derived data / RAG index, embedding catalog build.
### Endpoints
| Method | Path |
|--------|------|
| GET | `/health`, `/` |
| POST | `/user-modeling` (aliases: `/task-1`, `/task_a`) |
| POST | `/recommendation` (aliases: `/task-2`, `/task_b`) |
### Request bodies
**Task 1** (`POST /user-modeling`): `{"persona": "<multiline user snapshot; optional line user_id: ...>", "product": "<business facts>", "include_raw": false}`. The response includes `rag_snippets_used`.
**Task 2** (`POST /recommendation`): `{"persona": "...", "city": null, "state": null, "chat_history": [], "top_k_retrieval": 40, "top_n_final": 10}`
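The request bodies above can be built and sent with nothing but the Python standard library. The following is a minimal client sketch, not part of this repo: the helper names (`task1_payload`, `task2_payload`, `post_json`) are hypothetical, and the base URL assumes the Docker port `7860` described in this README.

```python
import json
import urllib.request

# Assumption: API reachable on the Docker port used in this README.
BASE_URL = "http://localhost:7860"


def task1_payload(persona: str, product: str, include_raw: bool = False) -> dict:
    """Body for POST /user-modeling, using only the fields documented above."""
    return {"persona": persona, "product": product, "include_raw": include_raw}


def task2_payload(persona: str, city=None, state=None, chat_history=None,
                  top_k_retrieval: int = 40, top_n_final: int = 10) -> dict:
    """Body for POST /recommendation with the documented example values as defaults."""
    return {
        "persona": persona,
        "city": city,
        "state": state,
        "chat_history": list(chat_history or []),
        "top_k_retrieval": top_k_retrieval,
        "top_n_final": top_n_final,
    }


def post_json(path: str, payload: dict, base_url: str = BASE_URL,
              timeout: float = 120.0) -> dict:
    """POST a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `post_json("/user-modeling", task1_payload(persona_text, product_text))` should return the Task 1 response, where `persona_text` and `product_text` are your own inputs. The long default timeout is a guess that allows for a slow first request.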
### Local run (clone this repo)
From the **repository root** (this folder):
```bash
cp env.example .env
pip install -r requirements.txt
```
**Task A review index** (Yelp `review.json` + `business.json`):
```bash
python scripts/build_task_a_review_rag.py \
--review-json path/to/yelp_academic_dataset_review.json \
--business-json path/to/yelp_academic_dataset_business.json \
--output data/task_a_reviews_embedded.jsonl \
--max-rows 12000
```
Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` (or `TASK_A_EMBEDDING_MODEL`) at build and runtime. Omit the file only for quick tests (generation runs without RAG).
**Generation:** set `GEMINI_API_KEY` in `.env` (see `env.example`). With `GENERATION_BACKEND=gemini` or `auto` (default), Task A and Task B both use **Gemini**. Local causal LLM inference is not used by current runtime code.
**Task B** reranking uses Gemini; embeddings stay local (`LOCAL_EMBEDDING_MODEL`).
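A quick pre-flight check can catch a missing key before the first request fails. The sketch below is illustrative only and is not the repo's actual configuration code: the function names are hypothetical, the precedence of `GEMINI_API_KEY` over `GOOGLE_API_KEY` is an assumption, and so is treating `auto` as requiring a key.

```python
import os
from typing import Mapping, Optional


def resolve_gemini_key(env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty key.

    Assumption: GEMINI_API_KEY takes precedence over GOOGLE_API_KEY.
    """
    for name in ("GEMINI_API_KEY", "GOOGLE_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def check_generation_config(env: Mapping[str, str] = os.environ) -> str:
    """Return the effective backend name, or raise if a Gemini key is missing."""
    # "auto" is the documented default when GENERATION_BACKEND is unset.
    backend = env.get("GENERATION_BACKEND", "auto").strip().lower() or "auto"
    # Assumption: "auto" also needs a key, since both tasks use Gemini.
    if backend in ("gemini", "auto") and resolve_gemini_key(env) is None:
        raise RuntimeError("Set GEMINI_API_KEY (or GOOGLE_API_KEY) in .env")
    return backend
```

Passing the environment as a mapping keeps the check easy to exercise with a plain dict instead of mutating `os.environ`.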
**Recommendation index** (needs Yelp `business.json` on your machine, e.g. `../yelp_dataset/extracted/` from a parent workspace):
```bash
python scripts/build_business_catalog.py --max-rows 30000 --only-open
python scripts/embed_catalog.py --batch-size 64
```
Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` for `embed_catalog.py` and at API runtime.
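The reason for matching models is that query and catalog vectors are only comparable when they come from the same embedding space. The sketch below illustrates this with a generic cosine top-k search in pure Python. It is not the repo's retrieval code, and the `(business_id, vector)` catalog layout is an assumption made for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; refuses vectors of different dimensionality."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: query {len(a)} vs catalog {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float],
          catalog: List[Tuple[str, Sequence[float]]],
          k: int) -> List[str]:
    """Return the ids of the k catalog entries most similar to the query."""
    # Hypothetical layout: each catalog entry is (business_id, embedding).
    ranked = sorted(catalog, key=lambda item: cosine(query, item[1]), reverse=True)
    return [business_id for business_id, _ in ranked[:k]]
```

The dimension check only catches the obvious mismatch. Two different models with the same output size would pass it and still rank poorly, which is why the model name should be pinned in `.env`.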
**Start API:**
```bash
uvicorn app.main:app --host 0.0.0.0 --port 8080
# or: PORT=8080 python -m app.main
```
### Docker
Build with a Hub token available **during build** (anonymous downloads work for public models but may hit rate limits):
```bash
docker build -t dcn-llm-agent-challenge \
--build-arg HF_TOKEN="$HF_TOKEN" \
--build-arg HUGGING_FACE_HUB_TOKEN="$HUGGING_FACE_HUB_TOKEN" .
docker run --env-file .env -p 7860:7860 dcn-llm-agent-challenge
```
```bash
export HF_TOKEN=hf_... # optional; must be visible to `docker build`, not only the container
docker compose up --build -d
```
Default compose maps **`7860:7860`**. The image bakes **`/code/data/business_catalog_embedded.jsonl`** and **`/code/data/task_a_reviews_embedded.jsonl`** at build time (or stubs if Yelp JSON is missing). Override with a bind mount, e.g. `./data:/code/data`, if you rebuild those files locally.
The Docker image sets **`HF_HUB_OFFLINE=1`** and **`TRANSFORMERS_OFFLINE=1`** so the running container does not call the Hugging Face Hub. During **`docker build`**, **`snapshot_download`** copies model **files** into `/models/huggingface` (and the stub JSONL is embedded). Loading weights **into RAM** during build is disabled by default (**`DOCKER_BUILD_SKIP_LLM_WARM=1`**) because HF build VMs often **OOM (exit 137)** when loading Qwen, and anything loaded into RAM during build would not persist in the final image anyway.
At **container start**, **`STARTUP_PREWARM=all`** (default) loads the shared embedding model and preloads Task A RAG + Task B catalog indexes. Expect **~1–2 minutes** on CPU after deploy while logs show `Loading shared …`; then both endpoints stay fast. Disable with **`SKIP_STARTUP_PREWARM=1`** (not recommended on Spaces).
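One way the two switches above could combine is sketched here. This is a guess at the logic, not the app's actual startup code: the function name `prewarm_mode` and the return value `"none"` are hypothetical. Only the default `all` and the `SKIP_STARTUP_PREWARM=1` override come from this README.

```python
from typing import Mapping


def prewarm_mode(env: Mapping[str, str]) -> str:
    """Resolve the effective prewarm mode from environment-style settings."""
    # SKIP_STARTUP_PREWARM=1 wins over any STARTUP_PREWARM value.
    if env.get("SKIP_STARTUP_PREWARM", "").strip() == "1":
        return "none"  # hypothetical label for "do not prewarm"
    # STARTUP_PREWARM defaults to "all" when unset or empty.
    return env.get("STARTUP_PREWARM", "all").strip().lower() or "all"
```

Calling `prewarm_mode(os.environ)` at startup would then decide whether to load the embedding model and indexes before serving traffic.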
### Smoke checks
OpenAPI: `http://localhost:7860/docs` when using Docker (port **7860**). Local `uvicorn` defaults to **8080** unless you set `PORT`.
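A scripted smoke check can poll `GET /health` until the service answers. The sketch below uses only the standard library, and its helper names are hypothetical. The retry budget of 36 attempts at 5 seconds each (about three minutes) is an assumption, and so is the expectation that `/health` returns HTTP 200 once the app is up.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def base_url(docker: bool = True, host: str = "localhost") -> str:
    """Port 7860 for the Docker image, 8080 for the local uvicorn command."""
    port = 7860 if docker else 8080
    return f"http://{host}:{port}"


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(url: str, attempts: int = 36, delay: float = 5.0,
                       probe: Callable[[str], bool] = http_ok,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the probe until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no sleep after the final failed attempt
    return False
```

For example, `wait_until_healthy(base_url() + "/health")` blocks until the Docker container responds or the retry budget is exhausted. The `probe` and `sleep` parameters exist so the loop can be exercised without a running server.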
### Layout
| Path | Role |
|------|------|
| `app/main.py` | FastAPI routes |
| [`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md) | Agent steps, reproducibility, paper hooks (Nigerian English, fallbacks) |
| `app/user_modeling.py`, `app/user_modeling_prompt.py`, `app/task_a_rag.py` | Task 1 Gemini generation + Yelp review RAG |
| `app/recommendation_pipeline.py` | Task 2 retrieval + rerank |
| `scripts/build_business_catalog.py` | Yelp → catalog JSONL |
| `scripts/embed_catalog.py` | Embed catalog (local sentence-transformers) |
| `scripts/build_task_a_review_rag.py` | Yelp reviews (+ businesses) → Task A embedded RAG JSONL |
| `scripts/docker_build_assets.py` | Docker build: HF prefetch + catalog + Task A RAG |
| `env.example` | Copy to `.env` |
| `NOTICES.txt` | Data / cloud disclosures |
Optional: bind-mount Yelp `review.json` and `business.json` at build time so that Docker bakes real Task A/B indexes instead of stubs.