---
title: DSN
emoji: 🏢
colorFrom: indigo
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: DSN HACKATHON
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

This Space is configured as **`sdk: docker`**. The image builds from `Dockerfile` (CPU-only PyTorch so CUDA wheels don’t OOM the builder). During **`docker build`**, models are **`snapshot_download`**’d into `/models/huggingface` **without loading the full LLM into RAM**; **`SentenceTransformer`** embeds a **stub** or Yelp-derived catalog plus **`data/task_a_reviews_embedded.jsonl`** (review RAG for Task A). See `scripts/docker_build_assets.py`.

Task **A**: persona + product → rating/review via **Gemini API** and retrieved Yelp review snippets from the baked JSONL. Task **B**: local sentence-transformer retrieval over businesses plus **Gemini** reranking.

**Secrets (Hugging Face Space):** **`GEMINI_API_KEY`** (or `GOOGLE_API_KEY`) — required for generation when `GENERATION_BACKEND=gemini`. Optional **`HF_TOKEN`** for **Docker build** only (embedder download). Never commit keys in the repo.
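
The key lookup described above could be sketched as follows. This is a minimal illustration, not the repo's actual code: the helper name `resolve_gemini_key` is hypothetical, and the order of precedence (`GEMINI_API_KEY` before `GOOGLE_API_KEY`) is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_gemini_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty API key found, or None.

    Assumption: GEMINI_API_KEY takes precedence over GOOGLE_API_KEY.
    """
    for name in ("GEMINI_API_KEY", "GOOGLE_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


if __name__ == "__main__":
    # Report only whether a key is present; never print the key itself.
    print("Gemini key configured:", resolve_gemini_key() is not None)
```

Passing the environment as a mapping keeps the helper easy to test without touching real secrets.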

---

## DSN × BCT LLM Agent Challenge — API package

**Deadline:** 24 May 2026 end of day (organiser time). Submit solution paper + repo + container link via the official form.

Step-by-step agent narrative (for judges and your paper): **[`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md)**.

### Deliverables checklist

- [ ] Working URL or Docker image for this API (judges use POST endpoints below).
- [ ] GitHub (or equivalent) with this repo; do not commit `.env` or Yelp raw JSON.
- [ ] Solution paper PDF (4–8 pages): point to `AGENT_WORKFLOW.md` for architecture; add experiments (e.g. RAG on/off, Nigerian prompt on/off), limits, Nigerian English design note.
- [ ] Disclosures in paper: base HF models, Yelp-derived data / RAG index, embedding catalog build.

### Endpoints

| Method | Path |
|--------|------|
| GET | `/health`, `/` |
| POST | `/user-modeling` (aliases: `/task-1`, `/task_a`) |
| POST | `/recommendation` (aliases: `/task-2`, `/task_b`) |

### Request bodies

**Task 1:** `{"persona": "<multiline user snapshot; optional line user_id: ...>", "product": "<business facts>", "include_raw": false}` — response includes `rag_snippets_used`.

**Task 2:** `{"persona": "...", "city": null, "state": null, "chat_history": [], "top_k_retrieval": 40, "top_n_final": 10}`
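
As a rough sketch, the two request bodies above can be built and sent with the Python standard library. The base URL assumes the Docker port (7860), and the persona and product strings are made-up placeholders; only the field names and endpoint paths come from this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumption: Docker port; use 8080 for the local run


def build_request(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Task 1: persona + product (placeholder values, not real data).
task1 = build_request("/user-modeling", {
    "persona": "user_id: demo-123\nEnjoys spicy food and writes short reviews.",
    "product": "Example Kitchen, restaurant, mid-price",
    "include_raw": False,
})

# Task 2: recommendation with the defaults shown above.
task2 = build_request("/recommendation", {
    "persona": "Enjoys spicy food and writes short reviews.",
    "city": None,
    "state": None,
    "chat_history": [],
    "top_k_retrieval": 40,
    "top_n_final": 10,
})

if __name__ == "__main__":
    print(send(task1))  # requires the API to be running
```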

### Local run (clone this repo)

From the **repository root** (this folder):

```bash
cp env.example .env
pip install -r requirements.txt
```

**Task A review index** (Yelp `review.json` + `business.json`):

```bash
python scripts/build_task_a_review_rag.py \
  --review-json path/to/yelp_academic_dataset_review.json \
  --business-json path/to/yelp_academic_dataset_business.json \
  --output data/task_a_reviews_embedded.jsonl \
  --max-rows 12000
```

Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` (or `TASK_A_EMBEDDING_MODEL`) at build and runtime. Omit the file only for quick tests (generation runs without RAG).

**Generation:** set `GEMINI_API_KEY` in `.env` (see `env.example`). With `GENERATION_BACKEND=gemini` or `auto` (default), Task A and Task B both use **Gemini**. The current runtime code does not run local causal LLM inference.

**Task B** reranking uses Gemini; embeddings stay local (`LOCAL_EMBEDDING_MODEL`).
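
The retrieve-then-rerank shape of Task B can be illustrated in a few lines. This is a sketch under stated assumptions, not the code in `app/recommendation_pipeline.py`: the catalog row layout (`name`, `embedding`), the cosine scoring, and the `rerank` callable standing in for the Gemini step are all hypothetical.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(
    query_vec: Sequence[float],
    catalog: List[dict],
    rerank: Callable[[List[dict]], List[dict]],
    top_k_retrieval: int = 40,
    top_n_final: int = 10,
) -> List[dict]:
    """Stage 1: local embedding retrieval. Stage 2: external reranker."""
    ranked = sorted(
        catalog,
        key=lambda row: cosine(query_vec, row["embedding"]),
        reverse=True,
    )
    candidates = ranked[:top_k_retrieval]
    # `rerank` is a placeholder for the LLM call; it returns candidates reordered.
    return rerank(candidates)[:top_n_final]
```

Keeping the first stage local means only `top_k_retrieval` candidates are ever sent to the reranker, which bounds the size of the prompt.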

**Recommendation index** (needs Yelp `business.json` on your machine, e.g. `../yelp_dataset/extracted/` from a parent workspace):

```bash
python scripts/build_business_catalog.py --max-rows 30000 --only-open
python scripts/embed_catalog.py --batch-size 64
```

Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` for `embed_catalog.py` and at API runtime.

**Start API:**

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8080
# or: PORT=8080 python -m app.main
```

### Docker

Build with a Hub token available **during build** (anonymous access works for public models but may hit rate limits):

```bash
docker build -t dcn-llm-agent-challenge \
  --build-arg HF_TOKEN="$HF_TOKEN" \
  --build-arg HUGGING_FACE_HUB_TOKEN="$HUGGING_FACE_HUB_TOKEN" .
docker run --env-file .env -p 7860:7860 dcn-llm-agent-challenge
```

Or with Docker Compose:

```bash
export HF_TOKEN=hf_...   # optional; must be visible to `docker build`, not only the container
docker compose up --build -d
```

Default compose maps **`7860:7860`**. The image bakes **`/code/data/business_catalog_embedded.jsonl`** and **`/code/data/task_a_reviews_embedded.jsonl`** at build time (or stubs if Yelp JSON is missing). Override with a bind mount, e.g. `./data:/code/data`, if you rebuild those files locally.

The Docker image sets **`HF_HUB_OFFLINE=1`** and **`TRANSFORMERS_OFFLINE=1`** so the running container does not call the Hugging Face Hub. During **`docker build`**, **`snapshot_download`** copies model **files** into `/models/huggingface` (and the stub JSONL is embedded). Loading weights **into RAM** during build is disabled by default (**`DOCKER_BUILD_SKIP_LLM_WARM=1`**) because HF build VMs often **OOM (exit 137)** when loading Qwen, and weights held in RAM would not persist into the final image anyway.

At **container start**, **`STARTUP_PREWARM=all`** (default) loads the shared embedding model and preloads Task A RAG + Task B catalog indexes. Expect **~1–2 minutes** on CPU after deploy while logs show `Loading shared …`; then both endpoints stay fast. Disable with **`SKIP_STARTUP_PREWARM=1`** (not recommended on Spaces).

### Smoke checks

OpenAPI: `http://localhost:7860/docs` when using Docker (port **7860**). For a local run, the commands above serve on **8080** (the `--port` flag for `uvicorn`, or `PORT` for `python -m app.main`).
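
A smoke check against the two GET routes might look like the sketch below. It assumes only that `/health` and `/` answer with HTTP 200 when the service is up; the response bodies are not inspected because their shape is not documented here. The function names are hypothetical.

```python
import urllib.request
from typing import Callable, Dict, Tuple


def fetch(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """GET a URL and return (status code, body text)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


def smoke_check(
    base_url: str = "http://localhost:7860",
    fetch_fn: Callable[[str], Tuple[int, str]] = fetch,
) -> Dict[str, bool]:
    """Return a path -> ok mapping for the GET endpoints."""
    results = {}
    for path in ("/health", "/"):
        try:
            status, _body = fetch_fn(base_url.rstrip("/") + path)
            results[path] = status == 200
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses.
            results[path] = False
    return results


if __name__ == "__main__":
    print(smoke_check())  # pass base_url="http://localhost:8080" for the local run
```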

### Layout

| Path | Role |
|------|------|
| `app/main.py` | FastAPI routes |
| [`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md) | Agent steps, reproducibility, paper hooks (Nigerian English, fallbacks) |
| `app/user_modeling.py`, `app/user_modeling_prompt.py`, `app/task_a_rag.py` | Task 1 Gemini generation + Yelp review RAG |
| `app/recommendation_pipeline.py` | Task 2 retrieval + rerank |
| `scripts/build_business_catalog.py` | Yelp → catalog JSONL |
| `scripts/embed_catalog.py` | Embed catalog (local sentence-transformers) |
| `scripts/build_task_a_review_rag.py` | Yelp reviews (+ businesses) → Task A embedded RAG JSONL |
| `scripts/docker_build_assets.py` | Docker build: HF prefetch + catalog + Task A RAG |
| `env.example` | Copy to `.env` |
| `NOTICES.txt` | Data / cloud disclosures |

Optional: bind-mount Yelp `review.json` + `business.json` into the build container at build time so Docker bakes real Task A/B indexes instead of stubs.