---
title: DSN
emoji: 🏢
colorFrom: indigo
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: DSN HACKATHON
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

This Space is configured as **`sdk: docker`**. The image builds from `Dockerfile` (CPU-only PyTorch so CUDA wheels don’t OOM the builder). During **`docker build`**, models are **`snapshot_download`**’d into `/models/huggingface` **without loading the full LLM into RAM**; **`SentenceTransformer`** embeds a **stub** or Yelp-derived catalog plus **`data/task_a_reviews_embedded.jsonl`** (review RAG for Task A). See `scripts/docker_build_assets.py`.

Task **A**: persona + product → rating/review via **Gemini API** and retrieved Yelp review snippets from the baked JSONL. Task **B**: local sentence-transformer retrieval over businesses plus **Gemini** reranking.
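The Task B flow (local embedding retrieval, then reranking) can be sketched in a few lines. This is only an illustration under stated assumptions, not the code in `app/recommendation_pipeline.py`: the record fields `name` and `embedding` are hypothetical, and the `rerank` callable stands in for the Gemini reranking step.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(
    query_vec: Sequence[float],
    catalog: List[Dict],
    rerank: Callable[[List[Dict]], List[Dict]],
    top_k_retrieval: int = 40,
    top_n_final: int = 10,
) -> List[Dict]:
    """Keep the top_k_retrieval nearest businesses, rerank them, return top_n_final."""
    # Stage 1: local vector retrieval (hypothetical "embedding" field).
    by_similarity = sorted(
        catalog,
        key=lambda item: cosine(query_vec, item["embedding"]),
        reverse=True,
    )
    candidates = by_similarity[:top_k_retrieval]
    # Stage 2: rerank the shortlist (placeholder for the Gemini call).
    return rerank(candidates)[:top_n_final]
```

Sending only a shortlist to the reranker keeps the generation call small regardless of catalog size.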

**Secrets (Hugging Face Space):** **`GEMINI_API_KEY`** (or `GOOGLE_API_KEY`) — required for generation when `GENERATION_BACKEND=gemini`. Optional **`HF_TOKEN`** for **Docker build** only (embedder download). Never commit keys in the repo.
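For illustration, key resolution could look like the sketch below. The helper name `resolve_gemini_key` is hypothetical, and the precedence (`GEMINI_API_KEY` before `GOOGLE_API_KEY`) is an assumption; the app's actual lookup order may differ.

```python
import os
from typing import Mapping, Optional


def resolve_gemini_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty key found, or None if neither is set.

    Assumption: GEMINI_API_KEY takes precedence over GOOGLE_API_KEY.
    """
    for name in ("GEMINI_API_KEY", "GOOGLE_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None
```

Reading the key from the environment at runtime keeps it out of the repository and the image layers.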

---

## DSN × BCT LLM Agent Challenge — API package

**Deadline:** 24 May 2026 end of day (organiser time). Submit solution paper + repo + container link via the official form.

Step-by-step agent narrative (for judges and your paper): **[`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md)**.

### Deliverables checklist

- [ ] Working URL or Docker image for this API (judges use POST endpoints below).
- [ ] GitHub (or equivalent) with this repo; do not commit `.env` or Yelp raw JSON.
- [ ] Solution paper PDF (4–8 pages): point to `AGENT_WORKFLOW.md` for architecture; add experiments (e.g. RAG on/off, Nigerian prompt on/off), limitations, and a Nigerian English design note.
- [ ] Disclosures in paper: base HF models, Yelp-derived data / RAG index, embedding catalog build.

### Endpoints

| Method | Path |
|--------|------|
| GET | `/health`, `/` |
| POST | `/user-modeling` (aliases: `/task-1`, `/task_a`) |
| POST | `/recommendation` (aliases: `/task-2`, `/task_b`) |

### Request bodies

**Task 1** (Task A, `POST /user-modeling`): `{"persona": "<multiline user snapshot; optional line user_id: ...>", "product": "<business facts>", "include_raw": false}`. The response includes `rag_snippets_used`.

**Task 2** (Task B, `POST /recommendation`): `{"persona": "...", "city": null, "state": null, "chat_history": [], "top_k_retrieval": 40, "top_n_final": 10}`
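A minimal client sketch for the two POST endpoints, using only the Python standard library. The base URL assumes the Docker default port, and the persona and product strings are made-up placeholders, not real data.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumption: Docker default; use 8080 for a local run


def build_request(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for one of the API endpoints."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, base_url: str = BASE_URL, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder payloads that follow the request shapes documented above.
task1_payload = {
    "persona": "user_id: example-user\nPrefers casual spots; rates service strictly.",
    "product": "Example Grill, casual dining, mid-price.",
    "include_raw": False,
}
task2_payload = {
    "persona": "Looking for a quiet cafe to work from.",
    "city": None,
    "state": None,
    "chat_history": [],
    "top_k_retrieval": 40,
    "top_n_final": 10,
}

# With the API running:
# print(post_json("/user-modeling", task1_payload))
# print(post_json("/recommendation", task2_payload))
```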

### Local run (clone this repo)

From the **repository root** (this folder):

```bash
cp env.example .env
pip install -r requirements.txt
```

**Task A review index** (Yelp `review.json` + `business.json`):

```bash
python scripts/build_task_a_review_rag.py \
  --review-json path/to/yelp_academic_dataset_review.json \
  --business-json path/to/yelp_academic_dataset_business.json \
  --output data/task_a_reviews_embedded.jsonl \
  --max-rows 12000
```

Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` (or `TASK_A_EMBEDDING_MODEL`) at build and runtime. Omit the file only for quick tests (generation runs without RAG).
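The "omit the file for quick tests" behaviour can be pictured as an optional JSONL load that falls back to an empty index. This is a sketch under assumptions, not the code in `app/task_a_rag.py`; the function name `load_embedded_jsonl` is hypothetical, and it makes no claim about the record schema.

```python
import json
from pathlib import Path
from typing import Dict, List, Union


def load_embedded_jsonl(path: Union[str, Path]) -> List[Dict]:
    """Load one JSON object per line; return [] if the file is missing.

    An empty list here means "no RAG snippets available", so the caller
    can still generate without retrieval.
    """
    p = Path(path)
    if not p.is_file():
        return []
    records = []
    with p.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

Because the vectors in the file come from one specific embedding model, queries embedded with a different model at runtime would not be comparable to them, which is why the model setting must match.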

**Generation:** set `GEMINI_API_KEY` in `.env` (see `env.example`). With `GENERATION_BACKEND=gemini` or `auto` (default), Task A and Task B both use **Gemini**. Local causal LLM inference is not used by the current runtime code.

**Task B** reranking uses Gemini; embeddings stay local (`LOCAL_EMBEDDING_MODEL`).

**Recommendation index** (needs Yelp `business.json` on your machine, e.g. `../yelp_dataset/extracted/` from a parent workspace):

```bash
python scripts/build_business_catalog.py --max-rows 30000 --only-open
python scripts/embed_catalog.py --batch-size 64
```

Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` for `embed_catalog.py` and at API runtime.

**Start API:**

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8080
# or: PORT=8080 python -m app.main
```

### Docker

Build with a Hub token available **during build** (anonymous access works for public models but may hit rate limits):

```bash
docker build -t dcn-llm-agent-challenge \
  --build-arg HF_TOKEN="$HF_TOKEN" \
  --build-arg HUGGING_FACE_HUB_TOKEN="$HUGGING_FACE_HUB_TOKEN" .
docker run --env-file .env -p 7860:7860 dcn-llm-agent-challenge
```

```bash
export HF_TOKEN=hf_...   # optional; must be visible to `docker build`, not only the container
docker compose up --build -d
```

Default compose maps **`7860:7860`**. The image bakes **`/code/data/business_catalog_embedded.jsonl`** and **`/code/data/task_a_reviews_embedded.jsonl`** at build time (or stubs if Yelp JSON is missing). Override with a bind mount, e.g. `./data:/code/data`, if you rebuild those files locally.

The Docker image sets **`HF_HUB_OFFLINE=1`** and **`TRANSFORMERS_OFFLINE=1`** so the running container does not call the Hugging Face Hub. During **`docker build`**, **`snapshot_download`** copies model **files** into `/models/huggingface` (and the stub JSONL is embedded). Loading weights **into RAM** during build is disabled by default (**`DOCKER_BUILD_SKIP_LLM_WARM=1`**) because HF build VMs often **OOM (exit 137)** when loading Qwen; weights loaded into RAM would not persist in the final image anyway.

At **container start**, **`STARTUP_PREWARM=all`** (default) loads the shared embedding model and preloads Task A RAG + Task B catalog indexes. Expect **~1–2 minutes** on CPU after deploy while logs show `Loading shared …`; then both endpoints stay fast. Disable with **`SKIP_STARTUP_PREWARM=1`** (not recommended on Spaces).
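Because the container may take a minute or two to warm up, a client or deploy script can poll `/health` before sending real requests. The sketch below assumes `/health` returns HTTP 200 once the server is accepting requests; whether it responds before prewarm finishes is not stated here. The function names are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_health(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    url: str,
    deadline_s: float = 180.0,
    interval_s: float = 5.0,
    probe: Callable[[str], bool] = probe_health,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the probe succeeds or deadline_s seconds have elapsed."""
    start = clock()
    while True:
        if probe(url):
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(interval_s)


# Example, with the container running:
# wait_until_healthy("http://localhost:7860/health")
```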

### Smoke checks

OpenAPI docs: `http://localhost:7860/docs` when using Docker (port **7860**). For a local run, use port **8080** instead, as in the `uvicorn` command above, unless you set a different `PORT`.

### Layout

| Path | Role |
|------|------|
| `app/main.py` | FastAPI routes |
| [`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md) | Agent steps, reproducibility, paper hooks (Nigerian English, fallbacks) |
| `app/user_modeling.py`, `app/user_modeling_prompt.py`, `app/task_a_rag.py` | Task 1 Gemini generation + Yelp review RAG |
| `app/recommendation_pipeline.py` | Task 2 retrieval + rerank |
| `scripts/build_business_catalog.py` | Yelp → catalog JSONL |
| `scripts/embed_catalog.py` | Embed catalog (local sentence-transformers) |
| `scripts/build_task_a_review_rag.py` | Yelp reviews (+ businesses) → Task A embedded RAG JSONL |
| `scripts/docker_build_assets.py` | Docker build: HF prefetch + catalog + Task A RAG |
| `env.example` | Copy to `.env` |
| `NOTICES.txt` | Data / cloud disclosures |

Optional: bind-mount Yelp `review.json` and `business.json` at build time so that Docker bakes real Task A/B indexes instead of stubs.