---
title: DSN
emoji: 🏒
colorFrom: indigo
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: DSN HACKATHON
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
This Space is configured as **`sdk: docker`**. The image builds from `Dockerfile` (CPU-only PyTorch so CUDA wheels don’t OOM the builder). During **`docker build`**, models are **`snapshot_download`**’d into `/models/huggingface` **without loading the full LLM into RAM**; **`SentenceTransformer`** embeds a **stub** or Yelp-derived catalog plus **`data/task_a_reviews_embedded.jsonl`** (review RAG for Task A). See `scripts/docker_build_assets.py`.
Task **A**: persona + product → rating/review via **Gemini API** (default) or optional **local** Qwen, plus **retrieved Yelp review snippets** from the baked JSONL. Task **B**: **local** sentence-transformer retrieval over businesses plus **Gemini** (or local) reranking.
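The Task B flow described above (embedding retrieval first, then reranking a shortlist) can be sketched as follows. This is a minimal illustration, not the code in `app/recommendation_pipeline.py`: the helper names, the `embedding` and `stars` fields, and the toy vectors are all assumptions, and the real reranker is an LLM call rather than a numeric score.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, catalog, top_k):
    """Stage 1: keep the top_k catalog items closest to the query embedding."""
    scored = [(cosine(query_vec, item["embedding"]), item) for item in catalog]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_k]]


def rerank(candidates, score_fn, top_n):
    """Stage 2: reorder the shortlist with a second scorer and keep top_n.

    score_fn stands in for the Gemini (or local) reranker; it is hypothetical.
    """
    return sorted(candidates, key=score_fn, reverse=True)[:top_n]


# Toy data: the field names are illustrative, not the real catalog schema.
catalog = [
    {"name": "A", "embedding": [1.0, 0.0], "stars": 3.0},
    {"name": "B", "embedding": [0.9, 0.1], "stars": 4.5},
    {"name": "C", "embedding": [0.0, 1.0], "stars": 5.0},
]
shortlist = retrieve([1.0, 0.0], catalog, top_k=2)
final = rerank(shortlist, score_fn=lambda item: item["stars"], top_n=1)
```

The two-stage split keeps the expensive reranker off the full catalog: only the shortlist from the cheap vector search reaches it.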
**Secrets (Hugging Face Space):** **`GEMINI_API_KEY`** (or `GOOGLE_API_KEY`), required for generation when `GENERATION_BACKEND=gemini`. Optional **`HF_TOKEN`** for **Docker build** only (embedder download). Never commit keys in the repo.
---
## DSN × BCT LLM Agent Challenge: API package
**Deadline:** 24 May 2026 end of day (organiser time). Submit solution paper + repo + container link via the official form.
Step-by-step agent narrative (for judges and your paper): **[`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md)**.
### Deliverables checklist
- [ ] Working URL or Docker image for this API (judges use POST endpoints below).
- [ ] GitHub (or equivalent) with this repo; do not commit `.env` or Yelp raw JSON.
- [ ] Solution paper PDF (4–8 pages): point to `AGENT_WORKFLOW.md` for architecture; add experiments (e.g. RAG on/off, Nigerian prompt on/off), limits, Nigerian English design note.
- [ ] Disclosures in paper: base HF models, Yelp-derived data / RAG index, embedding catalog build.
### Endpoints
| Method | Path |
|--------|------|
| GET | `/health`, `/` |
| POST | `/user-modeling` (aliases: `/task-1`, `/task_a`) |
| POST | `/recommendation` (aliases: `/task-2`, `/task_b`) |
### Request bodies
**Task 1:** `{"persona": "<multiline user snapshot; optional line user_id: ...>", "product": "<business facts>", "include_raw": false}`. The response includes `rag_snippets_used`.
**Task 2:** `{"persona": "...", "city": null, "state": null, "chat_history": [], "top_k_retrieval": 40, "top_n_final": 10}`
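A small client for the two request bodies above might look like the sketch below. It uses only the standard library; `task1_payload`, `task2_payload`, and `post_json` are hypothetical helper names, and the base URL in the usage comment assumes the Docker port documented in this README.

```python
import json
import urllib.request


def task1_payload(persona, product, include_raw=False):
    """Body for POST /user-modeling (Task 1)."""
    return {"persona": persona, "product": product, "include_raw": include_raw}


def task2_payload(persona, city=None, state=None, chat_history=None,
                  top_k_retrieval=40, top_n_final=10):
    """Body for POST /recommendation (Task 2), using the defaults shown above."""
    return {
        "persona": persona,
        "city": city,
        "state": state,
        "chat_history": list(chat_history or []),
        "top_k_retrieval": top_k_retrieval,
        "top_n_final": top_n_final,
    }


def post_json(base_url, path, payload, timeout=120):
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, assuming the API is reachable on the Docker port:
#   result = post_json("http://localhost:7860", "/user-modeling",
#                      task1_payload("user_id: u1\nlikes spicy food", "Suya spot, Lagos"))
#   print(result.get("rag_snippets_used"))
```

The generous default timeout is a guess to allow for slow first requests on CPU; adjust it to your deployment.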
### Local run (clone this repo)
From the **repository root** (this folder):
```bash
cp env.example .env
pip install -r requirements.txt
```
**Task A review index** (Yelp `review.json` + `business.json`):
```bash
python scripts/build_task_a_review_rag.py \
--review-json path/to/yelp_academic_dataset_review.json \
--business-json path/to/yelp_academic_dataset_business.json \
--output data/task_a_reviews_embedded.jsonl \
--max-rows 12000
```
Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` (or `TASK_A_EMBEDDING_MODEL`) at build and runtime. Omit the file only for quick tests (generation runs without RAG).
**Generation:** set `GEMINI_API_KEY` in `.env` (see `env.example`). With `GENERATION_BACKEND=gemini` (default), Task A and Task B use **`GEMINI_MODEL`** (default `gemini-2.0-flash`). Set `GENERATION_BACKEND=local` to use on-device Qwen instead.
**Task B** reranking uses Gemini when configured; embeddings stay local (`LOCAL_EMBEDDING_MODEL`).
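To make the backend selection above concrete, here is one way the environment variables could be resolved. It is a sketch under stated assumptions, not the app's actual loader: `resolve_generation_config` is a hypothetical name, and preferring `GEMINI_API_KEY` over `GOOGLE_API_KEY` when both are set is an assumption.

```python
import os


def resolve_generation_config(env=None):
    """Return the generation settings implied by the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    backend = env.get("GENERATION_BACKEND", "gemini").strip().lower()

    if backend == "gemini":
        # Assumption: GEMINI_API_KEY takes precedence when both keys are present.
        api_key = env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY")
        if not api_key:
            raise RuntimeError(
                "GENERATION_BACKEND=gemini requires GEMINI_API_KEY or GOOGLE_API_KEY"
            )
        return {
            "backend": "gemini",
            "model": env.get("GEMINI_MODEL", "gemini-2.0-flash"),
            "api_key": api_key,
        }

    if backend == "local":
        # On-device Qwen: no API key is needed.
        return {"backend": "local", "model": None, "api_key": None}

    raise ValueError(f"Unknown GENERATION_BACKEND: {backend!r}")
```

Failing fast on a missing key at startup is a design choice in this sketch; it surfaces a misconfigured Space before the first request rather than during it.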
**Recommendation index** (needs Yelp `business.json` on your machine, e.g. `../yelp_dataset/extracted/` from a parent workspace):
```bash
python scripts/build_business_catalog.py --max-rows 30000 --only-open
python scripts/embed_catalog.py --batch-size 64
```
Use the same `TASK_B_LOCAL_EMBEDDING_MODEL` for `embed_catalog.py` and at API runtime.
**Start API:**
```bash
uvicorn app.main:app --host 0.0.0.0 --port 8080
# or: PORT=8080 python -m app.main
```
### Docker
Build with a Hub token available **during build** (anonymous access works for public models but hits rate limits):
```bash
docker build -t dcn-llm-agent-challenge \
--build-arg HF_TOKEN="$HF_TOKEN" \
--build-arg HUGGING_FACE_HUB_TOKEN="$HUGGING_FACE_HUB_TOKEN" .
docker run --env-file .env -p 7860:7860 dcn-llm-agent-challenge
```
Alternatively, with Docker Compose:
```bash
export HF_TOKEN=hf_... # optional; must be visible to `docker build`, not only the container
docker compose up --build -d
```
Default compose maps **`7860:7860`**. The image bakes **`/code/data/business_catalog_embedded.jsonl`** and **`/code/data/task_a_reviews_embedded.jsonl`** at build time (or stubs if Yelp JSON is missing). Override with a bind mount, e.g. `./data:/code/data`, if you rebuild those files locally.
The Docker image sets **`HF_HUB_OFFLINE=1`** and **`TRANSFORMERS_OFFLINE=1`** so the running container does not call the Hugging Face Hub. During **`docker build`**, **`snapshot_download`** copies model **files** into `/models/huggingface` (and stub JSONL is embedded). Loading weights **into RAM** during build is disabled by default (**`DOCKER_BUILD_SKIP_LLM_WARM=1`**) because HF build VMs often **OOM (exit 137)** when loading Qwen; that RAM would not persist in the final image anyway.
At **container start**, **`STARTUP_PREWARM=all`** (default) loads **one shared** embedding model and **one shared** causal LM (`app/shared_models.py`), then Task A RAG + Task B catalog, so **`/task-2`** does not pay a second full Qwen load. Expect **~1–2 minutes** on CPU after deploy while logs show `Loading shared …`; then both endpoints stay fast. Disable with **`SKIP_STARTUP_PREWARM=1`** (not recommended on Spaces).
### Smoke checks
OpenAPI: `http://localhost:7860/docs` when using Docker (port **7860**). Local `uvicorn` defaults to **8080** unless you set `PORT`.
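A scripted smoke check could poll `/health` on whichever port applies. The sketch below assumes `localhost` and the two ports named above; `base_url` and `check_health` are hypothetical helpers, and treating any HTTP 200 as healthy is an assumption about the endpoint.

```python
import urllib.error
import urllib.request


def base_url(docker=True, host="localhost", port=None):
    """Pick the documented default port: 7860 for Docker, 8080 for local uvicorn."""
    if port is None:
        port = 7860 if docker else 8080
    return f"http://{host}:{port}"


def check_health(url, timeout=10):
    """Return True if GET <url>/health answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url + "/health", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


# Usage:
#   ok = check_health(base_url(docker=True))
#   print("healthy" if ok else "not reachable yet")
```

Right after deploy the check may fail while the shared models are still loading, so retrying for a couple of minutes is reasonable.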
### Layout
| Path | Role |
|------|------|
| `app/main.py` | FastAPI routes |
| [`AGENT_WORKFLOW.md`](AGENT_WORKFLOW.md) | Agent steps, reproducibility, paper hooks (Nigerian English, fallbacks) |
| `app/user_modeling.py`, `app/user_modeling_prompt.py`, `app/task_a_rag.py` | Task 1 local LLM + Yelp review RAG |
| `app/recommendation_pipeline.py` | Task 2 retrieval + rerank |
| `scripts/build_business_catalog.py` | Yelp → catalog JSONL |
| `scripts/embed_catalog.py` | Embed catalog (local sentence-transformers) |
| `scripts/build_task_a_review_rag.py` | Yelp reviews (+ businesses) → Task A embedded RAG JSONL |
| `scripts/docker_build_assets.py` | Docker build: HF prefetch + catalog + Task A RAG |
| `env.example` | Copy to `.env` |
| `NOTICES.txt` | Data / cloud disclosures |
Optional: bind-mount Yelp `review.json` + `business.json` so they are available at build time and Docker bakes real Task A/B indexes instead of stubs.