title: DSN
emoji: 🏢
colorFrom: indigo
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: DSN HACKATHON
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
This Space is configured with sdk: docker. The image builds from the Dockerfile and installs CPU-only PyTorch so that CUDA wheels do not OOM the builder. During docker build, models are fetched with **snapshot_download** into /models/huggingface without loading the full LLM into RAM; a SentenceTransformer then embeds a stub or Yelp-derived catalog plus data/task_a_reviews_embedded.jsonl (review RAG for Task A). See scripts/docker_build_assets.py.
Task A: persona + product → rating/review using a local causal LM (default Qwen2.5-1.5B-Instruct) and retrieved Yelp review snippets from the baked JSONL (semantic + optional user_id: match). Task B: local sentence-transformer retrieval over businesses plus local causal LM reranking.
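The Task A retrieval idea (semantic similarity plus an optional user_id match) can be sketched in pure Python. This is a minimal illustration, not the code in app/task_a_rag.py: the record keys (`user_id`, `text`, `embedding`), the function name `retrieve_snippets`, and the additive `user_bonus` are all assumptions made for the example.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_snippets(query_vec: list[float], records: list[dict],
                      user_id: Optional[str] = None,
                      top_k: int = 3, user_bonus: float = 0.2) -> list[str]:
    """Rank records by cosine similarity, boosting records from the same user.

    `records` holds dicts with the hypothetical keys
    'user_id', 'text' and 'embedding'.
    """
    scored = []
    for rec in records:
        score = cosine(query_vec, rec["embedding"])
        if user_id is not None and rec.get("user_id") == user_id:
            score += user_bonus  # assumption: a simple additive boost
        scored.append((score, rec["text"]))
    # Highest score first; keep only the snippet text.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]
```

In the real pipeline the query vector would come from the sentence-transformer and the records from data/task_a_reviews_embedded.jsonl; how the user_id match is actually weighted is not specified here.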
Secrets: Optional HF_TOKEN / HUGGING_FACE_HUB_TOKEN for Hugging Face Hub during docker build (runtime secrets alone often do not reach docker build). On Spaces: add the token under Space secrets and enable build-time / Docker build args if your UI offers it; locally use docker build --build-arg HF_TOKEN=.... Never commit tokens.
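Because either variable name may carry the token, a build script might resolve it along these lines. This is a hedged sketch only: `resolve_hf_token` is a hypothetical helper, and the precedence (HF_TOKEN first) is an assumption rather than documented behaviour of scripts/docker_build_assets.py.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the text above; the order is an assumption.
TOKEN_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty token found, or None for anonymous access."""
    env = os.environ if env is None else env
    for name in TOKEN_VARS:
        value = (env.get(name) or "").strip()
        if value:
            return value
    return None
```

Returning None rather than raising keeps the token optional, which matches the anonymous fallback for public models.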
DSN × BCT LLM Agent Challenge — API package
Deadline: 24 May 2026 end of day (organiser time). Submit solution paper + repo + container link via the official form.
Step-by-step agent narrative (for judges and your paper): AGENT_WORKFLOW.md.
Deliverables checklist
- Working URL or Docker image for this API (judges use POST endpoints below).
- GitHub (or equivalent) with this repo; do not commit .env or Yelp raw JSON.
- Solution paper PDF (4–8 pages): point to AGENT_WORKFLOW.md for architecture; add experiments (e.g. RAG on/off, Nigerian prompt on/off), limits, Nigerian English design note.
- Disclosures in paper: base HF models, Yelp-derived data / RAG index, embedding catalog build.
Endpoints
| Method | Path |
|---|---|
| GET | /health, / |
| POST | /user-modeling (aliases: /task-1, /task_a) |
| POST | /recommendation (aliases: /task-2, /task_b) |
Request bodies
Task 1: {"persona": "<multiline user snapshot; optional line user_id: ...>", "product": "<business facts>", "include_raw": false} — response includes rag_snippets_used.
Task 2: {"persona": "...", "city": null, "state": null, "chat_history": [], "top_k_retrieval": 40, "top_n_final": 10}
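The two request bodies above can be built and posted with the Python standard library. The sketch below is illustrative: the helper names (`task1_body`, `task2_body`, `post_json`) are hypothetical, the field names and defaults are copied from the bodies above, and the response is treated as opaque JSON because its full schema is not described here.

```python
import json
import urllib.request
from typing import Optional


def task1_body(persona: str, product: str, include_raw: bool = False) -> dict:
    """Body for POST /user-modeling (Task 1)."""
    return {"persona": persona, "product": product, "include_raw": include_raw}


def task2_body(persona: str, city: Optional[str] = None,
               state: Optional[str] = None,
               chat_history: Optional[list] = None,
               top_k_retrieval: int = 40, top_n_final: int = 10) -> dict:
    """Body for POST /recommendation (Task 2), using the defaults shown above."""
    return {
        "persona": persona,
        "city": city,
        "state": state,
        "chat_history": [] if chat_history is None else chat_history,
        "top_k_retrieval": top_k_retrieval,
        "top_n_final": top_n_final,
    }


def post_json(base_url: str, path: str, body: dict, timeout: float = 120.0) -> dict:
    """POST `body` as JSON to base_url + path and decode the JSON response."""
    req = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running on port 7860, a call such as `post_json("http://localhost:7860", "/user-modeling", task1_body(persona, product))` would exercise Task 1; the generous timeout is a guess to allow for CPU inference.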
Local run (clone this repo)
From the repository root (this folder):
cp env.example .env
pip install -r requirements.txt
Task A review index (Yelp review.json + business.json):
python scripts/build_task_a_review_rag.py \
--review-json path/to/yelp_academic_dataset_review.json \
--business-json path/to/yelp_academic_dataset_business.json \
--output data/task_a_reviews_embedded.jsonl \
--max-rows 12000
Use the same TASK_B_LOCAL_EMBEDDING_MODEL (or TASK_A_EMBEDDING_MODEL) at build time and at runtime, so that query vectors are comparable with the indexed ones. Omit data/task_a_reviews_embedded.jsonl only for quick tests (generation then runs without RAG).
Task B uses TASK_B_LOCAL_LLM_MODEL for reranking (default Qwen2.5-1.5B-Instruct; first run may download weights from Hugging Face).
Recommendation index (needs Yelp business.json on your machine, e.g. ../yelp_dataset/extracted/ from a parent workspace):
python scripts/build_business_catalog.py --max-rows 30000 --only-open
python scripts/embed_catalog.py --batch-size 64
Use the same TASK_B_LOCAL_EMBEDDING_MODEL for embed_catalog.py and at API runtime.
Start API:
uvicorn app.main:app --host 0.0.0.0 --port 8080
# or: PORT=8080 python -m app.main
Docker
Build with a Hub token available during the build (anonymous access works for public models but can hit rate limits):
docker build -t dcn-llm-agent-challenge \
--build-arg HF_TOKEN="$HF_TOKEN" \
--build-arg HUGGING_FACE_HUB_TOKEN="$HUGGING_FACE_HUB_TOKEN" .
docker run --env-file .env -p 7860:7860 dcn-llm-agent-challenge
Alternatively, with Docker Compose:
export HF_TOKEN=hf_... # optional; must be visible to `docker build`, not only the container
docker compose up --build -d
Default compose maps 7860:7860. The image bakes /code/data/business_catalog_embedded.jsonl and /code/data/task_a_reviews_embedded.jsonl at build time (or stubs if Yelp JSON is missing). Override with a bind mount, e.g. ./data:/code/data, if you rebuild those files locally.
The Docker image sets HF_HUB_OFFLINE=1 and TRANSFORMERS_OFFLINE=1 so the running container does not call the Hugging Face Hub. During docker build, snapshot_download copies model files into /models/huggingface (and stub JSONL is embedded). Loading weights into RAM during the build is disabled by default (DOCKER_BUILD_SKIP_LLM_WARM=1) because HF build VMs often OOM (exit 137) when loading Qwen; the loaded weights would not persist in the final image anyway.
At container start, STARTUP_PREWARM=all (default) loads one shared embedding model and one shared causal LM (app/shared_models.py), then the Task A RAG index and the Task B catalog, so /task-2 does not pay for a second full Qwen load. Expect roughly 1–2 minutes on CPU after deploy while logs show Loading shared …; after that, both endpoints respond without a model-loading delay. Disable with SKIP_STARTUP_PREWARM=1 (not recommended on Spaces).
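The load-once-and-share idea can be sketched with a small thread-safe cache. This is not the content of app/shared_models.py; `get_shared`, `prewarm`, and the loader callables are hypothetical stand-ins for the real embedding-model and causal-LM loading.

```python
import threading
from typing import Any, Callable

_lock = threading.Lock()
_cache: dict[str, Any] = {}


def get_shared(name: str, loader: Callable[[], Any]) -> Any:
    """Return the cached object for `name`, calling `loader` at most once."""
    with _lock:
        if name not in _cache:
            _cache[name] = loader()  # expensive: runs only on first request
        return _cache[name]


def prewarm(loaders: dict[str, Callable[[], Any]]) -> list[str]:
    """Eagerly load every entry at startup; returns the names that were warmed."""
    for name, loader in loaders.items():
        get_shared(name, loader)
    return list(loaders)
```

Both tasks would then call `get_shared("llm", ...)` and receive the same object, which is why a second endpoint does not trigger a second load. Holding the lock during loading is a deliberate simplification: concurrent callers wait rather than load a duplicate.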
Smoke checks
OpenAPI: http://localhost:7860/docs when using Docker (port 7860). Local uvicorn defaults to 8080 unless you set PORT.
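A scripted smoke check against GET /health might look like the following. It is a sketch under stated assumptions: `health_url` and `is_healthy` are hypothetical helpers, and the check only assumes that a healthy service answers with HTTP 200, not any particular response body.

```python
import urllib.error
import urllib.request


def health_url(port: int = 7860, host: str = "localhost") -> str:
    """Build the /health URL (7860 for Docker, 8080 for local uvicorn)."""
    return f"http://{host}:{port}/health"


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """True if GET `url` returns HTTP 200; False on errors or other statuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and non-2xx responses.
        return False
```

During the startup prewarm window the service may not answer yet, so a caller would likely retry `is_healthy(health_url())` for a couple of minutes before treating the deploy as failed.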
Layout
| Path | Role |
|---|---|
| app/main.py | FastAPI routes |
| AGENT_WORKFLOW.md | Agent steps, reproducibility, paper hooks (Nigerian English, fallbacks) |
| app/user_modeling.py, app/user_modeling_prompt.py, app/task_a_rag.py | Task 1 local LLM + Yelp review RAG |
| app/recommendation_pipeline.py | Task 2 retrieval + rerank |
| scripts/build_business_catalog.py | Yelp → catalog JSONL |
| scripts/embed_catalog.py | Embed catalog (local sentence-transformers) |
| scripts/build_task_a_review_rag.py | Yelp reviews (+ businesses) → Task A embedded RAG JSONL |
| scripts/docker_build_assets.py | Docker build: HF prefetch + catalog + Task A RAG |
| env.example | Copy to .env |
| NOTICES.txt | Data / cloud disclosures |
Optional: make Yelp review.json and business.json available to the container at build time (for example via a bind mount) so that Docker bakes real Task A/B indexes instead of stubs.