# Contributing to why-agent

Thank you for your interest in contributing! This guide covers the development setup, testing workflow, and code quality standards.

---

## Prerequisites

- **Python 3.12+** (check `.python-version`)
- **uv**: modern Python package manager ([install](https://docs.astral.sh/uv/))
- **Node.js 20+**: for the Next.js frontend (optional, only needed if you modify the frontend)
- **Git**: for version control

---

## Development Setup

### 1. Clone and install dependencies

```bash
git clone https://github.com/Isa-Mapo-Hackathon/why-agent.git
cd why-agent
uv sync
```

This installs both runtime and dev dependencies (pytest, ruff, pyright).

### 2. Set up environment

```bash
cp .env.example .env
```

Then edit `.env` with your secrets:

- `MODEL_BACKEND`: use `minimax` or `replay` for local development
- `MINIMAX_API_KEY`: get it from the [MiniMax dashboard](https://platform.minimaxi.chat/)
- `PARQUET_DIR`: defaults to `data/parquet`
- `SEMANTIC_LAYER_PATH`: defaults to `data/semantic_layer.yml`

### 3. Verify setup

```bash
uv run pytest -v
```

This should run 15 or more tests without errors.

---

## Running the Application

### Option A: Streamlit (Python-only, simplest)

```bash
uv run streamlit run streamlit_app.py
```

Opens at `http://localhost:8501`. Use the Streamlit UI to ask the agent questions directly.

### Option B: FastAPI + Next.js (full stack)

**Terminal 1 (FastAPI backend, with hot reload):**

```bash
uv run uvicorn client.backend.main:app --reload --port 8000
```

The backend runs at `http://localhost:8000`. Check health at `http://localhost:8000/api/health`.

**Terminal 2 (Next.js frontend):**

```bash
cd client/frontend
npm install  # first time only
npm run dev
```

The frontend runs at `http://localhost:3000`. The Next.js dev server automatically proxies `/api/*` to the FastAPI backend on port 8000.

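The health endpoint mentioned above can also be probed from Python, for example before starting the frontend. The following is a minimal sketch using only the standard library. It assumes the endpoint answers with a JSON object of the form `{"ok": true}`, as the health-check section later in this guide shows; the helper names `is_healthy_payload` and `backend_is_healthy` are hypothetical and not part of the repository.

```python
import http.client
import json
import urllib.error
import urllib.request


def is_healthy_payload(body: bytes) -> bool:
    """Return True only if the body is a JSON object whose "ok" field is true."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers invalid UTF-8 and invalid JSON (both are ValueError subclasses).
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def backend_is_healthy(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Probe GET {base_url}/api/health; unreachable or malformed means unhealthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return is_healthy_payload(resp.read())
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, or a broken response.
        return False
```

Calling `backend_is_healthy()` with no arguments targets the FastAPI dev server on port 8000; passing `"http://localhost:7860"` would probe a locally running container instead.
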

---

## Common Development Commands

| Task | Command |
|------|---------|
| **Install deps** | `uv sync` |
| **Add a dependency** | `uv add <package>` (runtime) or `uv add --dev <package>` (dev) |
| **Run tests** | `uv run pytest -v` |
| **Run one test file** | `uv run pytest tests/test_agent_smoke.py -v` |
| **Lint code** | `uv run ruff check --fix` |
| **Format code** | `uv run ruff format` |
| **Type check** (optional) | `uv run pyright` |
| **Run Streamlit** | `uv run streamlit run streamlit_app.py` |
| **Run FastAPI backend** | `uv run uvicorn client.backend.main:app --reload --port 8000` |
| **Run Next.js frontend** | `cd client/frontend && npm run dev` |

---

## Testing

### Philosophy

Tests are **smoke tests**, not unit tests. We verify that:

- Tools run without crashing
- Output has the expected shape (JSON, dict keys, etc.)
- Error handling is recoverable

We do **not** mock heavily or test implementation details.

### Running tests

```bash
# All tests
uv run pytest

# Single file
uv run pytest tests/test_tools.py -v

# Single test
uv run pytest tests/test_tools.py::test_inspect_schema -v

# With print output
uv run pytest -s
```

### Adding a test

1. Add a `.py` file in `tests/` or `client/backend/tests/`
2. Write a function named `test_*`
3. Use `assert` statements
4. Run `uv run pytest` to verify

Example:

```python
def test_my_feature():
    from agent.tools import run_sql

    # `...` stands in for the tool's real arguments.
    result = run_sql(...)

    # Smoke-test the output shape, not the implementation.
    assert "rows" in result
    assert isinstance(result["rows"], list)
```

---

## Code Quality

Before any commit, code must pass:

```bash
uv run ruff check --fix  # Fix lint errors automatically
uv run ruff format       # Format to standard style
```

These two commands are **required**: CI will reject commits that don't pass.

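One way to make sure both required commands run before each commit is to chain them in a small script. The sketch below is only an illustration: `run_checks` and `REQUIRED_CHECKS` are hypothetical names, not part of the repository, and the script assumes it is started from the repo root with `uv` on the `PATH`. The `runner` parameter exists so the function can be exercised without actually invoking ruff.

```python
import subprocess
import sys
from collections.abc import Callable, Sequence

# The two commands this guide lists as required before a commit.
REQUIRED_CHECKS: tuple[tuple[str, ...], ...] = (
    ("uv", "run", "ruff", "check", "--fix"),
    ("uv", "run", "ruff", "format"),
)


def _run_subprocess(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = REQUIRED_CHECKS,
    runner: Callable[[Sequence[str]], int] = _run_subprocess,
) -> int:
    """Run each check in order; return the first non-zero exit code, else 0."""
    for cmd in checks:
        code = runner(cmd)
        if code != 0:
            # Stop at the first failure so its output is the last thing shown.
            print(f"FAILED: {' '.join(cmd)}", file=sys.stderr)
            return code
    return 0
```

A wrapper script could end with `sys.exit(run_checks())` so that a failing check produces a non-zero exit status.
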
Optional (not in CI, but recommended):

```bash
uv run pyright  # Type checking (editor runs this too)
```

---

## Repository Structure

```
why-agent/
├── agent/                      # Core agent logic
│   ├── graph.py                # LangGraph state machine
│   ├── state.py                # Pydantic state models
│   ├── client.py               # Multi-backend LLM client
│   ├── constants.py            # Named constants (backends, tool names, demo questions)
│   ├── tools/                  # The four tools
│   │   ├── inspect_schema.py
│   │   ├── run_sql.py
│   │   ├── compare_periods.py
│   │   └── decompose_metric.py
│   └── prompts/                # System + critique prompts
│
├── client/
│   ├── backend/                # FastAPI server
│   │   ├── main.py             # GET /api/health, POST /api/investigate
│   │   ├── deps.py             # Dependency injection (graph instance)
│   │   ├── sse.py              # Server-Sent Events formatting
│   │   └── tests/
│   └── frontend/               # Next.js app
│       ├── src/app/page.tsx    # Main page
│       └── package.json
│
├── data/
│   ├── parquet/                # Dataset files (gitignored)
│   └── semantic_layer.yml      # Metadata + business context
│
├── tests/                      # Python smoke tests
│   ├── test_tools.py
│   ├── test_client_backends.py
│   └── test_agent_smoke.py
│
├── docs/                       # Documentation
│   ├── CONTRIBUTING.md         # This file
│   ├── RUNBOOK.md              # Deployment guide
│   └── why-agent-architecture.png
│
├── streamlit_app.py            # Standalone Streamlit UI
├── pyproject.toml              # Python deps + commands
├── docker/                     # Containers
│   ├── Dockerfile              # Multi-stage build
│   ├── entrypoint.sh           # HF Spaces boot script
│   ├── nginx.conf              # Reverse proxy config
│   └── supervisord.conf        # Process management
│
└── README.md                   # Project overview + business context
```

---

## Architecture Overview

```
┌─────────────────────────────────┐
│          Streamlit UI           │
│       (streamlit_app.py)        │
└────────────┬────────────────────┘
             │
       ┌─────┴──────┐
       │            │
       ▼            ▼
┌──────────────┐  ┌──────────────────┐
│   Next.js    │  │ FastAPI Backend  │
│  (client/    │  │ (client/backend/ │
│  frontend/)  │  │ main.py)         │
└──────────────┘  └────────┬─────────┘
                           │
                     ┌─────▼─────┐
                     │ LangGraph │
                     │   Agent   │
                     └─────┬─────┘
                           │
           ┌───────────────┼───────────────┐
           │               │               │
           ▼               ▼               ▼
      ┌──────────┐  ┌──────────────┐  ┌────────────┐
      │DuckDB    │  │Pydantic      │  │LLM Client  │
      │(Parquet) │  │Tools Schemas │  │(3 backends)│
      └──────────┘  └──────────────┘  └────────────┘
```

---

## Common Issues & Solutions

### ModuleNotFoundError: No module named 'agent'

**Solution:** Make sure you're in the repo root and have run `uv sync`.

```bash
cd /path/to/why-agent
uv sync
```

### Tests fail with "No MINIMAX_API_KEY"

**Solution:** Use `MODEL_BACKEND=replay` for local testing. Replay mode doesn't call any LLM.

```bash
export MODEL_BACKEND=replay
uv run pytest
```

### Ruff formatting conflicts with editor

**Solution:** Use the project's ruff commands; they are the source of truth.

```bash
uv run ruff format
uv run ruff check --fix
```

### Next.js frontend doesn't build

**Solution:** Make sure Node 20+ is installed and `npm install` ran successfully.

```bash
node --version  # should be v20+
cd client/frontend
npm install
npm run build
```

---

## Coding Conventions

Per `CLAUDE.md`, follow these conventions:

1. **Sync by default**: DuckDB has no async API. Use `async def` only at the LLM boundary.
2. **Pydantic v2**: Use it for all structured data (tool inputs/outputs, state, semantic layer).
3. **Type annotations**: Required on public functions (args and return type).
4. **No print()**: Use `logger = logging.getLogger(__name__)` in agent code.
5. **No magic strings**: Backend names, tool names, and scenario IDs go in `agent/constants.py`.
6. **Tool docstrings for the LLM**: Write them as if the model will read them.

Example tool:

```python
import logging

from pydantic import BaseModel, Field

logger = logging.getLogger(__name__)


class MyToolInput(BaseModel):
    query: str = Field(description="A human-readable query.")


def my_tool(args: MyToolInput) -> dict:
    """Use this tool to do X. Returns a dict with 'result' and optional 'error'."""
    try:
        result = ...  # tool logic goes here
        return {"result": result}
    except Exception as exc:
        # Log the traceback, then return a recoverable error for the agent.
        logger.exception("Failed")
        return {"error": str(exc), "hint": "Try Y instead"}
```

---

## Deployment

### Local Docker build

To test the full stack locally (frontend + backend + agent) in a container:

```bash
docker build -t why-agent:latest .
docker run -p 7860:7860 -e MODEL_BACKEND=replay why-agent:latest
```

Then open `http://localhost:7860`.

### Remote push rules

The repo has two git remotes with different push policies:

| Remote | Purpose | When to push |
|--------|---------|--------------|
| `origin` (GitHub) | Source of truth, PRs, CI | Every commit; always push here |
| `space` (HF Spaces) | Deployment target | **Only when opening a PR** |

```bash
# Normal dev: push to GitHub only
git push origin feat/my-feature

# Deploy to HF Spaces: only when the PR is ready
git push space feat/my-feature:main --force
```

HF Spaces triggers a full Docker rebuild on every push. **Do not push to `space` during iteration.** Push only when the branch is ready for demo/review and a PR is being opened.

### HF Spaces environment variables

When deploying to HF Spaces, set these secrets in the Space settings:

| Variable | Value | Purpose |
|----------|-------|---------|
| `MODEL_BACKEND` | `replay` or `minimax` | LLM backend; use `replay` to avoid API costs |
| `MINIMAX_API_KEY` | (API key) | Required only if `MODEL_BACKEND=minimax` |
| `HF_DATASET_ID` | (optional) | Dataset repo ID to auto-download Parquet files on boot |
| `PARQUET_DIR` | `/app/data/parquet` | Path inside container (do not change) |
| `SEMANTIC_LAYER_PATH` | `/app/data/semantic_layer.yml` | Path inside container (do not change) |

**Note:** Paths in the container must use the `/app/` prefix, not relative paths.

### HF Spaces deployment procedure

#### Quick start

1. **Create a new Space** on [huggingface.co/spaces](https://huggingface.co/spaces):
   - Owner: your username
   - Space name: `why-agent` (or any name)
   - License: MIT
   - Docker template (or blank)

2. **Link the repo**:

   ```bash
   cd /path/to/why-agent
   git remote add space https://huggingface.co/spaces/{username}/{space-name}
   ```

3. **Push to deploy** (only when ready):

   ```bash
   git push space feat/my-feature:main --force
   ```

4. **Set secrets** in the Space UI → Settings → Repository secrets:
   - `MINIMAX_API_KEY` (if using MiniMax backend)
   - `HF_DATASET_ID` (optional; see below)

#### How the build works

1. HF Spaces detects the `Dockerfile` in the repo root
2. Builds the image (takes ~5–10 minutes the first time)
3. Runs the container on port 7860
4. The `entrypoint.sh` script starts nginx, backend, and frontend via supervisord

#### Auto-downloading Parquet data

If you set `HF_DATASET_ID=ysh99226/why-agent-data`, the entrypoint will:

1. Check if `/app/data/parquet` is empty
2. Run `hf download` to fetch the dataset
3. Time out after 120 seconds and fall back to `MODEL_BACKEND=replay`

The `hf` command (from the `huggingface-hub` package) replaces the deprecated `huggingface-cli`.

#### Git workflow for deployment

**Do NOT push to HF Spaces during development.**

1. **Work on a feature branch:**

   ```bash
   git checkout -b feat/my-feature
   git push origin feat/my-feature
   ```

2. **Open a PR on GitHub** when ready.

3. **Deploy to HF Spaces only when the PR is ready to demo:**

   ```bash
   git push space main:main --force
   ```

   Or, if the feature branch is the one being demoed (before merge):

   ```bash
   git push space feat/my-feature:main --force
   ```

**Why `--force`?** HF Spaces doesn't have a traditional git history. Using `--force` ensures the Space always reflects the exact commit you push, even if the branch history differs from the origin.

---

## Docker build errors & fixes

### "replays/ directory not found" or "missing JSON files"

**Cause:** The Dockerfile expects `replays/` to exist and contain at least one `.json` file for `MODEL_BACKEND=replay` to work.

**Fix:**

```bash
# Create dummy replay if needed
mkdir -p replays
echo '{"scenario": "demo"}' > replays/demo.json
git add replays/demo.json
git commit -m "chore: add demo replay"
```

Then rebuild the Docker image.

### "SEMANTIC_LAYER_PATH not found" or "semantic_layer.yml missing"

**Cause:** The Dockerfile copies `data/semantic_layer_6w.yml` but the file doesn't exist.

**Fix:**

```bash
# Check the actual filename
ls -la data/semantic_layer*

# If it has a different name, update this COPY line in the Dockerfile:
#   COPY data/semantic_layer_6w.yml /app/data/semantic_layer.yml
```

For example, if you're using a different semantic layer file:

```dockerfile
COPY data/YOUR_SEMANTIC_LAYER.yml /app/data/semantic_layer.yml
```

### "supervisord can't find environment variables" or "MODEL_BACKEND not set in child processes"

**Cause:** Environment variables set in `ENV` commands are not automatically passed to supervisord child processes.

**Fix:** The `docker/supervisord.conf` must explicitly set env vars via `environment=` lines:

```ini
[program:backend]
command=/app/.venv/bin/uvicorn ...
environment=PYTHONUNBUFFERED="1",MODEL_BACKEND="replay"
```

Or pass them in the command itself. Rebuild the image after fixing `supervisord.conf`.

### "huggingface-cli: command not found"

**Cause:** The old `huggingface-cli` tool is deprecated. The project uses the newer `hf` command from the `huggingface-hub` package.

**Fix:** `huggingface-hub` is listed in `pyproject.toml`, so the image should already include it, and `entrypoint.sh` uses `hf download`, which is the correct command. If the entrypoint still fails:

```bash
# Verify hf is installed
docker run -it why-agent:latest /app/.venv/bin/hf --version

# If missing, add to pyproject.toml
uv add huggingface-hub
```

### "next: command not found" or "Node.js frontend doesn't start"

**Cause:** The Next.js build failed, or the `server.js` file is missing.

**Fix:**

1. Check the build log for `npm run build` errors
2. Ensure `client/frontend/package.json` exists and has a valid build script
3. Rebuild the Docker image:

   ```bash
   docker build --no-cache -t why-agent:latest .
   ```

### "nginx bind: address already in use"

**Cause:** Port 7860 or 80 is already bound on your machine.

**Fix (local testing):**

```bash
# Map the container's port 7860 to a free host port
docker run -p 8080:7860 -e MODEL_BACKEND=replay why-agent:latest
# Now visit http://localhost:8080
```

On HF Spaces, port 7860 is reserved and managed by the platform, so no action is needed.

### "ModuleNotFoundError: No module named 'agent'"

**Cause:** The Python path is not set correctly in the container.

**Fix:** The Dockerfile sets `ENV PYTHONPATH=/app`, which should work. If it doesn't:

1. Verify `COPY agent/ /app/agent/` in the Dockerfile
2. Check that the `backend` program in supervisord uses the full venv path: `/app/.venv/bin/uvicorn`

### "API route returns 404" or "Frontend can't reach backend"

**Cause:** nginx is not configured to reverse-proxy to the backend on 127.0.0.1:8000.

**Fix:** Check `docker/nginx.conf`:

```nginx
location /api/ {
    proxy_pass http://127.0.0.1:8000;
    proxy_set_header X-Real-IP $remote_addr;
    ...
}
```

Rebuild after fixing the config:

```bash
docker build --no-cache -t why-agent:latest .
```

---

## Health check & monitoring

### Verify all services are running

```bash
# Inside the container or from host
curl http://localhost:7860/api/health
# Expected: {"ok":true}

curl http://localhost:7860/
# Expected: HTML (Next.js frontend)

curl -X POST http://localhost:7860/api/investigate \
  -H "Content-Type: application/json" \
  -d '{"question":"Why did revenue go up?"}'
# Expected: Server-Sent Event stream
```

### Check logs in HF Spaces

Click "Logs" in the top right of the Space UI. The logs show:

- nginx startup
- backend startup (uvicorn)
- frontend startup (Node.js)
- Any errors from the agent or tools

### Common troubleshooting flows

**The frontend loads but the backend is down:**

1. Check Space logs (UI → Logs)
2. Verify `PYTHONPATH=/app` is set in the Dockerfile
3. Verify `supervisord.conf` has the correct backend command
4. Push again to trigger a rebuild on the Space:

   ```bash
   git push space feat/my-feature:main --force
   ```

**The API returns 500 errors but logs show nothing:**

1. The agent code may have an unhandled exception
2. Check the agent's error handling in `agent/graph.py`
3. Verify the semantic layer file exists at `/app/data/semantic_layer.yml`
4. Test locally:

   ```bash
   # Run detached and publish the port so curl can reach the container
   docker run -d -p 7860:7860 -e MODEL_BACKEND=replay why-agent:latest
   curl http://localhost:7860/api/health
   ```

**Parquet data auto-download timed out, but I want to retry:**

The entrypoint waits 120 seconds for the HF dataset download, then falls back to `MODEL_BACKEND=replay`. If you want a fresh download:

1. Manually clear the parquet directory in the Space (if you have SSH access)
2. Or, restart the Space (UI → Settings → Restart)
3. The entrypoint will retry on next boot

**I pushed to the Space but the changes didn't appear:**

1. Verify you pushed to the correct branch (the target must be `main`, i.e. `*:main`):

   ```bash
   git push space feat/my-feature:main --force
   ```

2. HF Spaces can take 5–10 minutes to rebuild. Wait for the build to finish, then refresh.
3. If the Space still doesn't update:
   - Click "Restart" in the Space UI
   - Or delete and recreate the Space

---

## Reporting Issues

If you find a bug or have a feature request:

1. Check existing issues on GitHub
2. Provide a minimal reproduction (code snippet + data)
3. Include your environment (Python version, OS, backend)

---

## Getting Help

- **CLAUDE.md**: Implementation decisions and locked constraints
- **README.md**: Business context and architecture
- **Agent code**: Read `agent/graph.py` to understand the loop; read `agent/tools/` to see tool contracts
- **LangGraph docs**: https://langchain-ai.github.io/langgraph/
- **Pydantic docs**: https://docs.pydantic.dev/

---

Last updated: 2026-05-07