Spaces:
Sleeping
Sleeping
| # Week 1: Foundation & Setup — Detailed Documentation | |
| > **Goal:** Project skeleton running locally, all external services provisioned. | |
| > **Status:** Complete | |
| > **Date:** 2026-03-19 | |
| --- | |
| ## What We Accomplished | |
| Week 1 established the entire project foundation: directory structure, configuration system, | |
| data models, external service accounts, CI/CD pipeline, and the initial deployment config. | |
| --- | |
| ## Step-by-Step Log | |
| ### Step 1: Initialize the Project | |
| **What we did:** Created the project directory structure following a modular Python backend | |
| architecture with clear separation of concerns. | |
| **Why this structure matters:** | |
| ``` | |
| app/ ← All backend application code lives here | |
| agents/ ← One file per agent (security, performance, style, synthesizer) | |
| tools/ ← LangChain tool wrappers (semgrep, bandit, radon, etc.) | |
| context/ ← RAG pipeline (embedder → indexer → retriever) | |
| github/ ← All GitHub API interaction (webhook, auth, client, formatter) | |
| models/ ← Pydantic data models (Finding, PRReview, webhook payloads) | |
| db/ ← Database & cache (Postgres, Redis) | |
| services/ ← Business logic (orchestrator, health score calculator) | |
| dashboard/ ← Next.js frontend (deployed separately to Vercel) | |
| tests/ ← Mirrors the app/ structure (unit/, integration/, eval/) | |
| prompts/ ← Agent system prompts as Markdown files | |
| knowledge/ ← RAG knowledge bases (OWASP, DDIA, style guides) | |
| docs/ ← Project documentation (this file) | |
| ``` | |
| **Key principle:** Each directory has a single responsibility. The `agents/` folder doesn't | |
| know about GitHub. The `github/` folder doesn't know about LangChain. The `services/` | |
| folder orchestrates between them. This is called **separation of concerns** — it makes the | |
| code testable, maintainable, and easy to explain in interviews. | |
| **Commands run:** | |
| ```bash | |
| # Create all directories | |
| mkdir -p app/{agents,tools,context,github,models,db,services} | |
| mkdir -p dashboard/{app/{repos,api},components,lib} | |
| mkdir -p tests/{unit,integration,eval/dataset} | |
| mkdir -p prompts knowledge/style_guides | |
| # Create __init__.py files (makes directories Python packages) | |
| touch app/__init__.py app/agents/__init__.py app/tools/__init__.py ... | |
| # Initialize git | |
| git init && git branch -m main | |
| ``` | |
| ### Step 2: Create Configuration System (app/config.py) | |
| **What we did:** Created a centralized configuration file using `pydantic-settings`. | |
| **How it works:** | |
| ```python | |
| from pydantic_settings import BaseSettings | |
| class Settings(BaseSettings): | |
| groq_api_key: str = "" | |
| github_app_id: str = "" | |
| # ... all config vars | |
| model_config = {"env_file": ".env"} | |
| settings = Settings() # Singleton — imported everywhere | |
| ``` | |
| **Why pydantic-settings instead of plain os.environ?** | |
| 1. **Type safety** — `confidence_threshold: float = 0.6` ensures it's a float, not a string | |
| 2. **Validation** — pydantic raises clear errors if required vars are missing | |
| 3. **Defaults** — each setting has a sensible default for development | |
| 4. **Auto-loads .env** — reads from `.env` file automatically (via `model_config`) | |
| 5. **IDE autocomplete** — `settings.groq_api_key` instead of `os.environ.get("GROQ_API_KEY")` | |
| **Interview talking point:** "We use pydantic-settings for type-safe configuration management | |
| following the 12-factor app methodology — config lives in environment variables, not in code. | |
| This makes the same codebase work in development, staging, and production with zero code changes." | |
| ### Step 3: Define Data Models (app/models/findings.py) | |
| **What we did:** Created Pydantic models that define the exact shape of data flowing through | |
| the system. | |
| **Three core models:** | |
| #### Finding — Output of each domain agent | |
| ```python | |
| class Finding(BaseModel): | |
| agent: Literal["security", "performance", "style"] # Which agent found this | |
| file_path: str # e.g. "src/auth/login.py" | |
| line_start: int # Where the issue starts | |
| line_end: int # Where the issue ends | |
| severity: Literal["critical", "high", "medium", "low"] # How bad is it | |
| category: str # e.g. "sql_injection", "n+1_query" | |
| title: str # One-liner for the inline comment header | |
| description: str # Full explanation | |
| suggested_fix: str # Corrected code snippet | |
| cwe_id: Optional[str] # CWE ID for security findings (e.g. "CWE-89") | |
| confidence: float # 0.0–1.0, how sure the agent is | |
| ``` | |
| #### SynthesizedReview — Output of the Synthesizer Agent | |
| ```python | |
| class SynthesizedReview(BaseModel): | |
| health_score: int # 0-100 (the headline metric) | |
| executive_summary: str # 3-5 sentences for PR description | |
| recommendation: Literal["approve", "request_changes", "block"] | |
| findings: list[Finding] # Deduplicated, re-ranked findings | |
| critical_count: int # Counts by severity | |
| # ... | |
| ``` | |
| #### PRReviewRecord — What gets stored in Postgres | |
| ```python | |
| class PRReviewRecord(BaseModel): | |
| id: UUID # Primary key | |
| repo_full_name: str # "ninjacode911/myapp" | |
| pr_number: int | |
| commit_sha: str | |
| health_score: int | |
| findings: list[Finding] # Full findings as JSONB | |
| duration_ms: int # How long the review took | |
| ``` | |
| **Why Pydantic models instead of plain dicts?** | |
| 1. **Validation** — `severity: Literal["critical", "high", "medium", "low"]` rejects invalid values | |
| 2. **Serialization** — `.model_dump()` converts to dict, `.model_dump_json()` to JSON | |
| 3. **Documentation** — the schema IS the documentation | |
| 4. **Type checking** — mypy catches bugs at development time, not production | |
| **Interview talking point:** "Every data boundary in the system uses Pydantic models — agent | |
| outputs, API responses, database records. This gives us runtime validation, IDE autocomplete, | |
| and auto-generated OpenAPI docs. If an agent returns malformed JSON, Pydantic catches it | |
| immediately instead of letting bad data propagate through the pipeline." | |
| ### Step 4: Define Webhook Payload Models (app/models/webhook_payloads.py) | |
| **What we did:** Created typed models for GitHub's webhook JSON payloads. | |
| **Why type the webhook payload?** | |
| GitHub sends complex nested JSON. Without types, you'd write: | |
| ```python | |
| sha = payload["pull_request"]["head"]["sha"] # Easy to typo, no autocomplete | |
| ``` | |
| With Pydantic models: | |
| ```python | |
| event = PullRequestEvent(**payload) | |
| sha = event.pull_request.head.sha # Autocomplete, type-checked | |
| ``` | |
| We didn't use these models in the final webhook handler (we used raw dict access for | |
| simplicity), but they're available for stricter validation later. | |
| ### Step 5: Create FastAPI Skeleton (app/main.py) | |
| **What we did:** Created the FastAPI application with a `/health` endpoint. | |
| ```python | |
| app = FastAPI(title="Ninja Code Guard", version="0.1.0") | |
| @app.get("/health") | |
| async def health_check(): | |
| return {"status": "ok", "service": "Ninja Code Guard", "version": "0.1.0"} | |
| ``` | |
| **Why a /health endpoint?** | |
| - **Render.com** uses it to know if your service is alive (configured in render.yaml) | |
| - **GitHub Actions cron** pings it every 10 minutes to prevent cold starts | |
| - **The dashboard** calls it to show service status | |
| - **Load balancers** (if you scale up) use it to route traffic only to healthy instances | |
| ### Step 6: Provision External Services | |
| **What we did:** Created accounts and obtained credentials for all external services. | |
| #### 6a. GitHub App — "Ninja's Code Guard" | |
| **Where:** github.com/settings/apps/new | |
| **What we configured:** | |
| | Setting | Value | Reason | | |
| |---------|-------|--------| | |
| | Name | Ninja Code Guard | Bot identity: `ninjas-code-guard[bot]` | | |
| | Homepage URL | github.com/ninjacode911/codeprobe | Points to our repo | | |
| | Webhook Active | Yes | We need to receive PR events | | |
| | Webhook Secret | (generated with `python -c "import secrets; print(secrets.token_hex(32))"`) | HMAC authentication | | |
| | Contents | Read | Fetch full file source code for RAG context | | |
| | Pull requests | Read & Write | Read diffs, post review comments | | |
| | Commit statuses | Write | Show health score as commit status check | | |
| | Metadata | Read | Required — basic repo info | | |
| | Events | pull_request, pull_request_review_comment | Our trigger events | | |
| | Install target | Only this account | Dev-mode only for now | | |
| **What we got:** | |
| - App ID: 3133457 | |
| - Private Key: `.pem` file saved to `keys/ninja-s-code-guard.2026-03-19.private-key.pem` | |
| - Webhook Secret: saved to `.env` | |
| **How GitHub App authentication works (important concept):** | |
| ``` | |
| Step 1: Sign a JWT with our private key (.pem) | |
| JWT payload = {iss: APP_ID, iat: now, exp: now+9min} | |
| Signed with RS256 (RSA + SHA-256) | |
| This proves: "I am the Ninja Code Guard app" | |
| Step 2: Exchange JWT for an installation access token | |
| POST /app/installations/{id}/access_tokens | |
| Headers: Authorization: Bearer <JWT> | |
| Returns: token valid for 1 hour, scoped to installed repos | |
| This proves: "I can access ninjacode911's repos" | |
| Step 3: Use installation token for all API calls | |
| GET /repos/ninjacode911/codeguard-test/pulls/1 | |
| Headers: Authorization: token <installation_token> | |
| ``` | |
| #### 6b. Groq API | |
| **Where:** console.groq.com | |
| **What:** API key for Llama-3.1-70B inference (14,400 free requests/day) | |
| **Saved as:** `GROQ_API_KEY` in `.env` | |
| #### 6c. Neon.tech Postgres | |
| **Where:** console.neon.tech | |
| **What:** Serverless Postgres database (512MB free tier) | |
| **Saved as:** `DATABASE_URL` in `.env` | |
| **Used for:** Storing PR review history, health score trends, finding details | |
| #### 6d. Upstash Redis | |
| **Where:** console.upstash.com | |
| **What:** Serverless Redis (10K requests/day free tier) | |
| **Saved as:** `UPSTASH_REDIS_URL` in `.env` | |
| **Used for:** Caching reviewed commit SHAs to prevent duplicate analysis | |
| ### Step 7: Create Configuration Files | |
| #### .env.example | |
| Template showing all required environment variables without actual values. | |
| Committed to git so new developers know what to configure. | |
| #### .gitignore | |
| Prevents sensitive files from being committed: | |
| - `.env` (contains API keys) | |
| - `keys/` (contains private key .pem) | |
| - `__pycache__/`, `.venv/` (generated files) | |
| - `chroma_data/` (vector store data) | |
| - `dashboard/node_modules/`, `dashboard/.next/` (Node.js generated) | |
| #### pyproject.toml | |
| Project metadata + tool configuration: | |
| - `[tool.ruff]` — Python linter settings | |
| - `[tool.pytest]` — Test configuration (asyncio mode, test paths) | |
| - `[tool.mypy]` — Type checker settings | |
| #### render.yaml | |
| Render.com deployment configuration: | |
| ```yaml | |
| services: | |
| - type: web | |
| name: ninja-code-guard | |
| buildCommand: pip install -r requirements.txt | |
| startCommand: uvicorn app.main:app --host 0.0.0.0 --port $PORT | |
| healthCheckPath: /health | |
| plan: free | |
| ``` | |
| #### sentinel.yml.example | |
| Per-repo configuration template that users place in their repo root: | |
| ```yaml | |
| agents: | |
| security: true | |
| performance: true | |
| style: true | |
| min_severity: low | |
| min_confidence: 0.6 | |
| exclude: | |
| - "vendor/" | |
| - "node_modules/" | |
| ``` | |
| ### Step 8: Set Up CI/CD (GitHub Actions) | |
| **Created two workflows:** | |
| #### ci.yml — Runs on every push/PR | |
| ```yaml | |
| steps: | |
| - Lint with ruff (catches style/import issues) | |
| - Type check with mypy (catches type errors) | |
| - Run tests with pytest | |
| ``` | |
| #### prewarm.yml — Cron job every 10 minutes on weekdays | |
| ```yaml | |
| schedule: "*/10 6-20 * * 1-5" # Every 10min, 6am-8pm UTC, Mon-Fri | |
| steps: | |
| - curl the /health endpoint to prevent Render cold starts | |
| ``` | |
| **Why pre-warm?** Render's free tier spins down after 15 minutes of inactivity. The first | |
| request after spindown takes ~30 seconds (cold start). By pinging /health every 10 minutes | |
| during working hours, the service stays warm and responds instantly to webhooks. | |
| ### Step 9: Write Initial Tests | |
| **Created:** `tests/unit/test_findings_schema.py` — 8 tests for data model validation | |
| These tests verify: | |
| - Valid Finding objects are accepted | |
| - Invalid agent types are rejected | |
| - Invalid severity levels are rejected | |
| - Confidence must be between 0.0 and 1.0 | |
| - CWE ID is optional (None allowed) | |
| - Health score must be 0-100 | |
| - Invalid recommendation values are rejected | |
| --- | |
| ## Files Created in Week 1 | |
| | File | Purpose | | |
| |------|---------| | |
| | `app/__init__.py` | Makes app a Python package | | |
| | `app/config.py` | Centralized configuration via environment variables | | |
| | `app/main.py` | FastAPI app with /health endpoint (expanded in Week 2) | | |
| | `app/models/__init__.py` | Models package | | |
| | `app/models/findings.py` | Finding, SynthesizedReview, PRReviewRecord schemas | | |
| | `app/models/webhook_payloads.py` | GitHub webhook event payload types | | |
| | `tests/conftest.py` | Shared test fixtures (sample finding data) | | |
| | `tests/unit/test_findings_schema.py` | 8 schema validation tests | | |
| | `.env` | Environment variables (gitignored — contains secrets) | | |
| | `.env.example` | Template for .env (committed — no secrets) | | |
| | `.gitignore` | Files to exclude from git | | |
| | `pyproject.toml` | Project metadata + tool configs | | |
| | `requirements.txt` | Python production dependencies | | |
| | `requirements-dev.txt` | Dev/test dependencies | | |
| | `render.yaml` | Render.com deployment config | | |
| | `sentinel.yml.example` | Per-repo config template | | |
| | `.github/workflows/ci.yml` | CI pipeline (lint + test) | | |
| | `.github/workflows/prewarm.yml` | Render pre-warm cron | | |
| | `keys/.gitignore` | Prevents .pem files from being committed | | |
| | `PROJECT_PLAN.md` | Master project plan + progress tracker | | |
| --- | |
| ## Key Decisions Made | |
| | Decision | Rationale | | |
| |----------|-----------| | |
| | Pydantic for all data models | Runtime validation + IDE autocomplete + auto-docs | | |
| | pydantic-settings for config | Type-safe env vars, auto-loads .env, 12-factor pattern | | |
| | FastAPI (not Flask/Django) | Async-native (needed for parallel agents), auto OpenAPI docs, modern Python | | |
| | GitHub App (not Action) | One deployment serves all repos, webhook-driven, own bot identity | | |
| | Upstash Redis (not in-memory cache) | Persists across Render restarts, shared across workers | | |
| | Neon.tech (not SQLite) | Serverless, accessible from dashboard, persistent storage | | |
| --- | |
| *Documentation written 2026-03-19 as part of Week 1 completion.* | |