
Week 1: Foundation & Setup - Detailed Documentation

Goal: Project skeleton running locally, all external services provisioned.
Status: Complete
Date: 2026-03-19


What We Accomplished

Week 1 established the entire project foundation: directory structure, configuration system, data models, external service accounts, CI/CD pipeline, and the initial deployment config.


Step-by-Step Log

Step 1: Initialize the Project

What we did: Created the project directory structure following a modular Python backend architecture with clear separation of concerns.

Why this structure matters:

app/                    ← All backend application code lives here
  agents/               ← One file per agent (security, performance, style, synthesizer)
  tools/                ← LangChain tool wrappers (semgrep, bandit, radon, etc.)
  context/              ← RAG pipeline (embedder → indexer → retriever)
  github/               ← All GitHub API interaction (webhook, auth, client, formatter)
  models/               ← Pydantic data models (Finding, PRReview, webhook payloads)
  db/                   ← Database & cache (Postgres, Redis)
  services/             ← Business logic (orchestrator, health score calculator)
dashboard/              ← Next.js frontend (deployed separately to Vercel)
tests/                  ← Mirrors the app/ structure (unit/, integration/, eval/)
prompts/                ← Agent system prompts as Markdown files
knowledge/              ← RAG knowledge bases (OWASP, DDIA, style guides)
docs/                   ← Project documentation (this file)

Key principle: Each directory has a single responsibility. The agents/ folder doesn't know about GitHub. The github/ folder doesn't know about LangChain. The services/ folder orchestrates between them. This is called separation of concerns: it makes the code testable, maintainable, and easy to explain in interviews.
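The layering above can be sketched in a few lines. This is a stdlib-only illustration, not the project's code: the function names (security_agent, style_agent, format_comment, review) and the simplified Finding dataclass are hypothetical, and the "analysis" is a trivial string check standing in for a real agent.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Finding:
    agent: str
    title: str


# An "agent" is any async callable that maps a diff to a list of findings.
Agent = Callable[[str], Awaitable[list[Finding]]]


async def security_agent(diff: str) -> list[Finding]:
    # agents/ layer: analyses code, knows nothing about GitHub.
    if "execute(" in diff:
        return [Finding("security", "possible SQL injection")]
    return []


async def style_agent(diff: str) -> list[Finding]:
    # agents/ layer: flags lines longer than 100 characters.
    if any(len(line) > 100 for line in diff.splitlines()):
        return [Finding("style", "line too long")]
    return []


def format_comment(finding: Finding) -> str:
    # github/ layer: renders a comment, knows nothing about how agents work.
    return f"[{finding.agent}] {finding.title}"


async def review(diff: str, agents: list[Agent]) -> list[str]:
    # services/ layer: the only place that wires the two sides together.
    # asyncio.gather runs the agents concurrently and keeps their order.
    results = await asyncio.gather(*(agent(diff) for agent in agents))
    return [format_comment(f) for findings in results for f in findings]


comments = asyncio.run(
    review('cursor.execute("SELECT ...")', [security_agent, style_agent])
)
print(comments)
```

Because review only depends on the Agent callable type, each agent can be unit-tested without any GitHub code, and the formatter can be tested without running an agent.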

Commands run:

# Create all directories
mkdir -p app/{agents,tools,context,github,models,db,services}
mkdir -p dashboard/{app/{repos,api},components,lib}
mkdir -p tests/{unit,integration,eval/dataset}
mkdir -p prompts knowledge/style_guides

# Create __init__.py files (makes directories Python packages)
touch app/__init__.py app/agents/__init__.py app/tools/__init__.py ...

# Initialize git
git init && git branch -m main

Step 2: Create Configuration System (app/config.py)

What we did: Created a centralized configuration file using pydantic-settings.

How it works:

from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    groq_api_key: str = ""
    github_app_id: str = ""
    # ... all config vars

    model_config = {"env_file": ".env"}

settings = Settings()  # Singleton: imported everywhere

Why pydantic-settings instead of plain os.environ?

  1. Type safety: confidence_threshold: float = 0.6 ensures it's a float, not a string
  2. Validation: pydantic raises clear errors if required vars are missing
  3. Defaults: each setting has a sensible default for development
  4. Auto-loads .env: reads from .env file automatically (via model_config)
  5. IDE autocomplete: settings.groq_api_key instead of os.environ.get("GROQ_API_KEY")

Interview talking point: "We use pydantic-settings for type-safe configuration management following the 12-factor app methodology: config lives in environment variables, not in code. This makes the same codebase work in development, staging, and production with zero code changes."

Step 3: Define Data Models (app/models/findings.py)

What we did: Created Pydantic models that define the exact shape of data flowing through the system.

Three core models:

Finding: Output of each domain agent

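# Imports shared by the three model excerpts in this step
from typing import Literal, Optional
from uuid import UUID

from pydantic import BaseModel

class Finding(BaseModel):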
    agent: Literal["security", "performance", "style"]  # Which agent found this
    file_path: str              # e.g. "src/auth/login.py"
    line_start: int             # Where the issue starts
    line_end: int               # Where the issue ends
    severity: Literal["critical", "high", "medium", "low"]  # How bad is it
    category: str               # e.g. "sql_injection", "n+1_query"
    title: str                  # One-liner for the inline comment header
    description: str            # Full explanation
    suggested_fix: str          # Corrected code snippet
    cwe_id: Optional[str]       # CWE ID for security findings (e.g. "CWE-89")
    confidence: float           # 0.0–1.0, how sure the agent is

SynthesizedReview: Output of the Synthesizer Agent

class SynthesizedReview(BaseModel):
    health_score: int           # 0-100 (the headline metric)
    executive_summary: str      # 3-5 sentences for PR description
    recommendation: Literal["approve", "request_changes", "block"]
    findings: list[Finding]     # Deduplicated, re-ranked findings
    critical_count: int         # Counts by severity
    # ...

PRReviewRecord: What gets stored in Postgres

class PRReviewRecord(BaseModel):
    id: UUID                    # Primary key
    repo_full_name: str         # "ninjacode911/myapp"
    pr_number: int
    commit_sha: str
    health_score: int
    findings: list[Finding]     # Full findings as JSONB
    duration_ms: int            # How long the review took

Why Pydantic models instead of plain dicts?

  1. Validation: severity: Literal["critical", "high", "medium", "low"] rejects invalid values
  2. Serialization: .model_dump() converts to dict, .model_dump_json() to JSON
  3. Documentation: the schema IS the documentation
  4. Type checking: mypy catches bugs at development time, not in production

Interview talking point: "Every data boundary in the system uses Pydantic models: agent outputs, API responses, database records. This gives us runtime validation, IDE autocomplete, and auto-generated OpenAPI docs. If an agent returns malformed JSON, Pydantic catches it immediately instead of letting bad data propagate through the pipeline."
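The following sketch shows what "catching malformed agent output" could look like in practice, assuming Pydantic v2. The Finding fields mirror the excerpt above; the default of None for cwe_id and the Field bounds on confidence are assumptions made for the example, and the sample JSON is invented.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class Finding(BaseModel):
    agent: Literal["security", "performance", "style"]
    file_path: str
    line_start: int
    line_end: int
    severity: Literal["critical", "high", "medium", "low"]
    category: str
    title: str
    description: str
    suggested_fix: str
    cwe_id: Optional[str] = None               # assumption: defaults to None
    confidence: float = Field(ge=0.0, le=1.0)  # assumption: bounds set via Field


# Invented example of what an agent might return as JSON text.
raw = """{
  "agent": "security", "file_path": "src/auth/login.py",
  "line_start": 10, "line_end": 12, "severity": "high",
  "category": "sql_injection", "title": "Unparameterised query",
  "description": "User input is concatenated into SQL.",
  "suggested_fix": "Use bound parameters.", "cwe_id": "CWE-89",
  "confidence": 0.9
}"""

# Parse and validate in one step, then round-trip through JSON.
finding = Finding.model_validate_json(raw)
restored = Finding.model_validate_json(finding.model_dump_json())
print(restored == finding)

# A severity outside the Literal is rejected before it reaches later stages.
bad = {**finding.model_dump(), "severity": "catastrophic"}
try:
    Finding(**bad)
    bad_rejected = False
except ValidationError as exc:
    bad_rejected = True
    bad_fields = [err["loc"][0] for err in exc.errors()]
print(bad_rejected, bad_fields)
```

The errors() list names the offending field, which is what makes the failure easy to log or feed back to the agent for a retry.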

Step 4: Define Webhook Payload Models (app/models/webhook_payloads.py)

What we did: Created typed models for GitHub's webhook JSON payloads.

Why type the webhook payload? GitHub sends complex nested JSON. Without types, you'd write:

sha = payload["pull_request"]["head"]["sha"]  # Easy to typo, no autocomplete

With Pydantic models:

event = PullRequestEvent(**payload)
sha = event.pull_request.head.sha  # Autocomplete, type-checked

We didn't use these models in the final webhook handler (we used raw dict access for simplicity), but they're available for stricter validation later.
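As a rough illustration of the typed approach, assuming Pydantic v2: PullRequestEvent is the name used above, while the nested class names (Head, PullRequest) and the choice of fields are a hypothetical minimal subset of GitHub's pull_request payload.

```python
from pydantic import BaseModel, ValidationError


class Head(BaseModel):
    sha: str


class PullRequest(BaseModel):
    number: int
    head: Head


class PullRequestEvent(BaseModel):
    action: str
    pull_request: PullRequest


# A trimmed-down payload; the real one has many more keys, which
# Pydantic ignores by default when they are not declared on the model.
payload = {
    "action": "opened",
    "pull_request": {
        "number": 1,
        "title": "Add login form",
        "head": {"sha": "abc123", "ref": "feature/login"},
    },
    "sender": {"login": "someone"},
}

event = PullRequestEvent(**payload)
sha = event.pull_request.head.sha
print(event.action, event.pull_request.number, sha)

# A payload missing a declared field fails up front with a clear error,
# instead of raising KeyError somewhere in the handler.
try:
    PullRequestEvent(**{"action": "opened", "pull_request": {"number": 1}})
    missing_rejected = False
except ValidationError:
    missing_rejected = True
print(missing_rejected)
```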

Step 5: Create FastAPI Skeleton (app/main.py)

What we did: Created the FastAPI application with a /health endpoint.

app = FastAPI(title="Ninja Code Guard", version="0.1.0")

@app.get("/health")
async def health_check():
    return {"status": "ok", "service": "Ninja Code Guard", "version": "0.1.0"}

Why a /health endpoint?

  • Render.com uses it to know if your service is alive (configured in render.yaml)
  • GitHub Actions cron pings it every 10 minutes to prevent cold starts
  • The dashboard calls it to show service status
  • Load balancers (if you scale up) use it to route traffic only to healthy instances

Step 6: Provision External Services

What we did: Created accounts and obtained credentials for all external services.

6a. GitHub App: "Ninja's Code Guard"

Where: github.com/settings/apps/new

What we configured:

| Setting | Value | Reason |
| --- | --- | --- |
| Name | Ninja Code Guard | Bot identity: ninjas-code-guard[bot] |
| Homepage URL | github.com/ninjacode911/codeprobe | Points to our repo |
| Webhook Active | Yes | We need to receive PR events |
| Webhook Secret | (generated with python -c "import secrets; print(secrets.token_hex(32))") | HMAC authentication |
| Contents | Read | Fetch full file source code for RAG context |
| Pull requests | Read & Write | Read diffs, post review comments |
| Commit statuses | Write | Show health score as commit status check |
| Metadata | Read | Required: basic repo info |
| Events | pull_request, pull_request_review_comment | Our trigger events |
| Install target | Only this account | Dev-mode only for now |

What we got:

  • App ID: 3133457
  • Private Key: .pem file saved to keys/ninja-s-code-guard.2026-03-19.private-key.pem
  • Webhook Secret: saved to .env
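The webhook secret is what makes HMAC authentication possible: GitHub signs each delivery body with it and sends the result in the X-Hub-Signature-256 header as sha256=<hex digest>. A minimal stdlib sketch of the check follows; the function name verify_signature and the sample secret and body are illustrative, not taken from the project.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(secret: str, body: bytes, signature_header: Optional[str]) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw body."""
    if not signature_header:
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking how many characters matched via timing.
    return hmac.compare_digest(expected, signature_header)


# Simulate what the sender does, then verify it on the receiving side.
demo_secret = "not-a-real-secret"
demo_body = b'{"action": "opened"}'
demo_header = "sha256=" + hmac.new(
    demo_secret.encode("utf-8"), demo_body, hashlib.sha256
).hexdigest()

print(verify_signature(demo_secret, demo_body, demo_header))
print(verify_signature(demo_secret, demo_body + b" ", demo_header))
```

The check has to run against the raw request bytes; re-serialising parsed JSON can change whitespace or key order and break the comparison.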

How GitHub App authentication works (important concept):

Step 1: Sign a JWT with our private key (.pem)
        JWT payload = {iss: APP_ID, iat: now, exp: now+9min}
        Signed with RS256 (RSA + SHA-256)
        This proves: "I am the Ninja Code Guard app"

Step 2: Exchange JWT for an installation access token
        POST /app/installations/{id}/access_tokens
        Headers: Authorization: Bearer <JWT>
        Returns: token valid for 1 hour, scoped to installed repos
        This proves: "I can access ninjacode911's repos"

Step 3: Use installation token for all API calls
        GET /repos/ninjacode911/codeguard-test/pulls/1
        Headers: Authorization: token <installation_token>
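Step 1 of the flow can be sketched with the standard library up to the point of signing. The helper names below (build_app_jwt_claims, jwt_signing_input, installation_token_path) are hypothetical, and the RS256 signature itself is deliberately left out: it needs the private key and an RSA-capable library, which this sketch does not assume.

```python
import base64
import json
import time
from typing import Optional


def build_app_jwt_claims(app_id: str, now: Optional[int] = None) -> dict:
    """Claims from Step 1: issuer, issued-at, and a 9-minute expiry."""
    issued_at = int(time.time()) if now is None else now
    return {"iss": app_id, "iat": issued_at, "exp": issued_at + 9 * 60}


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing '=' padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwt_signing_input(claims: dict) -> str:
    """Return '<header>.<payload>', the exact string that RS256 signs."""
    header = {"alg": "RS256", "typ": "JWT"}
    parts = (header, claims)
    return ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in parts
    )


def installation_token_path(installation_id: int) -> str:
    """Endpoint from Step 2, relative to the GitHub API base URL."""
    return f"/app/installations/{installation_id}/access_tokens"


claims = build_app_jwt_claims("3133457", now=1_700_000_000)
unsigned = jwt_signing_input(claims)
print(claims["exp"] - claims["iat"])
print(installation_token_path(42))  # 42 is a placeholder installation id
```

A finished token is this string plus a third dot-separated segment holding the base64url-encoded RSA signature. Since installation tokens last an hour, a real client would typically cache the token and repeat Steps 1 and 2 only shortly before expiry.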

6b. Groq API

Where: console.groq.com
What: API key for Llama-3.1-70B inference (14,400 free requests/day)
Saved as: GROQ_API_KEY in .env

6c. Neon.tech Postgres

Where: console.neon.tech
What: Serverless Postgres database (512MB free tier)
Saved as: DATABASE_URL in .env
Used for: Storing PR review history, health score trends, finding details

6d. Upstash Redis

Where: console.upstash.com
What: Serverless Redis (10K requests/day free tier)
Saved as: UPSTASH_REDIS_URL in .env
Used for: Caching reviewed commit SHAs to prevent duplicate analysis
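The duplicate-analysis guard boils down to "claim this commit SHA if nobody has yet", which in Redis is commonly a single SET with the NX and EX options. The class below is an in-memory stand-in so the logic can be read and run without a Redis server; the class name, key format, and 24-hour TTL are assumptions for illustration.

```python
import time
from typing import Optional


class SeenCommits:
    """In-memory stand-in for a Redis `SET key value NX EX ttl` call."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60) -> None:
        self._ttl = ttl_seconds
        self._expires_at: dict[str, float] = {}

    def claim(self, repo: str, sha: str, now: Optional[float] = None) -> bool:
        """Return True the first time a (repo, sha) pair is seen within the TTL."""
        current = time.monotonic() if now is None else now
        key = f"reviewed:{repo}:{sha}"  # hypothetical key format
        expiry = self._expires_at.get(key)
        if expiry is not None and expiry > current:
            return False  # already reviewed recently: skip the analysis
        self._expires_at[key] = current + self._ttl
        return True


cache = SeenCommits(ttl_seconds=60)
first = cache.claim("ninjacode911/myapp", "abc123", now=0.0)
second = cache.claim("ninjacode911/myapp", "abc123", now=10.0)
after_ttl = cache.claim("ninjacode911/myapp", "abc123", now=61.0)
print(first, second, after_ttl)
```

The reason to do this in Redis rather than in a process-local dict like this one is that the claim then survives restarts and is shared by every worker.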

Step 7: Create Configuration Files

.env.example

Template showing all required environment variables without actual values. Committed to git so new developers know what to configure.

.gitignore

Prevents sensitive files from being committed:

  • .env (contains API keys)
  • keys/ (contains private key .pem)
  • __pycache__/, .venv/ (generated files)
  • chroma_data/ (vector store data)
  • dashboard/node_modules/, dashboard/.next/ (Node.js generated)

pyproject.toml

Project metadata + tool configuration:

  • [tool.ruff]: Python linter settings
  • [tool.pytest]: Test configuration (asyncio mode, test paths)
  • [tool.mypy]: Type checker settings

render.yaml

Render.com deployment configuration:

services:
  - type: web
    name: ninja-code-guard
    buildCommand: pip install -r requirements.txt
    startCommand: uvicorn app.main:app --host 0.0.0.0 --port $PORT
    healthCheckPath: /health
    plan: free

sentinel.yml.example

Per-repo configuration template that users place in their repo root:

agents:
  security: true
  performance: true
  style: true
min_severity: low
min_confidence: 0.6
exclude:
  - "vendor/"
  - "node_modules/"
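One plausible reading of how these settings filter findings is sketched below using only the standard library. The RepoConfig and Finding dataclasses, the severity ordering, and the choice to treat exclude entries as path prefixes are all assumptions for the example; the actual matching rules are not specified in this document.

```python
from dataclasses import dataclass, field

# Assumed ordering, lowest to highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class RepoConfig:
    agents: dict[str, bool] = field(
        default_factory=lambda: {"security": True, "performance": True, "style": True}
    )
    min_severity: str = "low"
    min_confidence: float = 0.6
    exclude: list[str] = field(default_factory=lambda: ["vendor/", "node_modules/"])


@dataclass
class Finding:
    agent: str
    file_path: str
    severity: str
    confidence: float


def should_report(finding: Finding, config: RepoConfig) -> bool:
    """Apply the per-repo switches in order: agent, path, severity, confidence."""
    if not config.agents.get(finding.agent, False):
        return False
    # Assumption: exclude entries are matched as prefixes of the file path.
    if any(finding.file_path.startswith(prefix) for prefix in config.exclude):
        return False
    if SEVERITY_RANK[finding.severity] < SEVERITY_RANK[config.min_severity]:
        return False
    return finding.confidence >= config.min_confidence


config = RepoConfig(min_severity="medium")
findings = [
    Finding("security", "src/auth/login.py", "high", 0.9),
    Finding("style", "src/auth/login.py", "low", 0.95),       # below min_severity
    Finding("security", "vendor/lib/util.py", "critical", 0.99),  # excluded path
    Finding("performance", "src/db/query.py", "medium", 0.4),  # low confidence
]
kept = [f for f in findings if should_report(f, config)]
print([f.file_path for f in kept])
```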

Step 8: Set Up CI/CD (GitHub Actions)

Created two workflows:

ci.yml: Runs on every push/PR

steps:
  - Lint with ruff (catches style/import issues)
  - Type check with mypy (catches type errors)
  - Run tests with pytest

prewarm.yml: Cron job every 10 minutes on weekdays

schedule: "*/10 6-20 * * 1-5"  # Every 10 min during hours 06-20 UTC (last run 20:50), Mon-Fri
steps:
  - curl the /health endpoint to prevent Render cold starts

Why pre-warm? Render's free tier spins down after 15 minutes of inactivity. The first request after spindown takes ~30 seconds (cold start). By pinging /health every 10 minutes during working hours, the service stays warm and responds instantly to webhooks.
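To see exactly which minutes the cron expression covers, here is a small stdlib sketch that re-implements the match for this one expression. The function name prewarm_fires_at is hypothetical, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timezone


def prewarm_fires_at(moment: datetime) -> bool:
    """True if '*/10 6-20 * * 1-5' matches this minute (evaluated in UTC)."""
    utc = moment.astimezone(timezone.utc)
    every_ten_minutes = utc.minute % 10 == 0   # */10
    within_hours = 6 <= utc.hour <= 20         # 6-20, both ends inclusive
    is_weekday = utc.weekday() < 5             # cron 1-5 is Monday to Friday
    return every_ten_minutes and within_hours and is_weekday


thursday_evening = datetime(2026, 3, 19, 20, 50, tzinfo=timezone.utc)
thursday_night = datetime(2026, 3, 19, 21, 0, tzinfo=timezone.utc)
saturday_noon = datetime(2026, 3, 21, 12, 0, tzinfo=timezone.utc)

print(prewarm_fires_at(thursday_evening))
print(prewarm_fires_at(thursday_night))
print(prewarm_fires_at(saturday_noon))
```

Because the hour range is inclusive, the last ping of the day lands at 20:50 UTC. The 10-minute interval matters because it is shorter than the 15-minute inactivity limit; outside the window the service can still spin down.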

Step 9: Write Initial Tests

Created: tests/unit/test_findings_schema.py (8 tests for data model validation)

These tests verify:

  • Valid Finding objects are accepted
  • Invalid agent types are rejected
  • Invalid severity levels are rejected
  • Confidence must be between 0.0 and 1.0
  • CWE ID is optional (None allowed)
  • Health score must be 0-100
  • Invalid recommendation values are rejected
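A few of these checks might look like the sketch below, assuming pytest and Pydantic v2. The models are trimmed to the fields under test, and the Field bounds and the None default for cwe_id are assumptions; the actual contents of test_findings_schema.py are not reproduced here.

```python
from typing import Literal, Optional

import pytest
from pydantic import BaseModel, Field, ValidationError


class Finding(BaseModel):  # trimmed to the fields exercised below
    agent: Literal["security", "performance", "style"]
    severity: Literal["critical", "high", "medium", "low"]
    cwe_id: Optional[str] = None               # assumption: defaults to None
    confidence: float = Field(ge=0.0, le=1.0)  # assumption: bounds via Field


class SynthesizedReview(BaseModel):  # trimmed likewise
    health_score: int = Field(ge=0, le=100)    # assumption: bounds via Field
    recommendation: Literal["approve", "request_changes", "block"]


VALID = {"agent": "security", "severity": "high", "confidence": 0.9}


def test_valid_finding_is_accepted():
    finding = Finding(**VALID)
    assert finding.cwe_id is None  # CWE ID may be omitted


@pytest.mark.parametrize(
    "override",
    [
        {"agent": "docs"},        # unknown agent
        {"severity": "urgent"},   # unknown severity
        {"confidence": 1.5},      # above 1.0
        {"confidence": -0.1},     # below 0.0
    ],
)
def test_invalid_finding_is_rejected(override):
    with pytest.raises(ValidationError):
        Finding(**{**VALID, **override})


def test_health_score_must_be_in_range():
    with pytest.raises(ValidationError):
        SynthesizedReview(health_score=101, recommendation="approve")


def test_invalid_recommendation_is_rejected():
    with pytest.raises(ValidationError):
        SynthesizedReview(health_score=80, recommendation="maybe")
```

Parametrising the rejection cases keeps one test body per rule while still reporting each bad value as its own test result.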

Files Created in Week 1

| File | Purpose |
| --- | --- |
| app/__init__.py | Makes app a Python package |
| app/config.py | Centralized configuration via environment variables |
| app/main.py | FastAPI app with /health endpoint (expanded in Week 2) |
| app/models/__init__.py | Models package |
| app/models/findings.py | Finding, SynthesizedReview, PRReviewRecord schemas |
| app/models/webhook_payloads.py | GitHub webhook event payload types |
| tests/conftest.py | Shared test fixtures (sample finding data) |
| tests/unit/test_findings_schema.py | 8 schema validation tests |
| .env | Environment variables (gitignored, contains secrets) |
| .env.example | Template for .env (committed, no secrets) |
| .gitignore | Files to exclude from git |
| pyproject.toml | Project metadata + tool configs |
| requirements.txt | Python production dependencies |
| requirements-dev.txt | Dev/test dependencies |
| render.yaml | Render.com deployment config |
| sentinel.yml.example | Per-repo config template |
| .github/workflows/ci.yml | CI pipeline (lint + test) |
| .github/workflows/prewarm.yml | Render pre-warm cron |
| keys/.gitignore | Prevents .pem files from being committed |
| PROJECT_PLAN.md | Master project plan + progress tracker |

Key Decisions Made

| Decision | Rationale |
| --- | --- |
| Pydantic for all data models | Runtime validation + IDE autocomplete + auto-docs |
| pydantic-settings for config | Type-safe env vars, auto-loads .env, 12-factor pattern |
| FastAPI (not Flask/Django) | Async-native (needed for parallel agents), auto OpenAPI docs, modern Python |
| GitHub App (not Action) | One deployment serves all repos, webhook-driven, own bot identity |
| Upstash Redis (not in-memory cache) | Persists across Render restarts, shared across workers |
| Neon.tech (not SQLite) | Serverless, accessible from dashboard, persistent storage |

Documentation written 2026-03-19 as part of Week 1 completion.