Week 1: Foundation & Setup — Detailed Documentation

Goal: Project skeleton running locally, all external services provisioned.

Status: Complete

Date: 2026-03-19

What We Accomplished

Week 1 established the entire project foundation: directory structure, configuration system, data models, external service accounts, CI/CD pipeline, and the initial deployment config.

Step-by-Step Log

Step 1: Initialize the Project

What we did: Created the project directory structure following a modular Python backend architecture with clear separation of concerns.

Why this structure matters:

app/                    ← All backend application code lives here
  agents/               ← One file per agent (security, performance, style, synthesizer)
  tools/                ← LangChain tool wrappers (semgrep, bandit, radon, etc.)
  context/              ← RAG pipeline (embedder → indexer → retriever)
  github/               ← All GitHub API interaction (webhook, auth, client, formatter)
  models/               ← Pydantic data models (Finding, PRReview, webhook payloads)
  db/                   ← Database & cache (Postgres, Redis)
  services/             ← Business logic (orchestrator, health score calculator)
dashboard/              ← Next.js frontend (deployed separately to Vercel)
tests/                  ← Mirrors the app/ structure (unit/, integration/, eval/)
prompts/                ← Agent system prompts as Markdown files
knowledge/              ← RAG knowledge bases (OWASP, DDIA, style guides)
docs/                   ← Project documentation (this file)

Key principle: Each directory has a single responsibility. The agents/ folder doesn't know about GitHub. The github/ folder doesn't know about LangChain. The services/ folder orchestrates between them. This is called separation of concerns — it makes the code testable, maintainable, and easy to explain in interviews.

Commands run:

# Create all directories
mkdir -p app/{agents,tools,context,github,models,db,services}
mkdir -p dashboard/{app/{repos,api},components,lib}
mkdir -p tests/{unit,integration,eval/dataset}
mkdir -p prompts knowledge/style_guides

# Create __init__.py files (makes directories Python packages)
touch app/__init__.py app/agents/__init__.py app/tools/__init__.py ...

# Initialize git
git init && git branch -m main
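The same layout can also be created from Python, which is handy on shells without brace expansion. This is a minimal sketch, not part of the project: the `scaffold` function and `PROJECT_DIRS` name are hypothetical, and the directory list is copied from the `mkdir` commands above. It does not create the `__init__.py` files.

```python
from pathlib import Path

# Directory list copied from the mkdir commands above.
PROJECT_DIRS = [
    "app/agents", "app/tools", "app/context", "app/github",
    "app/models", "app/db", "app/services",
    "dashboard/app/repos", "dashboard/app/api",
    "dashboard/components", "dashboard/lib",
    "tests/unit", "tests/integration", "tests/eval/dataset",
    "prompts", "knowledge/style_guides",
]


def scaffold(root: Path) -> list:
    """Create every project directory under root; safe to re-run."""
    created = []
    for rel in PROJECT_DIRS:
        path = root / rel
        # parents=True and exist_ok=True give the same behaviour as `mkdir -p`.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `scaffold(Path("."))` from the repository root would reproduce the tree; running it twice is harmless.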

Step 2: Create Configuration System (app/config.py)

What we did: Created a centralized configuration file using pydantic-settings.

How it works:

from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    groq_api_key: str = ""
    github_app_id: str = ""
    # ... all config vars

    model_config = {"env_file": ".env"}

settings = Settings()  # Singleton — imported everywhere

Why pydantic-settings instead of plain os.environ?

Type safety — confidence_threshold: float = 0.6 ensures it's a float, not a string
Validation — pydantic raises a clear error at startup if a value cannot be parsed as its declared type, or if a field declared without a default is missing
Defaults — each setting has a sensible default for development
Auto-loads .env — reads from .env file automatically (via model_config)
IDE autocomplete — settings.groq_api_key instead of os.environ.get("GROQ_API_KEY")

Interview talking point: "We use pydantic-settings for type-safe configuration management following the 12-factor app methodology — config lives in environment variables, not in code. This makes the same codebase work in development, staging, and production with zero code changes."

Step 3: Define Data Models (app/models/findings.py)

What we did: Created Pydantic models that define the exact shape of data flowing through the system.

Three core models:

Finding — Output of each domain agent

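# Imports shared by the three models in this file
from typing import Literal, Optional
from uuid import UUID

from pydantic import BaseModel

class Finding(BaseModel):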
    agent: Literal["security", "performance", "style"]  # Which agent found this
    file_path: str              # e.g. "src/auth/login.py"
    line_start: int             # Where the issue starts
    line_end: int               # Where the issue ends
    severity: Literal["critical", "high", "medium", "low"]  # How bad is it
    category: str               # e.g. "sql_injection", "n+1_query"
    title: str                  # One-liner for the inline comment header
    description: str            # Full explanation
    suggested_fix: str          # Corrected code snippet
    cwe_id: Optional[str]       # CWE ID for security findings (e.g. "CWE-89")
    confidence: float           # 0.0–1.0, how sure the agent is

SynthesizedReview — Output of the Synthesizer Agent

class SynthesizedReview(BaseModel):
    health_score: int           # 0-100 (the headline metric)
    executive_summary: str      # 3-5 sentences for PR description
    recommendation: Literal["approve", "request_changes", "block"]
    findings: list[Finding]     # Deduplicated, re-ranked findings
    critical_count: int         # Counts by severity
    # ...

PRReviewRecord — What gets stored in Postgres

class PRReviewRecord(BaseModel):
    id: UUID                    # Primary key
    repo_full_name: str         # "ninjacode911/myapp"
    pr_number: int
    commit_sha: str
    health_score: int
    findings: list[Finding]     # Full findings as JSONB
    duration_ms: int            # How long the review took

Why Pydantic models instead of plain dicts?

Validation — severity: Literal["critical", "high", "medium", "low"] rejects invalid values
Serialization — .model_dump() converts to dict, .model_dump_json() to JSON
Documentation — the schema IS the documentation
Type checking — mypy catches bugs at development time, not production

Interview talking point: "Every data boundary in the system uses Pydantic models — agent outputs, API responses, database records. This gives us runtime validation, IDE autocomplete, and auto-generated OpenAPI docs. If an agent returns malformed JSON, Pydantic catches it immediately instead of letting bad data propagate through the pipeline."
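As an illustration of that last point, here is a minimal sketch of rejecting malformed agent output at the boundary. It assumes Pydantic v2. The model is trimmed to four fields, `parse_agent_output` is a hypothetical helper, and the `Field(ge=..., le=...)` bounds and the `None` default for `cwe_id` are assumptions about how the real model enforces its rules.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class Finding(BaseModel):
    # Trimmed to four fields for illustration.
    agent: Literal["security", "performance", "style"]
    severity: Literal["critical", "high", "medium", "low"]
    cwe_id: Optional[str] = None  # assumption: defaults to None when omitted
    confidence: float = Field(ge=0.0, le=1.0)  # assumption: bounds set via Field


def parse_agent_output(raw_json: str) -> Optional[Finding]:
    """Return a Finding, or None if the agent produced malformed output."""
    try:
        return Finding.model_validate_json(raw_json)
    except ValidationError:
        # Covers invalid JSON syntax as well as schema violations.
        return None


good = parse_agent_output('{"agent": "security", "severity": "high", "confidence": 0.9}')
bad = parse_agent_output('{"agent": "security", "severity": "urgent", "confidence": 0.9}')
```

Here `good` is a validated `Finding`, while `bad` is `None` because `"urgent"` is not one of the allowed severity values. A real pipeline would probably log the validation error rather than discard it.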

Step 4: Define Webhook Payload Models (app/models/webhook_payloads.py)

What we did: Created typed models for GitHub's webhook JSON payloads.

Why type the webhook payload? GitHub sends complex nested JSON. Without types, you'd write:

sha = payload["pull_request"]["head"]["sha"]  # Easy to typo, no autocomplete

With Pydantic models:

event = PullRequestEvent(**payload)
sha = event.pull_request.head.sha  # Autocomplete, type-checked

We didn't use these models in the final webhook handler (we used raw dict access for simplicity), but they're available for stricter validation later.
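For reference, a minimal sketch of what such models might look like, assuming Pydantic v2. Only the fields needed to reach the head SHA are declared; the class names other than `PullRequestEvent` are hypothetical, and the sample payload is invented for the demonstration.

```python
from pydantic import BaseModel


# Hypothetical minimal models; field names follow GitHub's pull_request payload.
class Head(BaseModel):
    sha: str


class PullRequest(BaseModel):
    number: int
    head: Head


class PullRequestEvent(BaseModel):
    action: str
    pull_request: PullRequest


payload = {
    "action": "opened",
    "pull_request": {"number": 1, "head": {"sha": "abc123", "ref": "feature"}},
    "sender": {"login": "octocat"},  # undeclared keys are ignored by default
}

event = PullRequestEvent(**payload)
print(event.pull_request.head.sha)
```

Because Pydantic ignores undeclared keys by default, the models only need to describe the parts of the payload the handler actually reads.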

Step 5: Create FastAPI Skeleton (app/main.py)

What we did: Created the FastAPI application with a /health endpoint.

from fastapi import FastAPI

app = FastAPI(title="Ninja Code Guard", version="0.1.0")

@app.get("/health")
async def health_check():
    return {"status": "ok", "service": "Ninja Code Guard", "version": "0.1.0"}

Why a /health endpoint?

Render.com uses it to know if your service is alive (configured in render.yaml)
GitHub Actions cron pings it every 10 minutes to prevent cold starts
The dashboard calls it to show service status
Load balancers (if you scale up) use it to route traffic only to healthy instances

Step 6: Provision External Services

What we did: Created accounts and obtained credentials for all external services.

6a. GitHub App — "Ninja's Code Guard"

Where: github.com/settings/apps/new

What we configured:

| Setting | Value | Reason |
|---|---|---|
| Name | Ninja Code Guard | Bot identity: `ninjas-code-guard[bot]` |
| Homepage URL | github.com/ninjacode911/codeprobe | Points to our repo |
| Webhook Active | Yes | We need to receive PR events |
| Webhook Secret | (generated with `python -c "import secrets; print(secrets.token_hex(32))"`) | HMAC authentication |
| Contents | Read | Fetch full file source code for RAG context |
| Pull requests | Read & Write | Read diffs, post review comments |
| Commit statuses | Write | Show health score as commit status check |
| Metadata | Read | Required — basic repo info |
| Events | pull_request, pull_request_review_comment | Our trigger events |
| Install target | Only this account | Dev-mode only for now |

What we got:

App ID: 3133457
Private Key: .pem file saved to keys/ninja-s-code-guard.2026-03-19.private-key.pem
Webhook Secret: saved to .env
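The webhook secret is what makes HMAC authentication possible: GitHub signs each delivery body with it and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The following is a minimal sketch of the check a receiver can perform. `verify_signature` is a hypothetical name, not the project's handler, and the demo values are generated locally.

```python
import hashlib
import hmac
import secrets


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check a GitHub-style X-Hub-Signature-256 header against the raw body."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


# Demo values: a fresh secret and a signature computed the way GitHub would.
demo_secret = secrets.token_hex(32)
demo_body = b'{"action": "opened"}'
demo_header = "sha256=" + hmac.new(
    demo_secret.encode(), demo_body, hashlib.sha256
).hexdigest()
```

The signature must be computed over the raw request bytes, before any JSON parsing, because re-serialized JSON is not guaranteed to match byte for byte.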

How GitHub App authentication works (important concept):

Step 1: Sign a JWT with our private key (.pem)
        JWT payload = {iss: APP_ID, iat: now, exp: now+9min}
        Signed with RS256 (RSA + SHA-256)
        This proves: "I am the Ninja Code Guard app"

Step 2: Exchange JWT for an installation access token
        POST /app/installations/{id}/access_tokens
        Headers: Authorization: Bearer <JWT>
        Returns: token valid for 1 hour, scoped to installed repos
        This proves: "I can access ninjacode911's repos"

Step 3: Use installation token for all API calls
        GET /repos/ninjacode911/codeguard-test/pulls/1
        Headers: Authorization: token <installation_token>
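Step 1 can be sketched with the standard library up to the point of signing. The helper names below are hypothetical. The RS256 signature itself needs an RSA library: with PyJWT and `cryptography` installed, `jwt.encode(claims, private_key_pem, algorithm="RS256")` would produce the complete token.

```python
import base64
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_app_jwt_claims(app_id: str, now=None) -> dict:
    """Claims from Step 1: issuer is the App ID, lifetime is 9 minutes."""
    issued_at = int(time.time()) if now is None else now
    return {"iss": app_id, "iat": issued_at, "exp": issued_at + 9 * 60}


def jwt_signing_input(claims: dict) -> str:
    """Return '<header>.<payload>', the string that RS256 signs."""
    header = {"alg": "RS256", "typ": "JWT"}
    encoded_header = b64url(json.dumps(header, separators=(",", ":")).encode())
    encoded_claims = b64url(json.dumps(claims, separators=(",", ":")).encode())
    return encoded_header + "." + encoded_claims


claims = build_app_jwt_claims("3133457")
unsigned = jwt_signing_input(claims)
```

The finished token is `unsigned` plus a dot and the base64url-encoded signature. Keeping the lifetime at 9 minutes stays under GitHub's 10-minute limit for app JWTs.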

6b. Groq API

Where: console.groq.com
What: API key for Llama-3.1-70B inference (14,400 free requests/day)
Saved as: GROQ_API_KEY in .env

6c. Neon.tech Postgres

Where: console.neon.tech
What: Serverless Postgres database (512MB free tier)
Saved as: DATABASE_URL in .env
Used for: Storing PR review history, health score trends, finding details

6d. Upstash Redis

Where: console.upstash.com
What: Serverless Redis (10K requests/day free tier)
Saved as: UPSTASH_REDIS_URL in .env
Used for: Caching reviewed commit SHAs to prevent duplicate analysis
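One way the duplicate check could work is an atomic "set if not exists" with a time-to-live. The sketch below is an assumption, not the project's code: `claim_commit`, the `reviewed:` key format, and the 24-hour TTL are all hypothetical, and `InMemoryCache` is a stand-in that mimics only the `set(key, value, nx=True, ex=ttl)` call a redis-py client offers.

```python
import time


class InMemoryCache:
    """Stand-in for a Redis client; mimics only set(..., nx=True, ex=ttl)."""

    def __init__(self):
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        existing = self._store.get(key)
        if nx and existing is not None and existing[1] > now:
            return None  # redis-py also returns None when NX blocks the write
        expires_at = now + ex if ex is not None else float("inf")
        self._store[key] = (value, expires_at)
        return True


def claim_commit(cache, repo_full_name: str, commit_sha: str,
                 ttl_seconds: int = 86400) -> bool:
    """Return True only for the first caller to see this repo and commit."""
    key = f"reviewed:{repo_full_name}:{commit_sha}"  # hypothetical key format
    return bool(cache.set(key, "1", nx=True, ex=ttl_seconds))


cache = InMemoryCache()
first = claim_commit(cache, "ninjacode911/myapp", "abc123")
second = claim_commit(cache, "ninjacode911/myapp", "abc123")
```

Here `first` is `True` and `second` is `False`, so a redelivered webhook for the same commit would be skipped. Doing the check and the write in one call avoids the race where two workers both see "not reviewed yet".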

Step 7: Create Configuration Files

.env.example

Template showing all required environment variables without actual values. Committed to git so new developers know what to configure.

.gitignore

Prevents sensitive files from being committed:

.env (contains API keys)
keys/ (contains private key .pem)
__pycache__/, .venv/ (generated files)
chroma_data/ (vector store data)
dashboard/node_modules/, dashboard/.next/ (Node.js generated)

pyproject.toml

Project metadata + tool configuration:

[tool.ruff] — Python linter settings
[tool.pytest] — Test configuration (asyncio mode, test paths)
[tool.mypy] — Type checker settings

render.yaml

Render.com deployment configuration:

services:
  - type: web
    name: ninja-code-guard
    buildCommand: pip install -r requirements.txt
    startCommand: uvicorn app.main:app --host 0.0.0.0 --port $PORT
    healthCheckPath: /health
    plan: free

sentinel.yml.example

Per-repo configuration template that users place in their repo root:

agents:
  security: true
  performance: true
  style: true
min_severity: low
min_confidence: 0.6
exclude:
  - "vendor/"
  - "node_modules/"
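To make the intent of these settings concrete, here is a minimal sketch of how they might be applied to a list of findings. The semantics are assumptions: `exclude` entries are treated as path prefixes, severities are ordered low < medium < high < critical, and `min_confidence` is inclusive. `RepoConfig` and `keep_finding` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed ordering of severity levels, lowest first.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class RepoConfig:
    agents: dict
    min_severity: str = "low"
    min_confidence: float = 0.6
    exclude: tuple = ("vendor/", "node_modules/")


def keep_finding(finding: dict, config: RepoConfig) -> bool:
    """Apply the per-repo settings to a single finding."""
    if not config.agents.get(finding["agent"], False):
        return False  # this agent is switched off for the repo
    if any(finding["file_path"].startswith(p) for p in config.exclude):
        return False  # assumption: exclude entries are path prefixes
    if SEVERITY_RANK[finding["severity"]] < SEVERITY_RANK[config.min_severity]:
        return False
    return finding["confidence"] >= config.min_confidence


config = RepoConfig(agents={"security": True, "performance": True, "style": False})
sample = {"agent": "security", "file_path": "src/auth/login.py",
          "severity": "high", "confidence": 0.9}
kept = keep_finding(sample, config)
```

Filtering per finding keeps the agents themselves unaware of repository preferences, which fits the separation of concerns described in Step 1.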

Step 8: Set Up CI/CD (GitHub Actions)

Created two workflows:

ci.yml — Runs on every push/PR

steps:
  - Lint with ruff (catches style/import issues)
  - Type check with mypy (catches type errors)
  - Run tests with pytest

prewarm.yml — Cron job every 10 minutes on weekdays

schedule: "*/10 6-20 * * 1-5"  # Every 10 min from 06:00 to 20:50 UTC, Mon-Fri
steps:
  - curl the /health endpoint to prevent Render cold starts

Why pre-warm? Render's free tier spins down after 15 minutes of inactivity. The first request after spindown takes ~30 seconds (cold start). By pinging /health every 10 minutes during working hours, the service stays warm and responds instantly to webhooks.
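To check exactly which times the cron expression `*/10 6-20 * * 1-5` covers, the rule can be written out as a small predicate. This is a sketch for reasoning about the schedule only; `matches_prewarm_schedule` is a hypothetical helper and expects timezone-aware datetimes.

```python
from datetime import datetime, timezone


def matches_prewarm_schedule(moment: datetime) -> bool:
    """True if `moment` is a firing time of the cron '*/10 6-20 * * 1-5'."""
    utc = moment.astimezone(timezone.utc)
    return (
        utc.minute % 10 == 0        # */10: minutes 0, 10, 20, 30, 40, 50
        and 6 <= utc.hour <= 20     # 6-20: the hour field is inclusive
        and utc.weekday() <= 4      # 1-5: Monday (0 in Python) to Friday (4)
    )


last_run = datetime(2026, 3, 19, 20, 50, tzinfo=timezone.utc)  # a Thursday
after_hours = datetime(2026, 3, 19, 21, 0, tzinfo=timezone.utc)
```

Because the hour range is inclusive, the last ping of the day is at 20:50 UTC, so `matches_prewarm_schedule(last_run)` is `True` while `after_hours` does not match. With a 15-minute spin-down, the service would go cold shortly after 21:05 UTC.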

Step 9: Write Initial Tests

Created: tests/unit/test_findings_schema.py — 8 tests for data model validation

These tests verify:

Valid Finding objects are accepted
Invalid agent types are rejected
Invalid severity levels are rejected
Confidence must be between 0.0 and 1.0
CWE ID is optional (None allowed)
Health score must be 0-100
Invalid recommendation values are rejected

Files Created in Week 1

| File | Purpose |
|---|---|
| `app/__init__.py` | Makes app a Python package |
| `app/config.py` | Centralized configuration via environment variables |
| `app/main.py` | FastAPI app with /health endpoint (expanded in Week 2) |
| `app/models/__init__.py` | Models package |
| `app/models/findings.py` | Finding, SynthesizedReview, PRReviewRecord schemas |
| `app/models/webhook_payloads.py` | GitHub webhook event payload types |
| `tests/conftest.py` | Shared test fixtures (sample finding data) |
| `tests/unit/test_findings_schema.py` | 8 schema validation tests |
| `.env` | Environment variables (gitignored — contains secrets) |
| `.env.example` | Template for .env (committed — no secrets) |
| `.gitignore` | Files to exclude from git |
| `pyproject.toml` | Project metadata + tool configs |
| `requirements.txt` | Python production dependencies |
| `requirements-dev.txt` | Dev/test dependencies |
| `render.yaml` | Render.com deployment config |
| `sentinel.yml.example` | Per-repo config template |
| `.github/workflows/ci.yml` | CI pipeline (lint + test) |
| `.github/workflows/prewarm.yml` | Render pre-warm cron |
| `keys/.gitignore` | Prevents .pem files from being committed |
| `PROJECT_PLAN.md` | Master project plan + progress tracker |

Key Decisions Made

| Decision | Rationale |
|---|---|
| Pydantic for all data models | Runtime validation + IDE autocomplete + auto-docs |
| pydantic-settings for config | Type-safe env vars, auto-loads .env, 12-factor pattern |
| FastAPI (not Flask/Django) | Async-native (needed for parallel agents), auto OpenAPI docs, modern Python |
| GitHub App (not Action) | One deployment serves all repos, webhook-driven, own bot identity |
| Upstash Redis (not in-memory cache) | Persists across Render restarts, shared across workers |
| Neon.tech (not SQLite) | Serverless, accessible from dashboard, persistent storage |

Documentation written 2026-03-19 as part of Week 1 completion.