	# Week 1: Foundation & Setup — Detailed Documentation

	> Goal: Project skeleton running locally, all external services provisioned.
	> Status: Complete
	> Date: 2026-03-19

	---

	## What We Accomplished

	Week 1 established the entire project foundation: directory structure, configuration system,
	data models, external service accounts, CI/CD pipeline, and the initial deployment config.

	---

	## Step-by-Step Log

	### Step 1: Initialize the Project

	What we did: Created the project directory structure following a modular Python backend
	architecture with clear separation of concerns.

	Why this structure matters:
	```
	app/            ← All backend application code lives here
	  agents/       ← One file per agent (security, performance, style, synthesizer)
	  tools/        ← LangChain tool wrappers (semgrep, bandit, radon, etc.)
	  context/      ← RAG pipeline (embedder → indexer → retriever)
	  github/       ← All GitHub API interaction (webhook, auth, client, formatter)
	  models/       ← Pydantic data models (Finding, PRReview, webhook payloads)
	  db/           ← Database & cache (Postgres, Redis)
	  services/     ← Business logic (orchestrator, health score calculator)
	dashboard/      ← Next.js frontend (deployed separately to Vercel)
	tests/          ← Mirrors the app/ structure (unit/, integration/, eval/)
	prompts/        ← Agent system prompts as Markdown files
	knowledge/      ← RAG knowledge bases (OWASP, DDIA, style guides)
	docs/           ← Project documentation (this file)
	```

	Key principle: Each directory has a single responsibility. The `agents/` folder doesn't
	know about GitHub. The `github/` folder doesn't know about LangChain. The `services/`
	folder orchestrates between them. This is called separation of concerns — it makes the
	code testable, maintainable, and easy to explain in interviews.

	Commands run:
	```bash
	# Create all directories
	mkdir -p app/{agents,tools,context,github,models,db,services}
	mkdir -p dashboard/{app/{repos,api},components,lib}
	mkdir -p tests/{unit,integration,eval/dataset}
	mkdir -p prompts knowledge/style_guides

	# Create __init__.py files (makes directories Python packages)
	touch app/__init__.py app/agents/__init__.py app/tools/__init__.py ...

	# Initialize git
	git init && git branch -m main
	```

	### Step 2: Create Configuration System (app/config.py)

	What we did: Created a centralized configuration file using `pydantic-settings`.

	How it works:
	```python
	from pydantic_settings import BaseSettings

	class Settings(BaseSettings):
	    groq_api_key: str = ""
	    github_app_id: str = ""
	    # ... all config vars

	    model_config = {"env_file": ".env"}

	settings = Settings()  # Singleton: imported everywhere
	```

	Why pydantic-settings instead of plain os.environ?
	1. Type safety — `confidence_threshold: float = 0.6` ensures it's a float, not a string
	2. Validation — pydantic raises a clear error at startup if a value cannot be parsed into its declared type, or if a field declared without a default is missing
	3. Defaults — each setting has a sensible default for development
	4. Auto-loads .env — reads from `.env` file automatically (via `model_config`)
	5. IDE autocomplete — `settings.groq_api_key` instead of `os.environ.get("GROQ_API_KEY")`

	Interview talking point: "We use pydantic-settings for type-safe configuration management
	following the 12-factor app methodology — config lives in environment variables, not in code.
	This makes the same codebase work in development, staging, and production with zero code changes."

	### Step 3: Define Data Models (app/models/findings.py)

	What we did: Created Pydantic models that define the exact shape of data flowing through
	the system.

	Three core models:

	#### Finding — Output of each domain agent
	```python
	from typing import Literal, Optional

	from pydantic import BaseModel

	class Finding(BaseModel):
	    agent: Literal["security", "performance", "style"]  # Which agent found this
	    file_path: str  # e.g. "src/auth/login.py"
	    line_start: int  # Where the issue starts
	    line_end: int  # Where the issue ends
	    severity: Literal["critical", "high", "medium", "low"]  # How bad is it
	    category: str  # e.g. "sql_injection", "n+1_query"
	    title: str  # One-liner for the inline comment header
	    description: str  # Full explanation
	    suggested_fix: str  # Corrected code snippet
	    cwe_id: Optional[str]  # CWE ID for security findings (e.g. "CWE-89")
	    confidence: float  # 0.0–1.0, how sure the agent is
	```

	#### SynthesizedReview — Output of the Synthesizer Agent
	```python
	class SynthesizedReview(BaseModel):
	    health_score: int  # 0-100 (the headline metric)
	    executive_summary: str  # 3-5 sentences for PR description
	    recommendation: Literal["approve", "request_changes", "block"]
	    findings: list[Finding]  # Deduplicated, re-ranked findings
	    critical_count: int  # Counts by severity
	    # ...
	```

	#### PRReviewRecord — What gets stored in Postgres
	```python
	from uuid import UUID

	class PRReviewRecord(BaseModel):
	    id: UUID  # Primary key
	    repo_full_name: str  # "ninjacode911/myapp"
	    pr_number: int
	    commit_sha: str
	    health_score: int
	    findings: list[Finding]  # Full findings as JSONB
	    duration_ms: int  # How long the review took
	```

	Why Pydantic models instead of plain dicts?
	1. Validation — `severity: Literal["critical", "high", "medium", "low"]` rejects invalid values
	2. Serialization — `.model_dump()` converts to dict, `.model_dump_json()` to JSON
	3. Documentation — the schema IS the documentation
	4. Type checking — mypy catches bugs at development time, not production
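
The validation and serialization points can be sketched as follows. `DemoFinding` is a trimmed-down, hypothetical version of `Finding`, and expressing the confidence range with `Field(ge=..., le=...)` is an assumption about how the real model enforces it.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class DemoFinding(BaseModel):
    # Trimmed-down, hypothetical version of Finding for illustration.
    agent: Literal["security", "performance", "style"]
    severity: Literal["critical", "high", "medium", "low"]
    title: str
    cwe_id: Optional[str] = None
    # Assumption: the 0.0-1.0 range is enforced with Field constraints.
    confidence: float = Field(ge=0.0, le=1.0)


ok = DemoFinding(agent="security", severity="high",
                 title="SQL built by string concatenation",
                 cwe_id="CWE-89", confidence=0.9)
as_dict = ok.model_dump()        # plain dict, e.g. for a JSONB column
as_json = ok.model_dump_json()   # JSON string, e.g. for an API response


def is_valid(**fields) -> bool:
    """Return True if the fields form a valid DemoFinding."""
    try:
        DemoFinding(**fields)
        return True
    except ValidationError:
        return False


bad_severity = is_valid(agent="security", severity="urgent", title="x", confidence=0.5)
bad_confidence = is_valid(agent="style", severity="low", title="x", confidence=1.7)
print(bad_severity, bad_confidence)
```

A plain dict would have accepted both of the rejected inputs silently, and the bad values would only show up downstream.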

	Interview talking point: "Every data boundary in the system uses Pydantic models — agent
	outputs, API responses, database records. This gives us runtime validation, IDE autocomplete,
	and auto-generated OpenAPI docs. If an agent returns malformed JSON, Pydantic catches it
	immediately instead of letting bad data propagate through the pipeline."

	### Step 4: Define Webhook Payload Models (app/models/webhook_payloads.py)

	What we did: Created typed models for GitHub's webhook JSON payloads.

	Why type the webhook payload?
	GitHub sends complex nested JSON. Without types, you'd write:
	```python
	sha = payload["pull_request"]["head"]["sha"] # Easy to typo, no autocomplete
	```
	With Pydantic models:
	```python
	event = PullRequestEvent(**payload)
	sha = event.pull_request.head.sha # Autocomplete, type-checked
	```

	We didn't use these models in the final webhook handler (we used raw dict access for
	simplicity), but they're available for stricter validation later.

	### Step 5: Create FastAPI Skeleton (app/main.py)

	What we did: Created the FastAPI application with a `/health` endpoint.

	```python
	from fastapi import FastAPI

	app = FastAPI(title="Ninja Code Guard", version="0.1.0")

	@app.get("/health")
	async def health_check():
	    return {"status": "ok", "service": "Ninja Code Guard", "version": "0.1.0"}
	```

	Why a /health endpoint?
	- Render.com uses it to know if your service is alive (configured in render.yaml)
	- GitHub Actions cron pings it every 10 minutes to prevent cold starts
	- The dashboard calls it to show service status
	- Load balancers (if you scale up) use it to route traffic only to healthy instances

	### Step 6: Provision External Services

	What we did: Created accounts and obtained credentials for all external services.

	#### 6a. GitHub App — "Ninja's Code Guard"

	Where: github.com/settings/apps/new

	What we configured:
	| Setting | Value | Reason |
	|---------|-------|--------|
	| Name | Ninja's Code Guard | Bot identity: `ninjas-code-guard[bot]` |
	| Homepage URL | github.com/ninjacode911/codeprobe | Points to our repo |
	| Webhook Active | Yes | We need to receive PR events |
	| Webhook Secret | (generated with `python -c "import secrets; print(secrets.token_hex(32))"`) | HMAC authentication |
	| Contents | Read | Fetch full file source code for RAG context |
	| Pull requests | Read & Write | Read diffs, post review comments |
	| Commit statuses | Write | Show health score as commit status check |
	| Metadata | Read | Required — basic repo info |
	| Events | pull_request, pull_request_review_comment | Our trigger events |
	| Install target | Only this account | Dev-mode only for now |
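
The "HMAC authentication" row refers to GitHub signing each webhook body with the shared secret and sending the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal sketch of that check follows; the function names are illustrative and not taken from the project's webhook handler.

```python
import hashlib
import hmac
import secrets


def sign_payload(secret: str, body: bytes) -> str:
    """Compute the value GitHub places in the X-Hub-Signature-256 header."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Check a received signature against the raw request body."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, header_value)


secret = secrets.token_hex(32)            # same generator as in the table above
body = b'{"action": "opened", "number": 1}'
header = sign_payload(secret, body)       # stands in for what GitHub would send

print(verify_signature(secret, body, header))          # untouched body
print(verify_signature(secret, body + b" ", header))   # tampered body
```

The signature must be computed over the raw request bytes, before any JSON parsing, because re-serializing the payload can change whitespace and key order.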

	What we got:
	- App ID: 3133457
	- Private Key: `.pem` file saved to `keys/ninja-s-code-guard.2026-03-19.private-key.pem`
	- Webhook Secret: saved to `.env`

	How GitHub App authentication works (important concept):
	```
	Step 1: Sign a JWT with our private key (.pem)
	  JWT payload = {iss: APP_ID, iat: now, exp: now+9min}
	  Signed with RS256 (RSA + SHA-256)
	  This proves: "I am the Ninja Code Guard app"

	Step 2: Exchange JWT for an installation access token
	  POST /app/installations/{id}/access_tokens
	  Headers: Authorization: Bearer <JWT>
	  Returns: token valid for 1 hour, scoped to installed repos
	  This proves: "I can access ninjacode911's repos"

	Step 3: Use installation token for all API calls
	  GET /repos/ninjacode911/codeguard-test/pulls/1
	  Headers: Authorization: token <installation_token>
	```
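
Step 1 can be sketched with the standard library up to the point of signing. The helper names and the placeholder App ID below are hypothetical, and the RS256 signature itself is deliberately left out because it needs an RSA-capable library (for example PyJWT's `jwt.encode(claims, private_key, algorithm="RS256")`).

```python
import base64
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_app_jwt_claims(app_id: str, now: int) -> dict:
    """Claims for Step 1: issuer is the App ID, lifetime is 9 minutes."""
    return {"iss": app_id, "iat": now, "exp": now + 9 * 60}


def jwt_signing_input(claims: dict) -> str:
    """Return '<header>.<payload>', the string that RS256 signs."""
    header = {"alg": "RS256", "typ": "JWT"}
    parts = (json.dumps(header, separators=(",", ":")),
             json.dumps(claims, separators=(",", ":")))
    return ".".join(b64url(p.encode()) for p in parts)


now = int(time.time())
claims = build_app_jwt_claims("123456", now)   # placeholder App ID
unsigned = jwt_signing_input(claims)
# A JWT library would sign `unsigned` with the .pem key and append the
# signature as a third dot-separated segment.
print(claims["exp"] - claims["iat"], unsigned.count("."))
```

Keeping the lifetime at 9 minutes stays under GitHub's 10-minute maximum for app JWTs and leaves some room for clock drift between our server and GitHub.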

	#### 6b. Groq API

	Where: console.groq.com
	What: API key for Llama-3.1-70B inference (14,400 free requests/day)
	Saved as: `GROQ_API_KEY` in `.env`

	#### 6c. Neon.tech Postgres

	Where: console.neon.tech
	What: Serverless Postgres database (512MB free tier)
	Saved as: `DATABASE_URL` in `.env`
	Used for: Storing PR review history, health score trends, finding details

	#### 6d. Upstash Redis

	Where: console.upstash.com
	What: Serverless Redis (10K requests/day free tier)
	Saved as: `UPSTASH_REDIS_URL` in `.env`
	Used for: Caching reviewed commit SHAs to prevent duplicate analysis
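
The duplicate-analysis guard could look roughly like the sketch below. The key format, the 24-hour TTL, and the function name are assumptions, and `InMemoryCache` is a stand-in that imitates only the `set(name, value, ex=..., nx=...)` call of a Redis client so the example runs without a server.

```python
import time
from typing import Optional


class InMemoryCache:
    """Stand-in for a Redis client; mimics SET with the NX and EX options."""

    def __init__(self) -> None:
        self._data: dict = {}

    def set(self, name: str, value: str, ex: Optional[int] = None,
            nx: bool = False) -> Optional[bool]:
        now = time.monotonic()
        entry = self._data.get(name)
        expired = entry is not None and entry[1] is not None and entry[1] <= now
        if nx and entry is not None and not expired:
            return None  # key already present: nothing written
        expires_at = now + ex if ex is not None else None
        self._data[name] = (value, expires_at)
        return True


def should_review(cache, repo: str, commit_sha: str, ttl_seconds: int = 86400) -> bool:
    """Return True only the first time a (repo, commit) pair is seen."""
    key = f"reviewed:{repo}:{commit_sha}"  # hypothetical key format
    # NX makes "check and mark" a single operation, so two webhook
    # deliveries for the same commit cannot both win.
    return cache.set(key, "1", ex=ttl_seconds, nx=True) is True


cache = InMemoryCache()
first = should_review(cache, "ninjacode911/myapp", "4b445f6")
second = should_review(cache, "ninjacode911/myapp", "4b445f6")
other = should_review(cache, "ninjacode911/myapp", "0000000")
print(first, second, other)
```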

	### Step 7: Create Configuration Files

	#### .env.example
	Template showing all required environment variables without actual values.
	Committed to git so new developers know what to configure.

	#### .gitignore
	Prevents sensitive files from being committed:
	- `.env` (contains API keys)
	- `keys/` (contains private key .pem)
	- `__pycache__/`, `.venv/` (generated files)
	- `chroma_data/` (vector store data)
	- `dashboard/node_modules/`, `dashboard/.next/` (Node.js generated)

	#### pyproject.toml
	Project metadata + tool configuration:
	- `[tool.ruff]` — Python linter settings
	- `[tool.pytest]` — Test configuration (asyncio mode, test paths)
	- `[tool.mypy]` — Type checker settings

	#### render.yaml
	Render.com deployment configuration:
	```yaml
	services:
	  - type: web
	    name: ninja-code-guard
	    buildCommand: pip install -r requirements.txt
	    startCommand: uvicorn app.main:app --host 0.0.0.0 --port $PORT
	    healthCheckPath: /health
	    plan: free
	```

	#### sentinel.yml.example
	Per-repo configuration template that users place in their repo root:
	```yaml
	agents:
	  security: true
	  performance: true
	  style: true
	min_severity: low
	min_confidence: 0.6
	exclude:
	  - "vendor/"
	  - "node_modules/"
	```
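
One way these settings might be applied to agent output is sketched below. The severity ordering, the treatment of `exclude` entries as path prefixes, and the `RepoConfig` / `keep_finding` names are all assumptions made for illustration; findings are plain dicts here to keep the example dependency-free.

```python
from dataclasses import dataclass, field

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class RepoConfig:
    # Defaults mirror the template above.
    agents: dict = field(default_factory=lambda: {
        "security": True, "performance": True, "style": True})
    min_severity: str = "low"
    min_confidence: float = 0.6
    exclude: tuple = ("vendor/", "node_modules/")


def keep_finding(cfg: RepoConfig, finding: dict) -> bool:
    """Apply the per-repo filters to one finding."""
    if not cfg.agents.get(finding["agent"], False):
        return False
    if SEVERITY_RANK[finding["severity"]] < SEVERITY_RANK[cfg.min_severity]:
        return False
    if finding["confidence"] < cfg.min_confidence:
        return False
    # Assumption: exclude entries are treated as path prefixes.
    return not any(finding["file_path"].startswith(p) for p in cfg.exclude)


findings = [
    {"agent": "security", "severity": "high", "confidence": 0.9,
     "file_path": "src/auth/login.py"},
    {"agent": "style", "severity": "low", "confidence": 0.4,
     "file_path": "src/utils.py"},        # dropped: confidence below 0.6
    {"agent": "performance", "severity": "medium", "confidence": 0.8,
     "file_path": "vendor/lib.py"},       # dropped: excluded path
]
kept = [f["file_path"] for f in findings if keep_finding(RepoConfig(), f)]
print(kept)
```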

	### Step 8: Set Up CI/CD (GitHub Actions)

	Created two workflows:

	#### ci.yml — Runs on every push/PR
	```yaml
	steps:
	- Lint with ruff (catches style/import issues)
	- Type check with mypy (catches type errors)
	- Run tests with pytest
	```

	#### prewarm.yml — Cron job every 10 minutes on weekdays
	```yaml
	schedule: "*/10 6-20 * * 1-5" # Every 10 min, 06:00-20:50 UTC, Mon-Fri
	steps:
	- curl the /health endpoint to prevent Render cold starts
	```

	Why pre-warm? Render's free tier spins down after 15 minutes of inactivity. The first
	request after spindown takes ~30 seconds (cold start). By pinging /health every 10 minutes
	during working hours, the service stays warm and responds instantly to webhooks.
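
The timing argument can be checked with a few lines of arithmetic. This sketch assumes the job fires at minutes 0, 10, ..., 50 of hours 6 through 20 UTC and uses the 15-minute idle timeout quoted above; it models the schedule only and does not contact Render or GitHub.

```python
from datetime import datetime, timedelta, timezone

IDLE_TIMEOUT = timedelta(minutes=15)   # Render free-tier spin-down, per the text above


def ping_times(day: datetime) -> list:
    """All firings on one day: minutes 0, 10, ..., 50 of hours 6..20."""
    return [day.replace(hour=h, minute=m, second=0, microsecond=0)
            for h in range(6, 21) for m in range(0, 60, 10)]


day = datetime(2026, 3, 19, tzinfo=timezone.utc)   # a Thursday
pings = ping_times(day)
gaps = [later - earlier for earlier, later in zip(pings, pings[1:])]

print(len(pings), pings[0].time(), pings[-1].time())
print(max(gaps) < IDLE_TIMEOUT)
```

Within the window the largest gap is 10 minutes, which leaves a 5-minute margin before spin-down. Scheduled GitHub Actions runs can start late under load, so that margin is not guaranteed, and outside the window (nights and weekends) the first webhook still pays the cold-start cost.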

	### Step 9: Write Initial Tests

	Created: `tests/unit/test_findings_schema.py` — 8 tests for data model validation

	These tests verify:
	- Valid Finding objects are accepted
	- Invalid agent types are rejected
	- Invalid severity levels are rejected
	- Confidence must be between 0.0 and 1.0
	- CWE ID is optional (None allowed)
	- Health score must be 0-100
	- Invalid recommendation values are rejected

	---

	## Files Created in Week 1

	| File | Purpose |
	|------|---------|
	| `app/__init__.py` | Makes app a Python package |
	| `app/config.py` | Centralized configuration via environment variables |
	| `app/main.py` | FastAPI app with /health endpoint (expanded in Week 2) |
	| `app/models/__init__.py` | Models package |
	| `app/models/findings.py` | Finding, SynthesizedReview, PRReviewRecord schemas |
	| `app/models/webhook_payloads.py` | GitHub webhook event payload types |
	| `tests/conftest.py` | Shared test fixtures (sample finding data) |
	| `tests/unit/test_findings_schema.py` | 8 schema validation tests |
	| `.env` | Environment variables (gitignored — contains secrets) |
	| `.env.example` | Template for .env (committed — no secrets) |
	| `.gitignore` | Files to exclude from git |
	| `pyproject.toml` | Project metadata + tool configs |
	| `requirements.txt` | Python production dependencies |
	| `requirements-dev.txt` | Dev/test dependencies |
	| `render.yaml` | Render.com deployment config |
	| `sentinel.yml.example` | Per-repo config template |
	| `.github/workflows/ci.yml` | CI pipeline (lint + test) |
	| `.github/workflows/prewarm.yml` | Render pre-warm cron |
	| `keys/.gitignore` | Prevents .pem files from being committed |
	| `PROJECT_PLAN.md` | Master project plan + progress tracker |

	---

	## Key Decisions Made

	| Decision | Rationale |
	|----------|-----------|
	| Pydantic for all data models | Runtime validation + IDE autocomplete + auto-docs |
	| pydantic-settings for config | Type-safe env vars, auto-loads .env, 12-factor pattern |
	| FastAPI (not Flask/Django) | Async-native (needed for parallel agents), auto OpenAPI docs, modern Python |
	| GitHub App (not Action) | One deployment serves all repos, webhook-driven, own bot identity |
	| Upstash Redis (not in-memory cache) | Persists across Render restarts, shared across workers |
	| Neon.tech (not SQLite) | Serverless, accessible from dashboard, persistent storage |

	---

	Documentation written 2026-03-19 as part of Week 1 completion.