# Week 1: Foundation & Setup - Detailed Documentation
> **Goal:** Project skeleton running locally, all external services provisioned.
> **Status:** Complete
> **Date:** 2026-03-19
---
## What We Accomplished
Week 1 established the entire project foundation: directory structure, configuration system,
data models, external service accounts, CI/CD pipeline, and the initial deployment config.
---
## Step-by-Step Log
### Step 1: Initialize the Project
**What we did:** Created the project directory structure following a modular Python backend
architecture with clear separation of concerns.
**Why this structure matters:**
```
app/            # All backend application code lives here
  agents/       # One file per agent (security, performance, style, synthesizer)
  tools/        # LangChain tool wrappers (semgrep, bandit, radon, etc.)
  context/      # RAG pipeline (embedder -> indexer -> retriever)
  github/       # All GitHub API interaction (webhook, auth, client, formatter)
  models/       # Pydantic data models (Finding, PRReview, webhook payloads)
  db/           # Database & cache (Postgres, Redis)
  services/     # Business logic (orchestrator, health score calculator)
dashboard/      # Next.js frontend (deployed separately to Vercel)
tests/          # Mirrors the app/ structure (unit/, integration/, eval/)
prompts/        # Agent system prompts as Markdown files
knowledge/      # RAG knowledge bases (OWASP, DDIA, style guides)
docs/           # Project documentation (this file)
```
**Key principle:** Each directory has a single responsibility. The `agents/` folder doesn't
know about GitHub. The `github/` folder doesn't know about LangChain. The `services/`
folder orchestrates between them. This is called **separation of concerns**, and it makes the
code testable, maintainable, and easy to explain in interviews.
**Commands run:**
```bash
# Create all directories
mkdir -p app/{agents,tools,context,github,models,db,services}
mkdir -p dashboard/{app/{repos,api},components,lib}
mkdir -p tests/{unit,integration,eval/dataset}
mkdir -p prompts knowledge/style_guides
# Create __init__.py files (makes directories Python packages)
touch app/__init__.py app/agents/__init__.py app/tools/__init__.py ...
# Initialize git
git init && git branch -m main
```
### Step 2: Create Configuration System (app/config.py)
**What we did:** Created a centralized configuration file using `pydantic-settings`.
**How it works:**
```python
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    groq_api_key: str = ""
    github_app_id: str = ""
    # ... all config vars

    model_config = {"env_file": ".env"}


settings = Settings()  # Singleton: imported everywhere
```
**Why pydantic-settings instead of plain os.environ?**
1. **Type safety**: `confidence_threshold: float = 0.6` ensures it's a float, not a string
2. **Validation**: pydantic raises clear errors if required vars are missing
3. **Defaults**: each setting has a sensible default for development
4. **Auto-loads .env**: reads from the `.env` file automatically (via `model_config`)
5. **IDE autocomplete**: `settings.groq_api_key` instead of `os.environ.get("GROQ_API_KEY")`
**Interview talking point:** "We use pydantic-settings for type-safe configuration management
following the 12-factor app methodology: config lives in environment variables, not in code.
This makes the same codebase work in development, staging, and production with zero code changes."
### Step 3: Define Data Models (app/models/findings.py)
**What we did:** Created Pydantic models that define the exact shape of data flowing through
the system.
**Three core models:**
#### Finding: Output of each domain agent
```python
from typing import Literal, Optional

from pydantic import BaseModel


class Finding(BaseModel):
    agent: Literal["security", "performance", "style"]  # Which agent found this
    file_path: str  # e.g. "src/auth/login.py"
    line_start: int  # Where the issue starts
    line_end: int  # Where the issue ends
    severity: Literal["critical", "high", "medium", "low"]  # How bad is it
    category: str  # e.g. "sql_injection", "n+1_query"
    title: str  # One-liner for the inline comment header
    description: str  # Full explanation
    suggested_fix: str  # Corrected code snippet
    cwe_id: Optional[str]  # CWE ID for security findings (e.g. "CWE-89")
    confidence: float  # 0.0-1.0, how sure the agent is
```
#### SynthesizedReview: Output of the Synthesizer Agent
```python
class SynthesizedReview(BaseModel):
    health_score: int  # 0-100 (the headline metric)
    executive_summary: str  # 3-5 sentences for PR description
    recommendation: Literal["approve", "request_changes", "block"]
    findings: list[Finding]  # Deduplicated, re-ranked findings
    critical_count: int  # Counts by severity
    # ...
```
#### PRReviewRecord: What gets stored in Postgres
```python
from uuid import UUID


class PRReviewRecord(BaseModel):
    id: UUID  # Primary key
    repo_full_name: str  # "ninjacode911/myapp"
    pr_number: int
    commit_sha: str
    health_score: int
    findings: list[Finding]  # Full findings as JSONB
    duration_ms: int  # How long the review took
```
**Why Pydantic models instead of plain dicts?**
1. **Validation**: `severity: Literal["critical", "high", "medium", "low"]` rejects invalid values
2. **Serialization**: `.model_dump()` converts to dict, `.model_dump_json()` to JSON
3. **Documentation**: the schema IS the documentation
4. **Type checking**: mypy catches bugs at development time, not in production
**Interview talking point:** "Every data boundary in the system uses Pydantic models: agent
outputs, API responses, database records. This gives us runtime validation, IDE autocomplete,
and auto-generated OpenAPI docs. If an agent returns malformed JSON, Pydantic catches it
immediately instead of letting bad data propagate through the pipeline."
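To illustrate how malformed agent output gets stopped at the boundary, here is a minimal sketch assuming Pydantic v2. `MiniFinding` and `parse_finding` are hypothetical names, and the `Field(ge=0.0, le=1.0)` bound and the `None` default for `cwe_id` are assumptions about one way to enforce the rules, not necessarily how `findings.py` does it.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class MiniFinding(BaseModel):
    """Hypothetical subset of Finding, just enough to show validation."""

    agent: Literal["security", "performance", "style"]
    severity: Literal["critical", "high", "medium", "low"]
    title: str
    cwe_id: Optional[str] = None               # assumption: defaults to None
    confidence: float = Field(ge=0.0, le=1.0)  # assumption: bounds via Field


def parse_finding(raw: dict) -> Optional[MiniFinding]:
    """Return a validated finding, or None if the agent output is malformed."""
    try:
        return MiniFinding(**raw)
    except ValidationError:
        return None


good = parse_finding(
    {"agent": "security", "severity": "high", "title": "SQL injection", "confidence": 0.9}
)
bad = parse_finding(
    {"agent": "security", "severity": "urgent", "title": "SQL injection", "confidence": 1.7}
)
```

The second payload fails on both the unknown severity and the out-of-range confidence, so `parse_finding` returns `None` rather than passing it downstream.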
### Step 4: Define Webhook Payload Models (app/models/webhook_payloads.py)
**What we did:** Created typed models for GitHub's webhook JSON payloads.
**Why type the webhook payload?**
GitHub sends complex nested JSON. Without types, you'd write:
```python
sha = payload["pull_request"]["head"]["sha"] # Easy to typo, no autocomplete
```
With Pydantic models:
```python
event = PullRequestEvent(**payload)
sha = event.pull_request.head.sha # Autocomplete, type-checked
```
We didn't use these models in the final webhook handler (we used raw dict access for
simplicity), but they're available for stricter validation later.
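A minimal sketch of what such nested models could look like, assuming Pydantic v2. Only `PullRequestEvent` is named in this document; `Head`, `PullRequest`, and the sample payload are hypothetical and cover just a few of the fields GitHub sends.

```python
from pydantic import BaseModel


class Head(BaseModel):
    sha: str


class PullRequest(BaseModel):
    number: int
    head: Head


class PullRequestEvent(BaseModel):
    action: str
    pull_request: PullRequest


payload = {
    "action": "opened",
    "pull_request": {"number": 1, "head": {"sha": "abc123", "ref": "feature"}},
    "sender": {"login": "octocat"},  # unmodelled keys are ignored by default
}

event = PullRequestEvent(**payload)
sha = event.pull_request.head.sha
```

Because Pydantic ignores extra keys by default, the models only need to declare the fields the handler actually reads.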
### Step 5: Create FastAPI Skeleton (app/main.py)
**What we did:** Created the FastAPI application with a `/health` endpoint.
```python
from fastapi import FastAPI

app = FastAPI(title="Ninja Code Guard", version="0.1.0")


@app.get("/health")
async def health_check():
    return {"status": "ok", "service": "Ninja Code Guard", "version": "0.1.0"}
```
**Why a /health endpoint?**
- **Render.com** uses it to know if your service is alive (configured in render.yaml)
- **GitHub Actions cron** pings it every 10 minutes to prevent cold starts
- **The dashboard** calls it to show service status
- **Load balancers** (if you scale up) use it to route traffic only to healthy instances
### Step 6: Provision External Services
**What we did:** Created accounts and obtained credentials for all external services.
#### 6a. GitHub App: "Ninja's Code Guard"
**Where:** github.com/settings/apps/new
**What we configured:**
| Setting | Value | Reason |
|---------|-------|--------|
| Name | Ninja Code Guard | Bot identity: `ninjas-code-guard[bot]` |
| Homepage URL | github.com/ninjacode911/codeprobe | Points to our repo |
| Webhook Active | Yes | We need to receive PR events |
| Webhook Secret | (generated with `python -c "import secrets; print(secrets.token_hex(32))"`) | HMAC authentication |
| Contents | Read | Fetch full file source code for RAG context |
| Pull requests | Read & Write | Read diffs, post review comments |
| Commit statuses | Write | Show health score as commit status check |
| Metadata | Read | Required; basic repo info |
| Events | pull_request, pull_request_review_comment | Our trigger events |
| Install target | Only this account | Dev-mode only for now |
**What we got:**
- App ID: 3133457
- Private Key: `.pem` file saved to `keys/ninja-s-code-guard.2026-03-19.private-key.pem`
- Webhook Secret: saved to `.env`
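The webhook secret is what lets the backend confirm that a delivery really came from GitHub. GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal standard-library sketch of that check follows; `verify_signature` is a hypothetical helper name, and the secret and body are placeholders.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels
    return hmac.compare_digest(expected, signature_header)


secret = "example-secret"  # placeholder, not a real secret
body = b'{"action": "opened"}'
header = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

print(verify_signature(secret, body, header))
```

The check has to run on the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace and break the digest.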
**How GitHub App authentication works (important concept):**
```
Step 1: Sign a JWT with our private key (.pem)
    JWT payload = {iss: APP_ID, iat: now, exp: now+9min}
    Signed with RS256 (RSA + SHA-256)
    This proves: "I am the Ninja Code Guard app"

Step 2: Exchange JWT for an installation access token
    POST /app/installations/{id}/access_tokens
    Headers: Authorization: Bearer <JWT>
    Returns: token valid for 1 hour, scoped to installed repos
    This proves: "I can access ninjacode911's repos"

Step 3: Use installation token for all API calls
    GET /repos/ninjacode911/codeguard-test/pulls/1
    Headers: Authorization: token <installation_token>
```
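Step 1 of the flow above can be sketched with the standard library alone, up to the point of signing. The snippet builds the `<header>.<payload>` string that RS256 would sign. The signature itself needs an RSA-capable library (PyJWT is one option) and is deliberately left out here. `build_signing_input` is a hypothetical helper, and the app ID is a placeholder.

```python
import base64
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(app_id: str, now: int) -> str:
    """Return '<header>.<payload>', the string that gets signed with RS256."""
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {"iss": app_id, "iat": now, "exp": now + 9 * 60}  # 9-minute lifetime
    return ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )


signing_input = build_signing_input("123456", int(time.time()))
print(signing_input.count("."))
```

The finished JWT is this string plus a third dot-separated segment holding the base64url-encoded RSA signature.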
#### 6b. Groq API
**Where:** console.groq.com
**What:** API key for Llama-3.1-70B inference (14,400 free requests/day)
**Saved as:** `GROQ_API_KEY` in `.env`
#### 6c. Neon.tech Postgres
**Where:** console.neon.tech
**What:** Serverless Postgres database (512MB free tier)
**Saved as:** `DATABASE_URL` in `.env`
**Used for:** Storing PR review history, health score trends, finding details
#### 6d. Upstash Redis
**Where:** console.upstash.com
**What:** Serverless Redis (10K requests/day free tier)
**Saved as:** `UPSTASH_REDIS_URL` in `.env`
**Used for:** Caching reviewed commit SHAs to prevent duplicate analysis
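The duplicate-analysis guard can be sketched as a "set if not exists" operation. The example below uses an in-memory stand-in so it runs without a Redis server. `InMemoryStore`, `should_review`, the key format, and the 24-hour TTL are all assumptions for illustration; the stand-in mimics only the `nx` behaviour of a Redis `SET`.

```python
from typing import Optional


class InMemoryStore:
    """Stand-in for a Redis client; mimics SET with the NX option only."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def set(
        self, name: str, value: str, ex: Optional[int] = None, nx: bool = False
    ) -> Optional[bool]:
        if nx and name in self._data:
            return None           # key already present, nothing written
        self._data[name] = value  # expiry (ex) is accepted but not simulated
        return True


def should_review(store, repo: str, commit_sha: str, ttl_seconds: int = 86_400) -> bool:
    """Return True only the first time a (repo, commit) pair is seen."""
    key = f"reviewed:{repo}:{commit_sha}"  # hypothetical key format
    return bool(store.set(key, "1", ex=ttl_seconds, nx=True))


store = InMemoryStore()
first = should_review(store, "owner/repo", "abc123")
second = should_review(store, "owner/repo", "abc123")
print(first, second)
```

Doing the check and the write in one call avoids a race in which two webhook deliveries for the same commit both pass a separate "exists?" check.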
### Step 7: Create Configuration Files
#### .env.example
Template showing all required environment variables without actual values.
Committed to git so new developers know what to configure.
#### .gitignore
Prevents sensitive files from being committed:
- `.env` (contains API keys)
- `keys/` (contains private key .pem)
- `__pycache__/`, `.venv/` (generated files)
- `chroma_data/` (vector store data)
- `dashboard/node_modules/`, `dashboard/.next/` (Node.js generated)
#### pyproject.toml
Project metadata + tool configuration:
- `[tool.ruff]`: Python linter settings
- `[tool.pytest]`: Test configuration (asyncio mode, test paths)
- `[tool.mypy]`: Type checker settings
#### render.yaml
Render.com deployment configuration:
```yaml
services:
  - type: web
    name: ninja-code-guard
    buildCommand: pip install -r requirements.txt
    startCommand: uvicorn app.main:app --host 0.0.0.0 --port $PORT
    healthCheckPath: /health
    plan: free
```
#### sentinel.yml.example
Per-repo configuration template that users place in their repo root:
```yaml
agents:
  security: true
  performance: true
  style: true
min_severity: low
min_confidence: 0.6
exclude:
  - "vendor/"
  - "node_modules/"
```
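How these settings might be applied to findings is sketched below. The filtering semantics are assumptions, since this document does not spell them out: path prefixes in `exclude` drop a finding, `min_severity` is a floor on the critical/high/medium/low scale, and `min_confidence` is a floor on the agent's confidence. `RepoConfig` and `keep_finding` are hypothetical names.

```python
from dataclasses import dataclass

# Ordering taken from the severity levels used by the Finding model
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class RepoConfig:
    """Hypothetical in-memory form of the per-repo config file."""

    min_severity: str = "low"
    min_confidence: float = 0.6
    exclude: tuple[str, ...] = ("vendor/", "node_modules/")


def keep_finding(finding: dict, config: RepoConfig) -> bool:
    """Apply the assumed filtering rules to a single finding."""
    if any(finding["file_path"].startswith(prefix) for prefix in config.exclude):
        return False
    if SEVERITY_RANK[finding["severity"]] < SEVERITY_RANK[config.min_severity]:
        return False
    return finding["confidence"] >= config.min_confidence


findings = [
    {"file_path": "src/auth/login.py", "severity": "high", "confidence": 0.9},
    {"file_path": "vendor/lib.py", "severity": "critical", "confidence": 0.95},
    {"file_path": "src/utils.py", "severity": "low", "confidence": 0.4},
]
kept = [f for f in findings if keep_finding(f, RepoConfig())]
print([f["file_path"] for f in kept])
```

With the default values, the vendored file is excluded by path and the low-confidence finding falls under the 0.6 floor.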
### Step 8: Set Up CI/CD (GitHub Actions)
**Created two workflows:**
#### ci.yml: Runs on every push/PR
```yaml
steps:
- Lint with ruff (catches style/import issues)
- Type check with mypy (catches type errors)
- Run tests with pytest
```
#### prewarm.yml: Cron job every 10 minutes on weekdays
```yaml
schedule: "*/10 6-20 * * 1-5" # Every 10 min, 06:00-20:50 UTC, Mon-Fri
steps:
- curl the /health endpoint to prevent Render cold starts
```
**Why pre-warm?** Render's free tier spins down after 15 minutes of inactivity. The first
request after spindown takes ~30 seconds (cold start). By pinging /health every 10 minutes
during working hours, the service stays warm and responds instantly to webhooks.
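To make the schedule concrete, the cron expression `*/10 6-20 * * 1-5` can be restated as a small predicate. This is a sketch for reasoning about the window, not part of the workflow itself, and `is_prewarm_tick` is a hypothetical name. GitHub Actions evaluates cron schedules in UTC.

```python
from datetime import datetime, timezone


def is_prewarm_tick(moment: datetime) -> bool:
    """True if `moment` matches the cron expression */10 6-20 * * 1-5 (UTC)."""
    utc = moment.astimezone(timezone.utc)
    return (
        utc.minute % 10 == 0     # */10: minutes 0, 10, ..., 50
        and 6 <= utc.hour <= 20  # 6-20: the hour range is inclusive
        and utc.weekday() < 5    # 1-5: Monday (0) to Friday (4) in Python
    )


thursday_evening = datetime(2026, 3, 19, 20, 50, tzinfo=timezone.utc)
print(is_prewarm_tick(thursday_evening))
```

Because the hour range is inclusive, the last ping on a weekday is at 20:50 UTC, and a 10-minute interval stays inside Render's 15-minute spin-down threshold.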
### Step 9: Write Initial Tests
**Created:** `tests/unit/test_findings_schema.py` with 8 tests for data model validation
These tests verify:
- Valid Finding objects are accepted
- Invalid agent types are rejected
- Invalid severity levels are rejected
- Confidence must be between 0.0 and 1.0
- CWE ID is optional (None allowed)
- Health score must be 0-100
- Invalid recommendation values are rejected
---
## Files Created in Week 1
| File | Purpose |
|------|---------|
| `app/__init__.py` | Makes app a Python package |
| `app/config.py` | Centralized configuration via environment variables |
| `app/main.py` | FastAPI app with /health endpoint (expanded in Week 2) |
| `app/models/__init__.py` | Models package |
| `app/models/findings.py` | Finding, SynthesizedReview, PRReviewRecord schemas |
| `app/models/webhook_payloads.py` | GitHub webhook event payload types |
| `tests/conftest.py` | Shared test fixtures (sample finding data) |
| `tests/unit/test_findings_schema.py` | 8 schema validation tests |
| `.env` | Environment variables (gitignored; contains secrets) |
| `.env.example` | Template for .env (committed; no secrets) |
| `.gitignore` | Files to exclude from git |
| `pyproject.toml` | Project metadata + tool configs |
| `requirements.txt` | Python production dependencies |
| `requirements-dev.txt` | Dev/test dependencies |
| `render.yaml` | Render.com deployment config |
| `sentinel.yml.example` | Per-repo config template |
| `.github/workflows/ci.yml` | CI pipeline (lint + test) |
| `.github/workflows/prewarm.yml` | Render pre-warm cron |
| `keys/.gitignore` | Prevents .pem files from being committed |
| `PROJECT_PLAN.md` | Master project plan + progress tracker |
---
## Key Decisions Made
| Decision | Rationale |
|----------|-----------|
| Pydantic for all data models | Runtime validation + IDE autocomplete + auto-docs |
| pydantic-settings for config | Type-safe env vars, auto-loads .env, 12-factor pattern |
| FastAPI (not Flask/Django) | Async-native (needed for parallel agents), auto OpenAPI docs, modern Python |
| GitHub App (not Action) | One deployment serves all repos, webhook-driven, own bot identity |
| Upstash Redis (not in-memory cache) | Persists across Render restarts, shared across workers |
| Neon.tech (not SQLite) | Serverless, accessible from dashboard, persistent storage |
---
*Documentation written 2026-03-19 as part of Week 1 completion.*