# Week 1: Foundation & Setup - Detailed Documentation

> **Goal:** Project skeleton running locally, all external services provisioned.
> **Status:** Complete
> **Date:** 2026-03-19

---

## What We Accomplished

Week 1 established the entire project foundation: directory structure, configuration system,
data models, external service accounts, CI/CD pipeline, and the initial deployment config.

---

## Step-by-Step Log

### Step 1: Initialize the Project

**What we did:** Created the project directory structure following a modular Python backend
architecture with clear separation of concerns.

**Why this structure matters:**
```
app/                    ← All backend application code lives here
  agents/               ← One file per agent (security, performance, style, synthesizer)
  tools/                ← LangChain tool wrappers (semgrep, bandit, radon, etc.)
  context/              ← RAG pipeline (embedder → indexer → retriever)
  github/               ← All GitHub API interaction (webhook, auth, client, formatter)
  models/               ← Pydantic data models (Finding, PRReview, webhook payloads)
  db/                   ← Database & cache (Postgres, Redis)
  services/             ← Business logic (orchestrator, health score calculator)
dashboard/              ← Next.js frontend (deployed separately to Vercel)
tests/                  ← Mirrors the app/ structure (unit/, integration/, eval/)
prompts/                ← Agent system prompts as Markdown files
knowledge/              ← RAG knowledge bases (OWASP, DDIA, style guides)
docs/                   ← Project documentation (this file)
```

**Key principle:** Each directory has a single responsibility. The `agents/` folder doesn't
know about GitHub. The `github/` folder doesn't know about LangChain. The `services/`
folder orchestrates between them. This is called **separation of concerns**, and it makes the
code testable, maintainable, and easy to explain in interviews.

**Commands run:**
```bash
# Create all directories
mkdir -p app/{agents,tools,context,github,models,db,services}
mkdir -p dashboard/{app/{repos,api},components,lib}
mkdir -p tests/{unit,integration,eval/dataset}
mkdir -p prompts knowledge/style_guides

# Create __init__.py files (makes directories Python packages)
touch app/__init__.py app/agents/__init__.py app/tools/__init__.py ...

# Initialize git
git init && git branch -m main
```

### Step 2: Create Configuration System (app/config.py)

**What we did:** Created a centralized configuration file using `pydantic-settings`.

**How it works:**
```python
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    groq_api_key: str = ""
    github_app_id: str = ""
    # ... all config vars

    model_config = {"env_file": ".env"}

settings = Settings()  # Singleton, imported everywhere
```

**Why pydantic-settings instead of plain os.environ?**
1. **Type safety**: `confidence_threshold: float = 0.6` ensures it's a float, not a string
2. **Validation**: pydantic raises clear errors if required vars are missing
3. **Defaults**: each setting has a sensible default for development
4. **Auto-loads .env**: reads from `.env` file automatically (via `model_config`)
5. **IDE autocomplete**: `settings.groq_api_key` instead of `os.environ.get("GROQ_API_KEY")`

**Interview talking point:** "We use pydantic-settings for type-safe configuration management
following the 12-factor app methodology: config lives in environment variables, not in code.
This makes the same codebase work in development, staging, and production with zero code changes."

### Step 3: Define Data Models (app/models/findings.py)

**What we did:** Created Pydantic models that define the exact shape of data flowing through
the system.

**Three core models:**

#### Finding: Output of each domain agent
```python
from typing import Literal, Optional

from pydantic import BaseModel

class Finding(BaseModel):
    agent: Literal["security", "performance", "style"]  # Which agent found this
    file_path: str              # e.g. "src/auth/login.py"
    line_start: int             # Where the issue starts
    line_end: int               # Where the issue ends
    severity: Literal["critical", "high", "medium", "low"]  # How bad is it
    category: str               # e.g. "sql_injection", "n+1_query"
    title: str                  # One-liner for the inline comment header
    description: str            # Full explanation
    suggested_fix: str          # Corrected code snippet
    cwe_id: Optional[str]       # CWE ID for security findings (e.g. "CWE-89")
    confidence: float           # 0.0-1.0, how sure the agent is
```

#### SynthesizedReview: Output of the Synthesizer Agent
```python
class SynthesizedReview(BaseModel):
    health_score: int           # 0-100 (the headline metric)
    executive_summary: str      # 3-5 sentences for PR description
    recommendation: Literal["approve", "request_changes", "block"]
    findings: list[Finding]     # Deduplicated, re-ranked findings
    critical_count: int         # Counts by severity
    # ...
```

#### PRReviewRecord: What gets stored in Postgres
```python
from uuid import UUID

class PRReviewRecord(BaseModel):
    id: UUID                    # Primary key
    repo_full_name: str         # "ninjacode911/myapp"
    pr_number: int
    commit_sha: str
    health_score: int
    findings: list[Finding]     # Full findings as JSONB
    duration_ms: int            # How long the review took
```

**Why Pydantic models instead of plain dicts?**
1. **Validation**: `severity: Literal["critical", "high", "medium", "low"]` rejects invalid values
2. **Serialization**: `.model_dump()` converts to dict, `.model_dump_json()` to JSON
3. **Documentation**: the schema IS the documentation
4. **Type checking**: mypy catches bugs at development time, not production

**Interview talking point:** "Every data boundary in the system uses Pydantic models: agent
outputs, API responses, database records. This gives us runtime validation, IDE autocomplete,
and auto-generated OpenAPI docs. If an agent returns malformed JSON, Pydantic catches it
immediately instead of letting bad data propagate through the pipeline."
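
To illustrate how malformed agent output is caught at the boundary, here is a minimal sketch using a trimmed-down stand-in for `Finding`. The `DemoFinding` name, the `Field(ge=0.0, le=1.0)` bounds, the `None` default for `cwe_id`, and the sample JSON are assumptions for the example; the real model may enforce its constraints differently.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class DemoFinding(BaseModel):
    # Trimmed-down stand-in for Finding; the Field bounds are an assumption.
    agent: Literal["security", "performance", "style"]
    severity: Literal["critical", "high", "medium", "low"]
    title: str
    cwe_id: Optional[str] = None
    confidence: float = Field(ge=0.0, le=1.0)


# Hypothetical agent output with an invalid severity and an out-of-range confidence.
raw_agent_output = (
    '{"agent": "security", "severity": "urgent", '
    '"title": "SQL injection", "confidence": 1.4}'
)

try:
    DemoFinding.model_validate_json(raw_agent_output)
    error_fields = []
except ValidationError as exc:
    # Each error names the offending field, so bad data stops at the boundary.
    error_fields = sorted(err["loc"][0] for err in exc.errors())

print(error_fields)

# A well-formed finding passes and serializes back to a plain dict.
valid = DemoFinding(agent="security", severity="high", title="SQL injection", confidence=0.9)
print(valid.model_dump())
```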

### Step 4: Define Webhook Payload Models (app/models/webhook_payloads.py)

**What we did:** Created typed models for GitHub's webhook JSON payloads.

**Why type the webhook payload?**
GitHub sends complex nested JSON. Without types, you'd write:
```python
sha = payload["pull_request"]["head"]["sha"]  # Easy to typo, no autocomplete
```
With Pydantic models:
```python
event = PullRequestEvent(**payload)
sha = event.pull_request.head.sha  # Autocomplete, type-checked
```

We didn't use these models in the final webhook handler (we used raw dict access for
simplicity), but they're available for stricter validation later.
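
As a rough sketch of what such typed payload models could look like: only `PullRequestEvent` and the `pull_request.head.sha` path come from the text above. The `Head` and `PullRequest` class names, the chosen fields, and the sample payload are assumptions, and the real `webhook_payloads.py` may differ.

```python
from pydantic import BaseModel, ValidationError


class Head(BaseModel):
    sha: str


class PullRequest(BaseModel):
    number: int
    head: Head


class PullRequestEvent(BaseModel):
    action: str
    pull_request: PullRequest


# Heavily abbreviated sample payload; real GitHub payloads carry many more keys.
payload = {
    "action": "opened",
    "pull_request": {"number": 1, "head": {"sha": "abc123", "ref": "feature"}},
    "sender": {"login": "octocat"},
}

# Keys that the models do not declare are ignored by default.
event = PullRequestEvent(**payload)
print(event.pull_request.head.sha)

# A payload missing a required nested key fails with the exact path to the gap.
try:
    PullRequestEvent(**{"action": "opened", "pull_request": {"number": 1}})
    missing = None
except ValidationError as exc:
    missing = exc.errors()[0]["loc"]
print(missing)
```

The trade-off is that every field the handler relies on has to be declared up front, which is the extra work that raw dict access avoids.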

### Step 5: Create FastAPI Skeleton (app/main.py)

**What we did:** Created the FastAPI application with a `/health` endpoint.

```python
from fastapi import FastAPI

app = FastAPI(title="Ninja Code Guard", version="0.1.0")

@app.get("/health")
async def health_check():
    return {"status": "ok", "service": "Ninja Code Guard", "version": "0.1.0"}
```

**Why a /health endpoint?**
- **Render.com** uses it to know if your service is alive (configured in render.yaml)
- **GitHub Actions cron** pings it every 10 minutes to prevent cold starts
- **The dashboard** calls it to show service status
- **Load balancers** (if you scale up) use it to route traffic only to healthy instances

### Step 6: Provision External Services

**What we did:** Created accounts and obtained credentials for all external services.

#### 6a. GitHub App: "Ninja's Code Guard"

**Where:** github.com/settings/apps/new

**What we configured:**
| Setting | Value | Reason |
|---------|-------|--------|
| Name | Ninja Code Guard | Bot identity: `ninjas-code-guard[bot]` |
| Homepage URL | github.com/ninjacode911/codeprobe | Points to our repo |
| Webhook Active | Yes | We need to receive PR events |
| Webhook Secret | (generated with `python -c "import secrets; print(secrets.token_hex(32))"`) | HMAC authentication |
| Contents | Read | Fetch full file source code for RAG context |
| Pull requests | Read & Write | Read diffs, post review comments |
| Commit statuses | Write | Show health score as commit status check |
| Metadata | Read | Required; basic repo info |
| Events | pull_request, pull_request_review_comment | Our trigger events |
| Install target | Only this account | Dev-mode only for now |

**What we got:**
- App ID: 3133457
- Private Key: `.pem` file saved to `keys/ninja-s-code-guard.2026-03-19.private-key.pem`
- Webhook Secret: saved to `.env`
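
The webhook secret is what makes the "HMAC authentication" row in the table work: GitHub signs each delivery's raw body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal verification sketch follows; the `verify_signature` helper, the secret, and the body are placeholders, not the project's handler.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw body."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected, signature_header)


secret = "example-webhook-secret"  # placeholder, not a real secret
body = b'{"action": "opened"}'
good_header = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

print(verify_signature(secret, body, good_header))         # signature matches
print(verify_signature(secret, body + b" ", good_header))  # body was altered
```

The check has to run against the raw request bytes, since re-serializing parsed JSON can change whitespace and break the digest.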

**How GitHub App authentication works (important concept):**
```
Step 1: Sign a JWT with our private key (.pem)
        JWT payload = {iss: APP_ID, iat: now, exp: now+9min}
        Signed with RS256 (RSA + SHA-256)
        This proves: "I am the Ninja Code Guard app"

Step 2: Exchange JWT for an installation access token
        POST /app/installations/{id}/access_tokens
        Headers: Authorization: Bearer <JWT>
        Returns: token valid for 1 hour, scoped to installed repos
        This proves: "I can access ninjacode911's repos"

Step 3: Use installation token for all API calls
        GET /repos/ninjacode911/codeguard-test/pulls/1
        Headers: Authorization: token <installation_token>
```
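
Step 1 of the flow above can be sketched with the standard library alone, up to the point of signing. The example builds the claims and the `<header>.<payload>` string that RS256 would sign; the RSA signature itself is omitted, since it needs a crypto library (PyJWT or `cryptography` would typically handle it). The `build_signing_input` helper and the app ID `"123456"` are placeholders.

```python
import base64
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(app_id: str, now: int) -> str:
    """Return the '<header>.<payload>' string that RS256 would sign."""
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {"iss": app_id, "iat": now, "exp": now + 9 * 60}
    return ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )


signing_input = build_signing_input(app_id="123456", now=int(time.time()))
header_b64, payload_b64 = signing_input.split(".")

# Decode the payload again to confirm the 9-minute lifetime.
padded = payload_b64 + "=" * (-len(payload_b64) % 4)
claims = json.loads(base64.urlsafe_b64decode(padded))
print(claims["exp"] - claims["iat"])  # lifetime in seconds
```

The short lifetime matters because the JWT is only used once, to obtain the one-hour installation token in Step 2.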

#### 6b. Groq API

**Where:** console.groq.com
**What:** API key for Llama-3.1-70B inference (14,400 free requests/day)
**Saved as:** `GROQ_API_KEY` in `.env`

#### 6c. Neon.tech Postgres

**Where:** console.neon.tech
**What:** Serverless Postgres database (512MB free tier)
**Saved as:** `DATABASE_URL` in `.env`
**Used for:** Storing PR review history, health score trends, finding details

#### 6d. Upstash Redis

**Where:** console.upstash.com
**What:** Serverless Redis (10K requests/day free tier)
**Saved as:** `UPSTASH_REDIS_URL` in `.env`
**Used for:** Caching reviewed commit SHAs to prevent duplicate analysis
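
The duplicate-analysis check can be sketched with an in-memory stand-in for Redis. In Redis itself this would typically be a single `SET key value NX EX ttl` call; the `SeenCommits` class, the `reviewed:{repo}:{sha}` key format, and the one-hour TTL are assumptions for illustration.

```python
import time
from typing import Dict, Optional


class SeenCommits:
    """In-memory stand-in for a Redis key-with-TTL cache (illustrative only)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._expires_at: Dict[str, float] = {}

    def mark_if_new(self, repo: str, sha: str, now: Optional[float] = None) -> bool:
        """Return True the first time a commit is seen, False on repeats."""
        now = time.monotonic() if now is None else now
        key = f"reviewed:{repo}:{sha}"  # hypothetical key format
        expires = self._expires_at.get(key)
        if expires is not None and expires > now:
            return False  # already reviewed, skip duplicate analysis
        self._expires_at[key] = now + self.ttl_seconds
        return True


cache = SeenCommits(ttl_seconds=3600)
first = cache.mark_if_new("ninjacode911/codeguard-test", "abc123", now=0.0)
repeat = cache.mark_if_new("ninjacode911/codeguard-test", "abc123", now=10.0)
after_expiry = cache.mark_if_new("ninjacode911/codeguard-test", "abc123", now=4000.0)
print(first, repeat, after_expiry)
```

A process-local dictionary like this one would be lost on every restart, which is the reason for using a hosted Redis instead.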

### Step 7: Create Configuration Files

#### .env.example
Template showing all required environment variables without actual values.
Committed to git so new developers know what to configure.

#### .gitignore
Prevents sensitive files from being committed:
- `.env` (contains API keys)
- `keys/` (contains private key .pem)
- `__pycache__/`, `.venv/` (generated files)
- `chroma_data/` (vector store data)
- `dashboard/node_modules/`, `dashboard/.next/` (Node.js generated)

#### pyproject.toml
Project metadata + tool configuration:
- `[tool.ruff]`: Python linter settings
- `[tool.pytest]`: Test configuration (asyncio mode, test paths)
- `[tool.mypy]`: Type checker settings

#### render.yaml
Render.com deployment configuration:
```yaml
services:
  - type: web
    name: ninja-code-guard
    buildCommand: pip install -r requirements.txt
    startCommand: uvicorn app.main:app --host 0.0.0.0 --port $PORT
    healthCheckPath: /health
    plan: free
```

#### sentinel.yml.example
Per-repo configuration template that users place in their repo root:
```yaml
agents:
  security: true
  performance: true
  style: true
min_severity: low
min_confidence: 0.6
exclude:
  - "vendor/"
  - "node_modules/"
```
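
One plausible way these settings could be applied to a list of findings is sketched below. The `FindingStub` dataclass, the `apply_repo_config` helper, and the prefix-match reading of `exclude` are assumptions; the project's actual filtering logic is not shown in this document.

```python
from dataclasses import dataclass
from typing import List

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class FindingStub:
    # Minimal stand-in for Finding with only the fields the filter needs.
    file_path: str
    severity: str
    confidence: float


def apply_repo_config(
    findings: List[FindingStub],
    min_severity: str,
    min_confidence: float,
    exclude: List[str],
) -> List[FindingStub]:
    """Keep findings that pass the path, severity, and confidence filters."""
    kept = []
    for finding in findings:
        if any(finding.file_path.startswith(prefix) for prefix in exclude):
            continue  # path is under an excluded directory
        if SEVERITY_RANK[finding.severity] < SEVERITY_RANK[min_severity]:
            continue  # below the severity floor
        if finding.confidence < min_confidence:
            continue  # agent was not confident enough
        kept.append(finding)
    return kept


findings = [
    FindingStub("src/auth/login.py", "high", 0.9),
    FindingStub("vendor/lib.py", "critical", 0.95),
    FindingStub("src/utils.py", "low", 0.4),
]
kept = apply_repo_config(
    findings, min_severity="low", min_confidence=0.6, exclude=["vendor/", "node_modules/"]
)
print([f.file_path for f in kept])
```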

### Step 8: Set Up CI/CD (GitHub Actions)

**Created two workflows:**

#### ci.yml: Runs on every push/PR
```yaml
steps:
  - Lint with ruff (catches style/import issues)
  - Type check with mypy (catches type errors)
  - Run tests with pytest
```

#### prewarm.yml: Cron job every 10 minutes on weekdays
```yaml
schedule: "*/10 6-20 * * 1-5"  # Every 10 min, 06:00-20:50 UTC, Mon-Fri
steps:
  - curl the /health endpoint to prevent Render cold starts
```

**Why pre-warm?** Render's free tier spins down after 15 minutes of inactivity. The first
request after spindown takes ~30 seconds (cold start). By pinging /health every 10 minutes
during working hours, the service stays warm and responds instantly to webhooks.
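
To make the schedule concrete, the cron expression `*/10 6-20 * * 1-5` can be restated as a small predicate. The `prewarm_fires_at` helper is a hypothetical illustration and not part of the workflow.

```python
from datetime import datetime, timezone


def prewarm_fires_at(moment: datetime) -> bool:
    """Return True if '*/10 6-20 * * 1-5' would fire at this UTC minute."""
    utc = moment.astimezone(timezone.utc)
    on_ten_minute_mark = utc.minute % 10 == 0
    in_hour_window = 6 <= utc.hour <= 20  # the hour field is inclusive on both ends
    is_weekday = utc.weekday() < 5  # Monday is 0, Friday is 4
    return on_ten_minute_mark and in_hour_window and is_weekday


thursday_last_ping = datetime(2026, 3, 19, 20, 50, tzinfo=timezone.utc)
thursday_too_late = datetime(2026, 3, 19, 21, 0, tzinfo=timezone.utc)
saturday_noon = datetime(2026, 3, 21, 12, 0, tzinfo=timezone.utc)

print(prewarm_fires_at(thursday_last_ping))  # last ping of the day
print(prewarm_fires_at(thursday_too_late))   # outside the hour window
print(prewarm_fires_at(saturday_noon))       # weekend
```

Because pings are 10 minutes apart and the spin-down threshold is 15 minutes, the service should stay warm inside this window. Outside it, the first webhook still pays the cold-start cost.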

### Step 9: Write Initial Tests

**Created:** `tests/unit/test_findings_schema.py` (8 tests for data model validation)

These tests verify:
- Valid Finding objects are accepted
- Invalid agent types are rejected
- Invalid severity levels are rejected
- Confidence must be between 0.0 and 1.0
- CWE ID is optional (None allowed)
- Health score must be 0-100
- Invalid recommendation values are rejected

---

## Files Created in Week 1

| File | Purpose |
|------|---------|
| `app/__init__.py` | Makes app a Python package |
| `app/config.py` | Centralized configuration via environment variables |
| `app/main.py` | FastAPI app with /health endpoint (expanded in Week 2) |
| `app/models/__init__.py` | Models package |
| `app/models/findings.py` | Finding, SynthesizedReview, PRReviewRecord schemas |
| `app/models/webhook_payloads.py` | GitHub webhook event payload types |
| `tests/conftest.py` | Shared test fixtures (sample finding data) |
| `tests/unit/test_findings_schema.py` | 8 schema validation tests |
| `.env` | Environment variables (gitignored; contains secrets) |
| `.env.example` | Template for .env (committed; no secrets) |
| `.gitignore` | Files to exclude from git |
| `pyproject.toml` | Project metadata + tool configs |
| `requirements.txt` | Python production dependencies |
| `requirements-dev.txt` | Dev/test dependencies |
| `render.yaml` | Render.com deployment config |
| `sentinel.yml.example` | Per-repo config template |
| `.github/workflows/ci.yml` | CI pipeline (lint + type check + test) |
| `.github/workflows/prewarm.yml` | Render pre-warm cron |
| `keys/.gitignore` | Prevents .pem files from being committed |
| `PROJECT_PLAN.md` | Master project plan + progress tracker |

---

## Key Decisions Made

| Decision | Rationale |
|----------|-----------|
| Pydantic for all data models | Runtime validation + IDE autocomplete + auto-docs |
| pydantic-settings for config | Type-safe env vars, auto-loads .env, 12-factor pattern |
| FastAPI (not Flask/Django) | Async-native (needed for parallel agents), auto OpenAPI docs, modern Python |
| GitHub App (not Action) | One deployment serves all repos, webhook-driven, own bot identity |
| Upstash Redis (not in-memory cache) | Persists across Render restarts, shared across workers |
| Neon.tech (not SQLite) | Serverless, accessible from dashboard, persistent storage |

---

*Documentation written 2026-03-19 as part of Week 1 completion.*