metadata
title: Ninja Code Guard
emoji: 🛡️
colorFrom: purple
colorTo: blue
sdk: docker
app_port: 7860

Ninja Code Guard

Multi-agent code review system that reviews GitHub pull requests the way a senior engineering team would.

Three specialized AI agents (Security, Performance, and Style) analyze your code in parallel, then a Synthesizer merges their findings into a single, prioritized, non-overlapping review with inline GitHub comments.

Screenshots

Screenshots are available in the GitHub repository: PR review comments, dashboard home, repo detail with health trends, and PR findings table.

How It Works

PR opened on GitHub
        │
        ▼
   Webhook received ──→ HMAC-SHA256 validated
        │
        ▼
   Redis cache check ──→ Skip if already reviewed
        │
        ▼
   Fetch PR data ──→ Diff + full file contents
        │
        ▼
   RAG Context ──→ Embed files → ChromaDB → Retrieve related code
        │
        ▼
   ┌─────────────────────────────────────────┐
   │     3 Agents run IN PARALLEL            │
   │  🔒 Security  ⚡ Performance  ✏️ Style  │
   │  Bandit+LLM    Radon+LLM     Ruff+LLM  │
   └─────────────┬───────────────────────────┘
                 │
                 ▼
   Synthesizer ──→ Deduplicate → Rank → Score → Summarize
        │
        ▼
   Post to GitHub ──→ Inline comments + Summary with Health Score
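
The webhook validation step can be sketched as follows. GitHub signs each delivery with the app's webhook secret and sends the result in the `X-Hub-Signature-256` header as `sha256=` followed by a hex digest. The function name `is_valid_signature` is hypothetical and this is only a minimal illustration, not necessarily how the project implements it.

```python
import hashlib
import hmac


def is_valid_signature(secret: str, payload: bytes, signature_header: str) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw request body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest takes time independent of where the inputs differ,
    # which avoids leaking information through response timing.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The check has to run on the raw request bytes, before any JSON parsing, because re-serializing the payload can change the bytes and therefore the digest.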

What Each Agent Does

| Agent | Focus | Static Tools | Example Findings |
|---|---|---|---|
| 🔒 Security | Vulnerabilities, auth, secrets | Bandit, detect-secrets | SQL injection, hardcoded API keys, weak crypto |
| ⚡ Performance | Efficiency, scalability | Radon complexity | N+1 queries, O(n²) loops, blocking I/O |
| ✏️ Style | Readability, maintainability | Ruff linter | Unused imports, bad naming, dead code |
| 🧠 Synthesizer | Merge & prioritize | None | Deduplication, conflict resolution, Health Score |
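
As a rough illustration of the Synthesizer's deduplicate-and-rank step, the sketch below keeps one finding per file and line (the most severe one) and sorts the survivors by severity. The `Finding` fields, the four-level severity scale, and the "same file and line means duplicate" rule are all assumptions made for this example; the README does not specify the actual rules or the Health Score formula.

```python
from dataclasses import dataclass

# Assumed severity scale; lower number means more urgent.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of one agent finding."""
    agent: str
    file: str
    line: int
    severity: str
    message: str


def merge_findings(findings: list[Finding]) -> list[Finding]:
    """Keep the most severe finding per (file, line), then rank by severity."""
    best: dict[tuple[str, int], Finding] = {}
    for finding in findings:
        key = (finding.file, finding.line)
        current = best.get(key)
        if current is None or SEVERITY_ORDER[finding.severity] < SEVERITY_ORDER[current.severity]:
            best[key] = finding
    return sorted(
        best.values(),
        key=lambda f: (SEVERITY_ORDER[f.severity], f.file, f.line),
    )
```

A real synthesizer would probably also compare messages semantically, since two agents can report the same problem on neighbouring lines.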

Tech Stack

| Layer | Technology | Why |
|---|---|---|
| LLM | Groq (Llama-3.3-70B) | 500+ tokens/sec, free tier (14.4K req/day) |
| Agents | LangChain + Structured Output | Typed JSON responses, prompt templates |
| Backend | FastAPI | Async, auto OpenAPI docs |
| Vector DB | ChromaDB + sentence-transformers | RAG context, semantic code search |
| Cache | Upstash Redis | Prevent duplicate reviews |
| Database | Neon Postgres | Review history, Health Score trends |
| Dashboard | Next.js on Vercel | Review history, trend charts |
| GitHub | GitHub App (webhooks) | Inline PR comments, bot identity |

Quick Start

Prerequisites

  • Python 3.11+
  • Groq API key (free at console.groq.com)
  • GitHub App (registered at github.com/settings/apps)

Setup

git clone https://github.com/ninjacode911/Project-Ninja-Code-Guard
cd Project-Ninja-Code-Guard
python -m venv .venv && source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

cp .env.example .env
# Edit .env with your API keys

uvicorn app.main:app --reload --port 8000

Architecture

4 Layers:

  • GitHub Layer: Webhooks, PR events, inline comments
  • Orchestration Layer: FastAPI, agent dispatch, asyncio.gather
  • Agent Layer: 3 domain agents + synthesizer (LangChain)
  • Knowledge Layer: ChromaDB (RAG), Redis (cache), Postgres (history)
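
How the Orchestration and Agent layers fit together can be sketched with a shared base class whose `review` method fixes the order of steps, plus `asyncio.gather` to run the three agents concurrently. Everything named here (`BaseAgent`, `run_static_tools`, `build_prompt`, `call_llm`, `review_pr`) is hypothetical, the static-tool checks are trivial stand-ins for Bandit, Radon, and Ruff, and the LLM call is a placeholder.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseAgent(ABC):
    """Shared review flow; subclasses supply only the tool step and the prompt."""

    name = "base"

    @abstractmethod
    def run_static_tools(self, diff: str) -> list[str]:
        """Return findings from the agent's static analyzer."""

    @abstractmethod
    def build_prompt(self, diff: str, tool_findings: list[str]) -> str:
        """Return the domain-specific prompt for the LLM."""

    async def call_llm(self, prompt: str) -> list[str]:
        # Placeholder for the real LLM request; yields control like network I/O would.
        await asyncio.sleep(0)
        return [f"{self.name}: LLM reviewed {len(prompt)} characters"]

    async def review(self, diff: str) -> list[str]:
        # Fixed sequence of steps; only the two hooks vary per agent.
        tool_findings = self.run_static_tools(diff)
        prompt = self.build_prompt(diff, tool_findings)
        return tool_findings + await self.call_llm(prompt)


class SecurityAgent(BaseAgent):
    name = "security"

    def run_static_tools(self, diff: str) -> list[str]:
        return ["possible hardcoded secret"] if "API_KEY" in diff else []

    def build_prompt(self, diff: str, tool_findings: list[str]) -> str:
        return f"Find vulnerabilities. Tool hints: {tool_findings}\n{diff}"


class PerformanceAgent(BaseAgent):
    name = "performance"

    def run_static_tools(self, diff: str) -> list[str]:
        return ["multiple loops"] if diff.count("for ") > 1 else []

    def build_prompt(self, diff: str, tool_findings: list[str]) -> str:
        return f"Find inefficiencies. Tool hints: {tool_findings}\n{diff}"


class StyleAgent(BaseAgent):
    name = "style"

    def run_static_tools(self, diff: str) -> list[str]:
        too_long = any(len(line) > 100 for line in diff.splitlines())
        return ["line too long"] if too_long else []

    def build_prompt(self, diff: str, tool_findings: list[str]) -> str:
        return f"Find style issues. Tool hints: {tool_findings}\n{diff}"


async def review_pr(diff: str) -> dict[str, list[str]]:
    agents = [SecurityAgent(), PerformanceAgent(), StyleAgent()]
    # gather runs the coroutines concurrently and returns results in input order.
    results = await asyncio.gather(*(agent.review(diff) for agent in agents))
    return {agent.name: result for agent, result in zip(agents, results)}
```

Calling `asyncio.run(review_pr(diff))` returns one list of findings per agent, keyed by agent name, ready to hand to a synthesizer.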

Key Design Patterns:

  • Template Method: All agents share a base class and override only the prompt and tools
  • Structured Output: LLM constrained to return valid JSON (Pydantic schema)
  • Fail-Open Cache: If Redis is down, proceed with analysis
  • Background Tasks: Return 200 to GitHub immediately, review asynchronously
  • Parallel Execution: asyncio.gather runs 3 agents concurrently
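
The fail-open cache pattern from the list above could look roughly like this. The lookup is passed in as a plain callable so the sketch stays independent of any particular Redis client; the `already_reviewed` name and the `review:{repo}:{head_sha}` key format are assumptions for illustration only.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def already_reviewed(
    cache_get: Callable[[str], Optional[str]],
    repo: str,
    head_sha: str,
) -> bool:
    """Return True only when the cache positively confirms a prior review."""
    key = f"review:{repo}:{head_sha}"  # hypothetical key layout
    try:
        return cache_get(key) is not None
    except Exception as exc:
        # Fail open: a cache outage should cost a duplicate review at worst,
        # never a skipped one.
        logger.warning("cache lookup failed, reviewing anyway: %s", exc)
        return False
```

The trade-off is deliberate: when the cache is unreachable the system may review the same commit twice, which is cheaper than silently dropping a review.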

Running Tests

pytest tests/unit/ -v  # 92 tests

Project Structure

app/
  agents/          # Security, Performance, Style, Synthesizer
  tools/           # Bandit, detect-secrets, Radon, Ruff wrappers
  context/         # RAG pipeline (embedder, indexer, retriever)
  github/          # Webhook validation, API client, comment formatter
  models/          # Pydantic schemas (Finding, SynthesizedReview)
  db/              # Redis cache, Postgres queries
  services/        # Health Score calculator
dashboard/         # Next.js frontend (Vercel)
tests/             # Unit tests + evaluation harness
prompts/           # Agent system prompts (Markdown)
docs/              # Week-by-week documentation

Documentation

Detailed week-by-week documentation is in docs/.

License

Apache 2.0


Built by ninjacode911