Spaces: build-small-hackathon/DiffSense

avaliev, Codex committed 18 days ago

Commit 3a679f6 (1 parent: 5f029e1)

Build DiffSense Gradio reviewer

Co-authored-by: Codex <codex@openai.com>

Files changed (4)

README.md +117 -6
TECH_DESIGN.md +116 -0
app.py +701 -48
requirements.txt +2 -0

README.md CHANGED

@@ -1,17 +1,128 @@
 ---
 title: DiffSense
-emoji: 💬
-colorFrom: yellow
-colorTo: purple
+emoji: 🔎
+colorFrom: gray
+colorTo: yellow
 sdk: gradio
 sdk_version: 6.5.1
 app_file: app.py
 pinned: false
 hf_oauth: true
 hf_oauth_scopes:
-- inference-api
+  - inference-api
 license: mit
-short_description: On-Device Pull Request & Code Review Assistant
+short_description: Private PR review for local AI teams.
+tags:
+  - build-small
+  - gradio
+  - code-review
+  - local-ai
+  - backyard-ai
+  - best-use-of-codex
+  - best-agent
+  - off-brand
+  - best-demo
+models:
+  - JetBrains/Mellum-2-12B-instruct
 ---
-An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
+# DiffSense
+Private, offline-first pull request review for teams that cannot send proprietary code to cloud review bots.
+Paste a unified diff or a public GitHub PR URL and DiffSense returns severity-tagged findings, inline comments, and structured JSON that can be copied into a PR review. The prototype works without a GPU by using deterministic review rules, then optionally adds a small-model summary through Hugging Face OAuth.
+## Why We Built It
+Code review is one of the highest-leverage daily engineering workflows, but most AI reviewers require sending private code to a hosted SaaS. That is a deal-breaker for teams working with customer data, internal APIs, security-sensitive systems, or unreleased products.
+DiffSense is the small-model version of that workflow: useful immediately, inspectable, and designed so the core review loop can run locally.
+## What Works Now
+- Unified diff parser with file and hunk awareness.
+- Inline custom diff viewer built in Gradio.
+- Deterministic review findings for security, logic, maintainability, and test risks.
+- Public GitHub PR URL fetching through the PR `.diff` endpoint.
+- Structured JSON output with file, hunk, line, severity, category, comment, and suggestion.
+- Optional model-assisted summary using `JetBrains/Mellum-2-12B-instruct` through the Hugging Face Inference API when OAuth is available.
+## Hackathon Track
+DiffSense is entered in the Backyard AI track: a practical tool for developers that solves a real daily problem.
+Prize/badge targets:
+- Best Use of Codex: Codex is being used as an active build partner and will be credited in commits.
+- Best Agent: the product is structured as a review pipeline: parse, classify, review, summarize, render.
+- Off Brand: the app uses a custom Gradio interface instead of the default chat UI.
+- Best Demo: the workflow is easy to show in under two minutes with a real risky diff.
+## Planned Model Stack
+All planned models are under the Build Small 32B parameter cap.
+| Role | Model | Status |
+| --- | --- | --- |
+| Code review summary | JetBrains Mellum 2 12B Instruct | Optional HF inference hook implemented |
+| Provider | Hugging Face Inference API | Optional OAuth-backed summary provider |
+| Agentic routing | NVIDIA Nemotron 3 Nano | Planned extension, not submitted as current eligibility |
+| Visual PR context | OpenBMB MiniCPM-V 4.6 | Planned extension, not submitted as current eligibility |
+| Runtime | Modal | Planned extension, not submitted as current eligibility |
+The current app intentionally keeps a deterministic fallback so the demo remains reliable even if a hosted model endpoint is cold, rate-limited, or unavailable.
+## Usage
+1. Open the Space.
+2. Paste a unified diff, paste a public GitHub PR URL, or click **Load sample diff**.
+3. Click **Review diff**.
+4. Read the inline comments and copy the structured JSON into your PR workflow.
+For public GitHub PRs, paste the PR URL directly. DiffSense fetches the `.diff` version with a short timeout.
+## Output Shape
+```json
+{
+  "file": "src/auth.py",
+  "hunk": "@@ -1,9 +1,13 @@",
+  "line": 11,
+  "severity": "critical",
+  "category": "security",
+  "comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
+  "suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
+  "source": "deterministic"
+}
+```
+## Privacy
+The deterministic review path runs inside the app process and does not send the pasted diff to any external model. If a public PR URL is pasted, the app fetches its public `.diff` over the network. If the optional model summary is enabled, the diff excerpt and deterministic findings are sent to the selected Hugging Face Inference model using the signed-in user's OAuth token.
+## Local Run
+```bash
+pip install -r requirements.txt
+python app.py
+```
+Then open `http://localhost:7860`.
+## Demo Script
+1. Start with the privacy pain: cloud review bots are useful, but private code cannot always leave the machine.
+2. Load the sample diff.
+3. Show critical findings: hardcoded secret, disabled JWT verification, insecure pickle load, disabled TLS verification.
+4. Show the JSON output as a practical artifact for PR automation.
+5. Toggle the optional model summary to show the small-model enhancement path.
+## Social Post Draft
+DiffSense is our Build Small hackathon project: a private PR reviewer for teams that cannot send proprietary code to cloud bots.
+Paste a diff or public PR URL, get inline severity-tagged review comments and structured JSON. The app works offline first for pasted diffs, with optional small-model summarization through Mellum 2.
+Built with Gradio, Codex, and open-weight model targets under 32B.
+#BuildSmall #HuggingFace #Gradio #LocalAI #CodeReview

TECH_DESIGN.md ADDED

@@ -0,0 +1,116 @@
+# DiffSense Technical Design
+## Goal
+Build a useful, demoable, privacy-first pull request reviewer for the Build Small hackathon. The app must work reliably inside a Gradio Space and stay eligible for the under-32B model constraint.
+The implementation is intentionally offline-first: deterministic review rules provide the core value, and small-model inference is an optional enhancement rather than a single point of failure.
+## Current Shipped Prototype
+```text
+Unified diff input or public GitHub PR URL
+  -> stdlib diff parser
+  -> deterministic review engine
+  -> structured findings
+  -> custom Gradio HTML diff viewer
+  -> optional Mellum 2 summary via HF OAuth
+```
+## Components
+### Gradio UI
+File: `app.py`
+- Uses `gr.Blocks` instead of the default chatbot scaffold.
+- Provides a sample risky diff for a one-click demo.
+- Accepts pasted unified diffs and public GitHub PR URLs.
+- Renders an inline diff view with file headers, hunk headers, line numbers, severity badges, comments, and suggested fixes.
+- Shows structured JSON for automation and judge inspection.
+### Diff Parser
+The input layer fetches public GitHub PR URLs through their `.diff` endpoint with a short timeout. Pasted diffs are handled entirely in-process.
+The parser handles standard unified diffs:
+- `diff --git` file boundaries.
+- `+++ b/path` file names.
+- `@@ -old_start,old_count +new_start,new_count @@` hunk headers.
+- Added, removed, and context lines with old/new line numbers.
+No external parser is required, which keeps startup fast and dependency risk low.
+### Review Engine
+The deterministic engine checks added lines for high-signal review risks:
+- Hardcoded credentials.
+- Disabled verification such as TLS or JWT signature checks.
+- Unsafe deserialization with `pickle`.
+- Dynamic execution through `eval` or `exec`.
+- `shell=True` command execution.
+- SQL string interpolation.
+- Bare `except:`.
+- Temporary `TODO`, `FIXME`, or `HACK` markers.
+- Return-contract changes such as newly introduced `return None`.
+Each finding includes:
+```json
+{
+  "file": "src/auth.py",
+  "hunk": "@@ -1,9 +1,13 @@",
+  "line": 11,
+  "severity": "critical",
+  "category": "security",
+  "comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
+  "suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
+  "source": "deterministic"
+}
+```
+### Optional Model Summary
+When enabled, the app uses the signed-in Hugging Face OAuth token or `HF_TOKEN` through the Hugging Face Inference API to call:
+```text
+JetBrains/Mellum-2-12B-instruct
+```
+The model is asked to summarize the deterministic findings rather than invent new findings. This keeps the model role narrow, fast, and auditable.
+## Hackathon Fit
+Required criteria:
+- Under 32B: current optional model target is 12B; planned sponsor models are also under 32B.
+- Gradio app: implemented in `app.py`.
+- README tags: included in `README.md` front matter.
+- Demo-friendly: built-in sample diff produces multiple clear findings without setup.
+Prize positioning:
+- Backyard AI: practical developer workflow.
+- Best Use of Codex: Codex is actively building and shaping the repo.
+- Best Agent: staged review pipeline with parsing, classification, review, and summary.
+- Off Brand: custom HTML diff UI instead of stock chat.
+- Best Demo: one-click sample with visible before/after review value.
+## Planned Extensions
+These should only be added after the current app is deployed and recorded:
+1. Add Modal endpoint for open-weight Mellum inference.
+2. Add MiniCPM-V image upload for PR screenshots and architecture diagrams.
+3. Add Nemotron router only if there is enough time to make it real and visible.
+4. Generate patch suggestions as downloadable `.patch` files.
+## Risk Controls
+- The app remains useful without model availability.
+- Dependencies are limited to Gradio and `huggingface_hub`.
+- No pasted diff is sent externally unless the user explicitly enables the model summary.
+- Public PR URLs are fetched as public `.diff` documents; private code should be pasted only when the model summary is off.
+- The sample diff demonstrates value even during GPU/API outages.
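
Reviewer note: the Diff Parser section describes tracking old and new line numbers through a hunk. The following is a minimal, standalone sketch of that bookkeeping for a single hunk, assuming a standard unified-diff header. The function name `number_hunk` and the tuple layout are illustrative assumptions; the commit's own parser in `app.py` uses dataclasses instead.

```python
import re

# Start lines are mandatory; the ",count" parts are optional in unified diffs.
HUNK_RE = re.compile(r"@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def number_hunk(lines):
    """Return (kind, old_no, new_no, text) for each body line of one hunk."""
    match = HUNK_RE.match(lines[0])
    if match is None:
        raise ValueError("first line is not a unified-diff hunk header")
    old_no, new_no = int(match.group(1)), int(match.group(2))
    rows = []
    for raw in lines[1:]:
        if raw.startswith("+"):
            # Added lines exist only on the new side.
            rows.append(("add", None, new_no, raw[1:]))
            new_no += 1
        elif raw.startswith("-"):
            # Removed lines exist only on the old side.
            rows.append(("del", old_no, None, raw[1:]))
            old_no += 1
        else:
            # Context lines advance both counters.
            rows.append(("ctx", old_no, new_no, raw[1:]))
            old_no += 1
            new_no += 1
    return rows


hunk = [
    "@@ -10,3 +10,4 @@",
    " def verify(token):",
    "-    return check(token)",
    "+    result = check(token)",
    "+    return result",
    " ",
]

for row in number_hunk(hunk):
    print(row)
```

Because findings are keyed on the new-side line number, only added and context lines carry a `new_no`; removed lines keep `None` there.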

app.py CHANGED

@@ -1,69 +1,722 @@
 import gradio as gr
 from huggingface_hub import InferenceClient
-def respond(
-    message,
-    history: list[dict[str, str]],
-    system_message,
-    max_tokens,
-    temperature,
-    top_p,
-    hf_token: gr.OAuthToken,
-):
-    """
-    For more information on `huggingface_hub` Inference API support, please check the docs: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
-    """
-    client = InferenceClient(token=hf_token.token, model="openai/gpt-oss-20b")
-    messages = [{"role": "system", "content": system_message}]
-    messages.extend(history)
-    messages.append({"role": "user", "content": message})
-    response = ""
-    for message in client.chat_completion(
-        messages,
-        max_tokens=max_tokens,
-        stream=True,
-        temperature=temperature,
-        top_p=top_p,
-    ):
-        choices = message.choices
-        token = ""
-        if len(choices) and choices[0].delta.content:
-            token = choices[0].delta.content
-        response += token
-        yield response
 """
-For information on how to customize the ChatInterface, peruse the gradio docs: https://www.gradio.app/docs/chatinterface
-"""
-chatbot = gr.ChatInterface(
-    respond,
-    additional_inputs=[
-        gr.Textbox(value="You are a friendly Chatbot.", label="System message"),
-        gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="Max new tokens"),
-        gr.Slider(minimum=0.1, maximum=4.0, value=0.7, step=0.1, label="Temperature"),
-        gr.Slider(
-            minimum=0.1,
-            maximum=1.0,
-            value=0.95,
-            step=0.05,
-            label="Top-p (nucleus sampling)",
-        ),
-    ],
 )
 with gr.Blocks() as demo:
     with gr.Sidebar():
         gr.LoginButton()
-    chatbot.render()
 if __name__ == "__main__":
-    demo.launch()

+from __future__ import annotations
+import html
+import json
+import os
+import re
+from dataclasses import dataclass, field
+from typing import Any
+from urllib.parse import urlparse
+from urllib.request import Request, urlopen
 import gradio as gr
 from huggingface_hub import InferenceClient
+DEFAULT_MODEL = os.getenv("DIFFSENSE_MODEL", "JetBrains/Mellum-2-12B-instruct")
+FETCH_TIMEOUT_SECONDS = 10
+CSS = """
+:root {
+  --ink: #111827;
+  --muted: #64748b;
+  --paper: #f8fafc;
+  --line: #d8dee9;
+  --add-bg: #ecfdf3;
+  --add-ink: #166534;
+  --del-bg: #fff1f2;
+  --del-ink: #9f1239;
+  --warn: #b45309;
+  --crit: #be123c;
+  --nit: #475569;
+}
+.gradio-container {
+  max-width: 1280px !important;
+}
+#hero {
+  border-bottom: 1px solid var(--line);
+  padding: 18px 0 14px;
+  margin-bottom: 18px;
+}
+#hero h1 {
+  color: var(--ink);
+  font-size: 36px;
+  line-height: 1.05;
+  margin: 0;
+  letter-spacing: 0;
+}
+#hero p {
+  color: var(--muted);
+  margin: 8px 0 0;
+  font-size: 15px;
+}
+.score-grid {
+  display: grid;
+  grid-template-columns: repeat(4, minmax(0, 1fr));
+  gap: 10px;
+  margin: 12px 0 18px;
+}
+.score-card {
+  background: white;
+  border: 1px solid var(--line);
+  border-radius: 8px;
+  padding: 12px;
+}
+.score-label {
+  color: var(--muted);
+  font-size: 12px;
+  text-transform: uppercase;
+}
+.score-value {
+  color: var(--ink);
+  font-size: 24px;
+  font-weight: 700;
+  margin-top: 2px;
+}
+.diff-wrap {
+  background: white;
+  border: 1px solid var(--line);
+  border-radius: 8px;
+  overflow: hidden;
+}
+.file-title {
+  background: #0f172a;
+  color: white;
+  font: 700 13px ui-monospace, SFMono-Regular, Menlo, monospace;
+  padding: 10px 12px;
+}
+.hunk-title {
+  background: #e0f2fe;
+  color: #075985;
+  font: 700 12px ui-monospace, SFMono-Regular, Menlo, monospace;
+  padding: 7px 12px;
+  border-top: 1px solid var(--line);
+}
+.line {
+  display: grid;
+  grid-template-columns: 54px 1fr;
+  min-height: 26px;
+  border-top: 1px solid #eef2f7;
+  font: 13px/1.55 ui-monospace, SFMono-Regular, Menlo, monospace;
+}
+.line-no {
+  color: #94a3b8;
+  background: #f8fafc;
+  border-right: 1px solid #eef2f7;
+  padding: 3px 8px;
+  text-align: right;
+  user-select: none;
+}
+.line-code {
+  white-space: pre-wrap;
+  overflow-wrap: anywhere;
+  padding: 3px 10px;
+}
+.line.add .line-code {
+  background: var(--add-bg);
+  color: var(--add-ink);
+}
+.line.del .line-code {
+  background: var(--del-bg);
+  color: var(--del-ink);
+}
+.finding {
+  border-top: 1px solid var(--line);
+  padding: 10px 12px 12px 66px;
+  background: #fff7ed;
+}
+.finding.critical {
+  background: #fff1f2;
+}
+.finding.nitpick {
+  background: #f8fafc;
+}
+.badge {
+  border-radius: 999px;
+  color: white;
+  display: inline-block;
+  font-size: 11px;
+  font-weight: 700;
+  margin-right: 6px;
+  padding: 2px 8px;
+  text-transform: uppercase;
+}
+.badge.critical { background: var(--crit); }
+.badge.warning { background: var(--warn); }
+.badge.nitpick { background: var(--nit); }
+.category {
+  color: var(--muted);
+  font-size: 12px;
+  font-weight: 700;
+  text-transform: uppercase;
+}
+.finding-body {
+  color: var(--ink);
+  margin-top: 6px;
+}
+.suggestion {
+  color: #334155;
+  margin-top: 5px;
+}
+.empty-state {
+  background: white;
+  border: 1px dashed var(--line);
+  border-radius: 8px;
+  color: var(--muted);
+  padding: 18px;
+}
+@media (max-width: 760px) {
+  .score-grid { grid-template-columns: repeat(2, minmax(0, 1fr)); }
+  #hero h1 { font-size: 28px; }
+  .line { grid-template-columns: 42px 1fr; font-size: 12px; }
+  .finding { padding-left: 52px; }
+}
 """
+SAMPLE_DIFF = "\n".join(
+    [
+        "diff --git a/src/auth.py b/src/auth.py",
+        "index 54d88cd..b2a1772 100644",
+        "--- a/src/auth.py",
+        "+++ b/src/auth.py",
+        "@@ -1,9 +1,13 @@",
+        " import jwt",
+        "+import pickle",
+        " import requests",
+        '+SECRET = "dev-secret-token"',
+        " ",
+        " def load_user(raw):",
+        "+    user = pickle.loads(raw)",
+        "+    return user",
+        "+",
+        " def verify(token):",
+        '-    return jwt.decode(token, SECRET, algorithms=["HS256"])',
+        '+    return jwt.decode(token, SECRET, algorithms=["HS256"], options={"verify_signature": False})',
+        " ",
+        " def fetch_profile(url):",
+        "-    return requests.get(url).json()",
+        "+    return requests.get(url, verify=False).json()",
+        "diff --git a/src/report.py b/src/report.py",
+        "index 7471fee..db2ab78 100644",
+        "--- a/src/report.py",
+        "+++ b/src/report.py",
+        "@@ -8,8 +8,10 @@ def build_query(user_id):",
+        '-    return "select * from events where user_id = " + user_id',
+        '+    return f"select * from events where user_id = {user_id}"',
+        " ",
+        " def summarize(items):",
+        "+    if len(items) == 0:",
+        "+        return None",
+        '     total = 0',
+        '     for item in items:',
+        '         total += item["amount"]',
+        "     return total / len(items)",
+    ]
 )
+@dataclass
+class DiffLine:
+    kind: str
+    text: str
+    old_no: int | None = None
+    new_no: int | None = None
+@dataclass
+class Hunk:
+    header: str
+    old_start: int
+    new_start: int
+    lines: list[DiffLine] = field(default_factory=list)
+@dataclass
+class FileDiff:
+    path: str
+    hunks: list[Hunk] = field(default_factory=list)
+@dataclass
+class Finding:
+    file: str
+    hunk: str
+    line: int | None
+    severity: str
+    category: str
+    comment: str
+    suggestion: str
+    source: str = "deterministic"
+RULES: list[dict[str, Any]] = [
+    {
+        "pattern": re.compile(r"(password|passwd|secret|token|api[_-]?key)\s*=\s*['\"][^'\"]{6,}", re.I),
+        "severity": "critical",
+        "category": "security",
+        "comment": "A credential-like value is being committed in the diff.",
+        "suggestion": "Move the value to a secret manager or environment variable and rotate the exposed secret.",
+    },
+    {
+        "pattern": re.compile(r"verify_signature['\"]?\s*:\s*False|verify\s*=\s*False", re.I),
+        "severity": "critical",
+        "category": "security",
+        "comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
+        "suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
+    },
+    {
+        "pattern": re.compile(r"\bpickle\.loads?\s*\(", re.I),
+        "severity": "critical",
+        "category": "security",
+        "comment": "Deserializing pickle data from an untrusted source can execute arbitrary code.",
+        "suggestion": "Use a safe format such as JSON or validate and sign the payload before deserialization.",
+    },
+    {
+        "pattern": re.compile(r"\beval\s*\(|\bexec\s*\(", re.I),
+        "severity": "critical",
+        "category": "security",
+        "comment": "Dynamic code execution appears in a changed line.",
+        "suggestion": "Replace dynamic execution with an explicit parser or allowlisted dispatch table.",
+    },
+    {
+        "pattern": re.compile(r"shell\s*=\s*True", re.I),
+        "severity": "critical",
+        "category": "security",
+        "comment": "Launching a shell with user-influenced input is command-injection prone.",
+        "suggestion": "Pass arguments as a list with shell disabled and validate each user-controlled argument.",
+    },
+    {
+        "pattern": re.compile(r"(f['\"].*(select|insert|update|delete)|(select|insert|update|delete).*(\+|format\s*\())", re.I),
+        "severity": "warning",
+        "category": "security",
+        "comment": "The SQL statement appears to be built with string interpolation.",
+        "suggestion": "Use parameterized queries so the database driver handles escaping and typing.",
+    },
+    {
+        "pattern": re.compile(r"except\s*:", re.I),
+        "severity": "warning",
+        "category": "logic",
+        "comment": "A bare except can hide interrupts and unrelated failures.",
+        "suggestion": "Catch the specific exception type and preserve the original error context.",
+    },
+    {
+        "pattern": re.compile(r"\b(?:TODO|FIXME|HACK)\b", re.I),
+        "severity": "nitpick",
+        "category": "maintainability",
+        "comment": "A temporary marker landed in changed code.",
+        "suggestion": "Link it to an issue or resolve it before merging.",
+    },
+]
+def normalize_diff(raw_input: str) -> str:
+    value = (raw_input or "").strip()
+    if not value:
+        return ""
+    parsed = urlparse(value)
+    if parsed.netloc == "github.com" and "/pull/" in parsed.path:
+        return fetch_public_diff(value)
+    if parsed.scheme in {"http", "https"} and value.endswith(".diff"):
+        return fetch_public_diff(value)
+    return value
+def fetch_public_diff(url: str) -> str:
+    diff_url = url if url.endswith(".diff") else f"{url.rstrip('/')}.diff"
+    request = Request(diff_url, headers={"User-Agent": "DiffSense/1.0"})
+    try:
+        with urlopen(request, timeout=FETCH_TIMEOUT_SECONDS) as response:
+            content_type = response.headers.get("content-type", "")
+            body = response.read(1_500_000).decode("utf-8", errors="replace")
+    except Exception as exc:
+        raise gr.Error(f"Could not fetch the public diff from {diff_url}: {exc}") from exc
+    if "@@ " not in body:
+        raise gr.Error(
+            f"Fetched {diff_url}, but it did not look like a unified diff "
+            f"(content-type: {content_type or 'unknown'})."
+        )
+    return body
+def parse_hunk_header(header: str) -> tuple[int, int]:
+    match = re.search(r"@@ -(?P<old>\d+)(?:,\d+)? \+(?P<new>\d+)(?:,\d+)? @@", header)
+    if not match:
+        return 0, 0
+    return int(match.group("old")), int(match.group("new"))
+def parse_unified_diff(diff_text: str) -> list[FileDiff]:
+    files: list[FileDiff] = []
+    current_file: FileDiff | None = None
+    current_hunk: Hunk | None = None
+    old_no = 0
+    new_no = 0
+    for raw_line in diff_text.splitlines():
+        if raw_line.startswith("diff --git "):
+            current_file = None
+            current_hunk = None
+            continue
+        if raw_line.startswith("+++ "):
+            path = raw_line[4:].strip()
+            if path.startswith("b/"):
+                path = path[2:]
+            current_file = FileDiff(path=path)
+            files.append(current_file)
+            current_hunk = None
+            continue
+        if raw_line.startswith("@@ "):
+            if current_file is None:
+                current_file = FileDiff(path="pasted.diff")
+                files.append(current_file)
+            old_start, new_start = parse_hunk_header(raw_line)
+            old_no = old_start
+            new_no = new_start
+            current_hunk = Hunk(header=raw_line, old_start=old_start, new_start=new_start)
+            current_file.hunks.append(current_hunk)
+            continue
+        if current_hunk is None:
+            continue
+        if raw_line.startswith("+") and not raw_line.startswith("+++"):
+            current_hunk.lines.append(DiffLine("add", raw_line[1:], new_no=new_no))
+            new_no += 1
+        elif raw_line.startswith("-") and not raw_line.startswith("---"):
+            current_hunk.lines.append(DiffLine("del", raw_line[1:], old_no=old_no))
+            old_no += 1
+        elif raw_line.startswith("\\"):
+            continue
+        else:
+            text = raw_line[1:] if raw_line.startswith(" ") else raw_line
+            current_hunk.lines.append(DiffLine("ctx", text, old_no=old_no, new_no=new_no))
+            old_no += 1
+            new_no += 1
+    return files
+def review_diff(files: list[FileDiff]) -> list[Finding]:
+    findings: list[Finding] = []
+    for file_diff in files:
+        for hunk in file_diff.hunks:
+            added_lines = [line for line in hunk.lines if line.kind == "add"]
+            removed_lines = [line for line in hunk.lines if line.kind == "del"]
+            for line in added_lines:
+                for rule in RULES:
+                    if rule["pattern"].search(line.text):
+                        findings.append(
+                            Finding(
+                                file=file_diff.path,
+                                hunk=hunk.header,
+                                line=line.new_no,
+                                severity=rule["severity"],
+                                category=rule["category"],
+                                comment=rule["comment"],
+                                suggestion=rule["suggestion"],
+                            )
+                        )
+            added_text = "\n".join(line.text for line in added_lines)
+            removed_text = "\n".join(line.text for line in removed_lines)
+            if re.search(r"return\s+None", added_text) and "Optional" not in added_text:
+                findings.append(
+                    Finding(
+                        file=file_diff.path,
+                        hunk=hunk.header,
+                        line=added_lines[0].new_no if added_lines else None,
+                        severity="warning",
+                        category="logic",
+                        comment="The new branch returns None, which may change the function's return contract.",
+                        suggestion="Return a neutral value of the same type or update callers and tests to handle None explicitly.",
+                    )
+                )
+            if "len(" in added_text and "/ len(" in removed_text:
+                findings.append(
+                    Finding(
+                        file=file_diff.path,
+                        hunk=hunk.header,
+                        line=added_lines[0].new_no if added_lines else None,
+                        severity="warning",
+                        category="test",
+                        comment="This change appears to address an empty collection path; make sure the regression is locked down.",
+                        suggestion="Add a test covering an empty input and a non-empty input for the same function.",
+                    )
+                )
+            if len(added_lines) >= 25 and "test" not in file_diff.path.lower():
+                findings.append(
+                    Finding(
+                        file=file_diff.path,
+                        hunk=hunk.header,
+                        line=added_lines[0].new_no if added_lines else None,
+                        severity="nitpick",
+                        category="test",
+                        comment="This hunk adds a substantial amount of behavior outside a test file.",
+                        suggestion="Add or update a focused test that exercises the new branch.",
+                    )
+                )
+    return dedupe_findings(findings)
+def dedupe_findings(findings: list[Finding]) -> list[Finding]:
+    seen: set[tuple[str, str, int | None, str]] = set()
+    unique: list[Finding] = []
+    for finding in findings:
+        key = (finding.file, finding.category, finding.line, finding.comment)
+        if key not in seen:
+            seen.add(key)
+            unique.append(finding)
+    severity_order = {"critical": 0, "warning": 1, "nitpick": 2}
+    unique.sort(key=lambda item: (severity_order.get(item.severity, 9), item.file, item.line or 0))
+    return unique
+def summarize_with_model(
+    files: list[FileDiff],
+    findings: list[Finding],
+    enabled: bool,
+    hf_token: gr.OAuthToken | None = None,
+) -> str:
+    if not enabled:
+        return "Model summary disabled. Deterministic review completed locally in the app process."
+    token = hf_token.token if hf_token else os.getenv("HF_TOKEN")
+    if not token:
+        return "Model summary skipped: sign in with Hugging Face OAuth or set HF_TOKEN."
+    compact_diff = "\n".join(
+        f"{file.path}\n"
+        + "\n".join(
+            f"{hunk.header}\n"
+            + "\n".join(
+                f"{'+' if line.kind == 'add' else '-' if line.kind == 'del' else ' '} {line.text}"
+                for line in hunk.lines[:80]
+            )
+            for hunk in file.hunks[:4]
+        )
+        for file in files[:6]
+    )
+    deterministic = json.dumps([finding_to_dict(item) for item in findings[:12]], indent=2)
+    messages = [
+        {
+            "role": "system",
+            "content": (
+                "You are DiffSense, a terse senior code reviewer. Summarize the review risk in "
+                "four bullets. Do not invent findings beyond the provided deterministic findings."
+            ),
+        },
+        {
+            "role": "user",
+            "content": (
+                f"Deterministic findings:\n{deterministic}\n\n"
+                f"Diff excerpt:\n{compact_diff[:12000]}"
+            ),
+        },
+    ]
+    try:
+        client = InferenceClient(token=token, model=DEFAULT_MODEL)
+        response = client.chat_completion(
+            messages=messages,
+            max_tokens=320,
+            temperature=0.2,
+            top_p=0.9,
+        )
+        return response.choices[0].message.content or "Model returned an empty summary."
+    except Exception as exc:  # The app must stay demoable when endpoints are unavailable.
+        return f"Model summary unavailable from {DEFAULT_MODEL}: {exc}"
+def finding_to_dict(finding: Finding) -> dict[str, Any]:
+    return {
+        "file": finding.file,
+        "hunk": finding.hunk,
+        "line": finding.line,
+        "severity": finding.severity,
+        "category": finding.category,
+        "comment": finding.comment,
+        "suggestion": finding.suggestion,
+        "source": finding.source,
+    }
+def render_scoreboard(files: list[FileDiff], findings: list[Finding]) -> str:
+    hunk_count = sum(len(file.hunks) for file in files)
+    counts = {
+        "critical": sum(item.severity == "critical" for item in findings),
+        "warning": sum(item.severity == "warning" for item in findings),
+        "nitpick": sum(item.severity == "nitpick" for item in findings),
+    }
+    return f"""
+    <div class="score-grid">
+      <div class="score-card"><div class="score-label">Files</div><div class="score-value">{len(files)}</div></div>
+      <div class="score-card"><div class="score-label">Hunks</div><div class="score-value">{hunk_count}</div></div>
+      <div class="score-card"><div class="score-label">Critical</div><div class="score-value">{counts["critical"]}</div></div>
+      <div class="score-card"><div class="score-label">Warnings</div><div class="score-value">{counts["warning"]}</div></div>
+    </div>
+    """
+def render_review(files: list[FileDiff], findings: list[Finding]) -> str:
+    if not files:
+        return '<div class="empty-state">Paste a unified diff to see inline review findings.</div>'
+    findings_by_location: dict[tuple[str, str, int | None], list[Finding]] = {}
+    for finding in findings:
+        findings_by_location.setdefault((finding.file, finding.hunk, finding.line), []).append(finding)
+    chunks = [render_scoreboard(files, findings), '<div class="diff-wrap">']
+    for file_diff in files:
+        chunks.append(f'<div class="file-title">{html.escape(file_diff.path)}</div>')
+        for hunk in file_diff.hunks:
+            chunks.append(f'<div class="hunk-title">{html.escape(hunk.header)}</div>')
+            for line in hunk.lines:
+                number = line.new_no if line.kind == "add" else line.old_no
+                sign = "+" if line.kind == "add" else "-" if line.kind == "del" else " "
+                chunks.append(
+                    f'<div class="line {line.kind}">'
+                    f'<div class="line-no">{number if number is not None else ""}</div>'
+                    f'<div class="line-code">{html.escape(sign + line.text)}</div>'
+                    f"</div>"
+                )
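+                if line.new_no is not None:
+                    # Lines without a new-file number (deletions) must not match hunk-level findings keyed on None.
+                    for finding in findings_by_location.get((file_diff.path, hunk.header, line.new_no), []):
+                        chunks.append(render_finding(finding))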
+            for finding in findings_by_location.get((file_diff.path, hunk.header, None), []):
+                chunks.append(render_finding(finding))
+    chunks.append("</div>")
+    return "\n".join(chunks)
+def render_finding(finding: Finding) -> str:
+    return f"""
+    <div class="finding {html.escape(finding.severity)}">
+      <span class="badge {html.escape(finding.severity)}">{html.escape(finding.severity)}</span>
+      <span class="category">{html.escape(finding.category)}</span>
+      <div class="finding-body">{html.escape(finding.comment)}</div>
+      <div class="suggestion"><strong>Fix:</strong> {html.escape(finding.suggestion)}</div>
+    </div>
+    """
+def run_review(
+    diff_input: str,
+    use_model_summary: bool,
+    hf_token: gr.OAuthToken | None = None,
+) -> tuple[str, list[dict[str, Any]], str]:
+    diff_text = normalize_diff(diff_input)
+    if not diff_text:
+        raise gr.Error("Paste a unified diff first, or load the sample diff.")
+    files = parse_unified_diff(diff_text)
+    if not files or not any(file.hunks for file in files):
+        raise gr.Error("No unified diff hunks found. Each hunk should start with a line beginning with @@.")
+    findings = review_diff(files)
+    summary = summarize_with_model(files, findings, use_model_summary, hf_token)
+    return render_review(files, findings), [finding_to_dict(item) for item in findings], summary
+def load_sample() -> str:
+    return SAMPLE_DIFF
+APP_THEME = gr.themes.Soft(primary_hue="slate", neutral_hue="slate")
 with gr.Blocks() as demo:
+    gr.HTML(
+        """
+        <div id="hero">
+          <h1>DiffSense</h1>
+          <p>Private, offline-first PR review for the Build Small hackathon. Paste a diff or a public GitHub PR URL to get severity-tagged findings while keeping your code out of SaaS review tools.</p>
+        </div>
+        """
+    )
     with gr.Sidebar():
         gr.LoginButton()
+        use_model_summary = gr.Checkbox(
+            value=False,
+            label="Add optional Mellum model summary",
+            info="Deterministic review works without network or GPU. OAuth/HF_TOKEN enables the sponsor-model summary.",
+        )
+        sample_btn = gr.Button("Load sample diff")
+    with gr.Row(equal_height=False):
+        with gr.Column(scale=5):
+            diff_input = gr.Textbox(
+                value="",
+                lines=24,
+                max_lines=32,
+                label="Unified diff or public GitHub PR URL",
+                placeholder="Paste a unified diff, paste https://github.com/org/repo/pull/123, or click Load sample diff.",
+                interactive=True,
+            )
+            run_btn = gr.Button("Review diff", variant="primary")
+        with gr.Column(scale=4):
+            summary_output = gr.Markdown(
+                value="Run a review to get the risk summary.",
+                label="Reviewer summary",
+            )
+            json_output = gr.JSON(label="Structured findings")
+    review_output = gr.HTML(
+        value='<div class="empty-state">Paste a unified diff or public GitHub PR URL, then click Review diff.</div>',
+        label="Inline diff review",
+    )
+    sample_btn.click(fn=load_sample, outputs=diff_input)
+    run_btn.click(
+        fn=run_review,
+        inputs=[diff_input, use_model_summary],
+        outputs=[review_output, json_output, summary_output],
+    )
 if __name__ == "__main__":
+    demo.launch(css=CSS, theme=APP_THEME)
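
Review note on `render_review`: findings are grouped under a `(file, hunk, line)` key, and a `line` of `None` appears to mark a hunk-level finding. If deleted lines also carry `new_no = None` (an assumption, since the line type is defined outside this excerpt), a per-line lookup on `line.new_no` would match the hunk-level findings after every deletion. The sketch below is a minimal, self-contained illustration of the grouping with that guard in place. `Finding`, `DiffLine`, `group_by_location`, and `render_hunk` are hypothetical stand-ins, not the types or functions in `app.py`.

```python
from __future__ import annotations

from dataclasses import dataclass
from html import escape


@dataclass
class Finding:
    # Hypothetical, trimmed-down stand-in for the Finding type in app.py.
    file: str
    hunk: str
    line: int | None  # None marks a hunk-level finding
    comment: str


@dataclass
class DiffLine:
    # Hypothetical stand-in; assumes deleted lines carry new_no=None.
    kind: str  # "add", "del", or "ctx"
    new_no: int | None
    text: str


def group_by_location(findings: list[Finding]) -> dict:
    """Bucket findings under a (file, hunk header, line number) key."""
    grouped: dict = {}
    for finding in findings:
        key = (finding.file, finding.hunk, finding.line)
        grouped.setdefault(key, []).append(finding)
    return grouped


def render_hunk(path: str, header: str, lines: list[DiffLine], findings: list[Finding]) -> list[str]:
    """Emit each diff line, then its line-level findings, then hunk-level findings once."""
    grouped = group_by_location(findings)
    out: list[str] = []
    for line in lines:
        out.append(escape(line.text))
        # Only look up line-level findings when the line exists in the new file.
        if line.new_no is not None:
            for finding in grouped.get((path, header, line.new_no), []):
                out.append(f"  [line {line.new_no}] {escape(finding.comment)}")
    # Hunk-level findings (line=None) are rendered exactly once, after the hunk.
    for finding in grouped.get((path, header, None), []):
        out.append(f"  [hunk] {escape(finding.comment)}")
    return out
```

Without the `is not None` guard, a hunk containing two deletions would print the hunk-level comment three times: once after each deleted line and once at the end.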

requirements.txt ADDED

@@ -0,0 +1,2 @@
+gradio[oauth]==6.5.1
+huggingface_hub>=0.22.2
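
Review note on the `@@` check in `run_review` and the `old_no` / `new_no` fields used by `render_review`: the parser itself is outside this excerpt, so the following is only a sketch of how unified diff hunk lines are commonly numbered, not the commit's `parse_unified_diff`. `HUNK_HEADER`, `NumberedLine`, `number_hunk`, and the `"ctx"` kind are hypothetical names introduced for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Matches headers such as "@@ -10,3 +10,4 @@ def f():"; the counts are optional.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@")


@dataclass
class NumberedLine:
    kind: str  # "add", "del", or "ctx" (hypothetical kind names)
    old_no: int | None
    new_no: int | None
    text: str


def number_hunk(header: str, body: list[str]) -> list[NumberedLine]:
    """Assign old-file and new-file line numbers to each line of one hunk."""
    match = HUNK_HEADER.match(header)
    if match is None:
        raise ValueError(f"not a unified diff hunk header: {header!r}")
    old_no, new_no = int(match.group(1)), int(match.group(2))
    numbered: list[NumberedLine] = []
    for raw in body:
        marker, text = raw[:1], raw[1:]
        if marker == "+":
            # Added lines exist only in the new file.
            numbered.append(NumberedLine("add", None, new_no, text))
            new_no += 1
        elif marker == "-":
            # Deleted lines exist only in the old file.
            numbered.append(NumberedLine("del", old_no, None, text))
            old_no += 1
        elif marker == "\\":
            # Skip markers such as "\ No newline at end of file".
            continue
        else:
            # Context lines advance both counters.
            numbered.append(NumberedLine("ctx", old_no, new_no, text))
            old_no += 1
            new_no += 1
    return numbered
```

Under this numbering, deleted lines have no new-file number, which is why a lookup keyed on `new_no` needs care when `None` is also used to mean "whole hunk".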