avaliev Codex committed on
Commit
3a679f6
·
1 Parent(s): 5f029e1

Build DiffSense Gradio reviewer

Co-authored-by: Codex <codex@openai.com>

Files changed (4)
  1. README.md +117 -6
  2. TECH_DESIGN.md +116 -0
  3. app.py +701 -48
  4. requirements.txt +2 -0
README.md CHANGED
@@ -1,17 +1,128 @@
1
  ---
2
  title: DiffSense
3
- emoji: 💬
4
- colorFrom: yellow
5
- colorTo: purple
6
  sdk: gradio
7
  sdk_version: 6.5.1
8
  app_file: app.py
9
  pinned: false
10
  hf_oauth: true
11
  hf_oauth_scopes:
12
- - inference-api
13
  license: mit
14
- short_description: On-Device Pull Request & Code Review Assistant
15
  ---
16
 
17
- An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
1
  ---
2
  title: DiffSense
3
+ emoji: 🔎
4
+ colorFrom: gray
5
+ colorTo: yellow
6
  sdk: gradio
7
  sdk_version: 6.5.1
8
  app_file: app.py
9
  pinned: false
10
  hf_oauth: true
11
  hf_oauth_scopes:
12
+ - inference-api
13
  license: mit
14
+ short_description: Private PR review for local AI teams.
15
+ tags:
16
+ - build-small
17
+ - gradio
18
+ - code-review
19
+ - local-ai
20
+ - backyard-ai
21
+ - best-use-of-codex
22
+ - best-agent
23
+ - off-brand
24
+ - best-demo
25
+ models:
26
+ - JetBrains/Mellum-2-12B-instruct
27
  ---
28
 
29
+ # DiffSense
30
+
31
+ Private, offline-first pull request review for teams that cannot send proprietary code to cloud review bots.
32
+
33
+ Paste a unified diff or a public GitHub PR URL and DiffSense returns severity-tagged findings, inline comments, and structured JSON that can be copied into a PR review. The prototype works without a GPU by using deterministic review rules, then optionally adds a small-model summary through Hugging Face OAuth.
34
+
35
+ ## Why We Built It
36
+
37
+ Code review is one of the highest-leverage daily engineering workflows, but most AI reviewers require sending private code to a hosted SaaS. That is a deal-breaker for teams working with customer data, internal APIs, security-sensitive systems, or unreleased products.
38
+
39
+ DiffSense is the small-model version of that workflow: useful immediately, inspectable, and designed so the core review loop can run locally.
40
+
41
+ ## What Works Now
42
+
43
+ - Unified diff parser with file and hunk awareness.
44
+ - Inline custom diff viewer built in Gradio.
45
+ - Deterministic review findings for security, logic, maintainability, and test risks.
46
+ - Public GitHub PR URL fetching through the PR `.diff` endpoint.
47
+ - Structured JSON output with file, hunk, line, severity, category, comment, and suggestion.
48
+ - Optional model-assisted summary using `JetBrains/Mellum-2-12B-instruct` through the Hugging Face Inference API when OAuth is available.
49
+
50
+ ## Hackathon Track
51
+
52
+ DiffSense is entered in the Backyard AI track: a practical tool for developers that solves a real daily problem.
53
+
54
+ Prize/badge targets:
55
+
56
+ - Best Use of Codex: Codex is being used as an active build partner and will be credited in commits.
57
+ - Best Agent: the product is structured as a review pipeline: parse, classify, review, summarize, render.
58
+ - Off Brand: the app uses a custom Gradio interface instead of the default chat UI.
59
+ - Best Demo: the workflow is easy to show in under two minutes with a real risky diff.
60
+
61
+ ## Planned Model Stack
62
+
63
+ All planned models are under the Build Small 32B parameter cap.
64
+
65
+ | Role | Model | Status |
66
+ | --- | --- | --- |
67
+ | Code review summary | JetBrains Mellum 2 12B Instruct | Optional HF inference hook implemented |
68
+ | Provider | Hugging Face Inference API | Optional OAuth-backed summary provider |
69
+ | Agentic routing | NVIDIA Nemotron 3 Nano | Planned extension; not part of the current eligibility claim |
70
+ | Visual PR context | OpenBMB MiniCPM-V 4.6 | Planned extension; not part of the current eligibility claim |
71
+ | Runtime | Modal | Planned extension; not part of the current eligibility claim |
72
+
73
+ The current app intentionally keeps a deterministic fallback so the demo remains reliable even if a hosted model endpoint is cold, rate-limited, or unavailable.
74
+
75
+ ## Usage
76
+
77
+ 1. Open the Space.
78
+ 2. Paste a unified diff, paste a public GitHub PR URL, or click **Load sample diff**.
79
+ 3. Click **Review diff**.
80
+ 4. Read the inline comments and copy the structured JSON into your PR workflow.
81
+
82
+ For public GitHub PRs, paste the PR URL directly. DiffSense fetches the `.diff` version with a short timeout.
83
+
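
The URL handling described above can be sketched as follows. This is a minimal illustration that mirrors the check in `normalize_diff` and the suffix logic in `fetch_public_diff` from `app.py`, without making any network call. The helper name `to_diff_url` and the example repository path are hypothetical, not part of the commit.

```python
from typing import Optional
from urllib.parse import urlparse


def to_diff_url(value: str) -> Optional[str]:
    """Return the `.diff` URL for a GitHub PR URL, or None for anything else.

    Hypothetical helper: it only builds the URL, it does not fetch it.
    """
    url = value.strip()
    parsed = urlparse(url)
    # Only github.com pull request URLs are treated as fetchable.
    if parsed.netloc != "github.com" or "/pull/" not in parsed.path:
        return None
    # Append `.diff` unless the URL already points at the diff document.
    return url if url.endswith(".diff") else f"{url.rstrip('/')}.diff"


if __name__ == "__main__":
    # The repository path below is a made-up example.
    print(to_diff_url("https://github.com/example-org/example-repo/pull/1/"))
    print(to_diff_url("diff --git a/x b/x"))
```

Note that this simple suffix approach probably does not handle PR URLs that carry a query string or a trailing `/files` segment; such inputs would need extra normalization.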
84
+ ## Output Shape
85
+
86
+ ```json
87
+ {
88
+ "file": "src/auth.py",
89
+ "hunk": "@@ -1,9 +1,13 @@",
90
+ "line": 11,
91
+ "severity": "critical",
92
+ "category": "security",
93
+ "comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
94
+ "suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
95
+ "source": "deterministic"
96
+ }
97
+ ```
98
+
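
One way to produce the output shape above is to serialize a dataclass. The sketch below reuses the field names of the `Finding` dataclass in `app.py`, but it uses `dataclasses.asdict` rather than the hand-written `finding_to_dict` helper in the commit, so treat it as an equivalent illustration rather than the shipped code.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Finding:
    file: str
    hunk: str
    line: Optional[int]
    severity: str
    category: str
    comment: str
    suggestion: str
    source: str = "deterministic"  # default mirrors the commit's dataclass


finding = Finding(
    file="src/auth.py",
    hunk="@@ -1,9 +1,13 @@",
    line=11,
    severity="critical",
    category="security",
    comment="The change disables a verification check, which can turn a trusted boundary into a bypass.",
    suggestion="Keep verification enabled and add a narrowly scoped test fixture for local development.",
)

# asdict keeps the field order of the dataclass, so the JSON keys
# come out in the same order as the documented output shape.
as_json = json.dumps(asdict(finding), indent=2)

if __name__ == "__main__":
    print(as_json)
```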
99
+ ## Privacy
100
+
101
+ The deterministic review path runs inside the app process and does not send the pasted diff to any external model. If a public PR URL is pasted, the app fetches its public `.diff` over the network. If the optional model summary is enabled, the diff excerpt and deterministic findings are sent to the selected Hugging Face Inference model using the signed-in user's OAuth token.
102
+
103
+ ## Local Run
104
+
105
+ ```bash
106
+ pip install -r requirements.txt
107
+ python app.py
108
+ ```
109
+
110
+ Then open `http://localhost:7860`.
111
+
112
+ ## Demo Script
113
+
114
+ 1. Start with the privacy pain: cloud review bots are useful, but private code cannot always leave the machine.
115
+ 2. Load the sample diff.
116
+ 3. Show critical findings: hardcoded secret, disabled JWT verification, insecure pickle load, disabled TLS verification.
117
+ 4. Show the JSON output as a practical artifact for PR automation.
118
+ 5. Toggle the optional model summary to show the small-model enhancement path.
119
+
120
+ ## Social Post Draft
121
+
122
+ DiffSense is our Build Small hackathon project: a private PR reviewer for teams that cannot send proprietary code to cloud bots.
123
+
124
+ Paste a diff or public PR URL, get inline severity-tagged review comments and structured JSON. The app works offline first for pasted diffs, with optional small-model summarization through Mellum 2.
125
+
126
+ Built with Gradio, Codex, and open-weight model targets under 32B.
127
+
128
+ #BuildSmall #HuggingFace #Gradio #LocalAI #CodeReview
TECH_DESIGN.md ADDED
@@ -0,0 +1,116 @@
1
+ # DiffSense Technical Design
2
+
3
+ ## Goal
4
+
5
+ Build a useful, demoable, privacy-first pull request reviewer for the Build Small hackathon. The app must work reliably inside a Gradio Space and stay eligible for the under-32B model constraint.
6
+
7
+ The implementation is intentionally offline-first: deterministic review rules provide the core value, and small-model inference is an optional enhancement rather than a single point of failure.
8
+
9
+ ## Current Shipped Prototype
10
+
11
+ ```text
12
+ Unified diff input or public GitHub PR URL
13
+ -> stdlib diff parser
14
+ -> deterministic review engine
15
+ -> structured findings
16
+ -> custom Gradio HTML diff viewer
17
+ -> optional Mellum 2 summary via HF OAuth
18
+ ```
19
+
20
+ ## Components
21
+
22
+ ### Gradio UI
23
+
24
+ File: `app.py`
25
+
26
+ - Uses `gr.Blocks` instead of the default chatbot scaffold.
27
+ - Provides a sample risky diff for a one-click demo.
28
+ - Accepts pasted unified diffs and public GitHub PR URLs.
29
+ - Renders an inline diff view with file headers, hunk headers, line numbers, severity badges, comments, and suggested fixes.
30
+ - Shows structured JSON for automation and judge inspection.
31
+
32
+ ### Diff Parser
33
+
34
+ The input layer fetches public GitHub PR URLs through their `.diff` endpoint with a short timeout. Pasted diffs are handled entirely in-process.
35
+
36
+ The parser handles standard unified diffs:
37
+
38
+ - `diff --git` file boundaries.
39
+ - `+++ b/path` file names.
40
+ - `@@ -old_start,old_count +new_start,new_count @@` hunk headers.
41
+ - Added, removed, and context lines with old/new line numbers.
42
+
43
+ No external parser is required, which keeps startup fast and dependency risk low.
44
+
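
The hunk-header handling described above can be sketched as follows. The regular expression is the one used by `parse_hunk_header` in `app.py`; the `number_lines` helper and its tuple layout are hypothetical simplifications of the line-numbering loop in `parse_unified_diff`.

```python
import re

HUNK_RE = re.compile(r"@@ -(?P<old>\d+)(?:,\d+)? \+(?P<new>\d+)(?:,\d+)? @@")


def parse_hunk_header(header: str) -> tuple:
    """Return (old_start, new_start), or (0, 0) if the header does not match."""
    match = HUNK_RE.search(header)
    if not match:
        return 0, 0
    return int(match.group("old")), int(match.group("new"))


def number_lines(header: str, body: list) -> list:
    """Assign old/new line numbers to each line of a hunk body.

    Returns tuples of (kind, old_no, new_no, text). Hypothetical helper.
    """
    old_no, new_no = parse_hunk_header(header)
    numbered = []
    for raw in body:
        if raw.startswith("+"):
            # Added lines exist only on the new side.
            numbered.append(("add", None, new_no, raw[1:]))
            new_no += 1
        elif raw.startswith("-"):
            # Removed lines exist only on the old side.
            numbered.append(("del", old_no, None, raw[1:]))
            old_no += 1
        else:
            # Context lines advance both counters.
            numbered.append(("ctx", old_no, new_no, raw[1:]))
            old_no += 1
            new_no += 1
    return numbered


if __name__ == "__main__":
    for row in number_lines("@@ -8,8 +8,10 @@ def f():", ["-a", "+b", " c"]):
        print(row)
```

Tracking the two counters separately is what lets a finding point at the new-file line number of an added line.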
45
+ ### Review Engine
46
+
47
+ The deterministic engine checks added lines for high-signal review risks:
48
+
49
+ - Hardcoded credentials.
50
+ - Disabled verification such as TLS or JWT signature checks.
51
+ - Unsafe deserialization with `pickle`.
52
+ - Dynamic execution through `eval` or `exec`.
53
+ - `shell=True` command execution.
54
+ - SQL string interpolation.
55
+ - Bare `except:`.
56
+ - Temporary `TODO`, `FIXME`, or `HACK` markers.
57
+ - Return-contract changes such as newly introduced `return None`.
58
+
59
+ Each finding includes:
60
+
61
+ ```json
62
+ {
63
+ "file": "src/auth.py",
64
+ "hunk": "@@ -1,9 +1,13 @@",
65
+ "line": 11,
66
+ "severity": "critical",
67
+ "category": "security",
68
+ "comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
69
+ "suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
70
+ "source": "deterministic"
71
+ }
72
+ ```
73
+
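
The rule-matching loop behind these findings can be sketched in a few lines. The three patterns are taken from the `RULES` list in `app.py`; the `Rule` dataclass and the `scan_added_lines` function are hypothetical names introduced only for this illustration, and the findings are reduced to three fields.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: re.Pattern
    severity: str
    category: str


# A small subset of the deterministic rules, for illustration only.
RULES = [
    Rule(re.compile(r"\bpickle\.loads?\s*\("), "critical", "security"),
    Rule(re.compile(r"shell\s*=\s*True"), "critical", "security"),
    Rule(re.compile(r"except\s*:"), "warning", "logic"),
]


def scan_added_lines(added):
    """Check (line_number, text) pairs against every rule.

    Only added lines are scanned, so removed or context lines never
    produce findings.
    """
    findings = []
    for line_no, text in added:
        for rule in RULES:
            if rule.pattern.search(text):
                findings.append(
                    {"line": line_no, "severity": rule.severity, "category": rule.category}
                )
    return findings


if __name__ == "__main__":
    sample = [(3, "user = pickle.loads(raw)"), (4, "return user"), (9, "except:")]
    for item in scan_added_lines(sample):
        print(item)
```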
74
+ ### Optional Model Summary
75
+
76
+ When enabled, the app uses the signed-in Hugging Face OAuth token or `HF_TOKEN` through the Hugging Face Inference API to call:
77
+
78
+ ```text
79
+ JetBrains/Mellum-2-12B-instruct
80
+ ```
81
+
82
+ The model is asked to summarize the deterministic findings rather than invent new findings. This keeps the model role narrow, fast, and auditable.
83
+
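
A sketch of how the summary request might be assembled is shown below. It follows the structure of `summarize_with_model` in `app.py` (findings serialized as JSON, a capped diff excerpt, a system prompt that forbids new findings), but the function name `build_summary_messages` and its parameters are hypothetical, and no network call is made.

```python
import json


def build_summary_messages(findings, diff_excerpt, max_findings=12, max_chars=12000):
    """Build chat messages that ask the model to summarize, not to review.

    The caps keep the prompt small; the defaults match the limits used in app.py.
    """
    payload = json.dumps(findings[:max_findings], indent=2)
    return [
        {
            "role": "system",
            "content": (
                "You are DiffSense, a terse senior code reviewer. Summarize the review risk in "
                "four bullets. Do not invent findings beyond the provided deterministic findings."
            ),
        },
        {
            "role": "user",
            "content": (
                f"Deterministic findings:\n{payload}\n\n"
                f"Diff excerpt:\n{diff_excerpt[:max_chars]}"
            ),
        },
    ]


if __name__ == "__main__":
    messages = build_summary_messages(
        [{"line": 11, "severity": "critical"}],
        "+ return requests.get(url, verify=False).json()",
    )
    print(messages[1]["content"])
```

The resulting list could then be passed to a chat-completion client; in the commit this is done through `InferenceClient.chat_completion`.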
84
+ ## Hackathon Fit
85
+
86
+ Required criteria:
87
+
88
+ - Under 32B: current optional model target is 12B; planned sponsor models are also under 32B.
89
+ - Gradio app: implemented in `app.py`.
90
+ - README tags: included in `README.md` front matter.
91
+ - Demo-friendly: built-in sample diff produces multiple clear findings without setup.
92
+
93
+ Prize positioning:
94
+
95
+ - Backyard AI: practical developer workflow.
96
+ - Best Use of Codex: Codex is actively building and shaping the repo.
97
+ - Best Agent: staged review pipeline with parsing, classification, review, and summary.
98
+ - Off Brand: custom HTML diff UI instead of stock chat.
99
+ - Best Demo: one-click sample with visible before/after review value.
100
+
101
+ ## Planned Extensions
102
+
103
+ These should only be added after the current app is deployed and recorded:
104
+
105
+ 1. Add Modal endpoint for open-weight Mellum inference.
106
+ 2. Add MiniCPM-V image upload for PR screenshots and architecture diagrams.
107
+ 3. Add Nemotron router only if there is enough time to make it real and visible.
108
+ 4. Generate patch suggestions as downloadable `.patch` files.
109
+
110
+ ## Risk Controls
111
+
112
+ - The app remains useful without model availability.
113
+ - Dependencies are limited to Gradio and `huggingface_hub`.
114
+ - No pasted diff is sent externally unless the user explicitly enables the model summary.
115
+ - Public PR URLs are fetched as public `.diff` documents; private code should be pasted only when the model summary is off.
116
+ - The sample diff demonstrates value even during GPU/API outages.
app.py CHANGED
@@ -1,69 +1,722 @@
1
  import gradio as gr
2
  from huggingface_hub import InferenceClient
3
 
4
 
5
- def respond(
6
- message,
7
- history: list[dict[str, str]],
8
- system_message,
9
- max_tokens,
10
- temperature,
11
- top_p,
12
- hf_token: gr.OAuthToken,
13
- ):
14
- """
15
- For more information on `huggingface_hub` Inference API support, please check the docs: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
16
- """
17
- client = InferenceClient(token=hf_token.token, model="openai/gpt-oss-20b")
18
 
19
- messages = [{"role": "system", "content": system_message}]
20
 
21
- messages.extend(history)
22
 
23
- messages.append({"role": "user", "content": message})
24
 
25
- response = ""
26
 
27
- for message in client.chat_completion(
28
- messages,
29
- max_tokens=max_tokens,
30
- stream=True,
31
- temperature=temperature,
32
- top_p=top_p,
33
- ):
34
- choices = message.choices
35
- token = ""
36
- if len(choices) and choices[0].delta.content:
37
- token = choices[0].delta.content
38
 
39
- response += token
40
- yield response
41
 
42
 
43
  """
44
- For information on how to customize the ChatInterface, peruse the gradio docs: https://www.gradio.app/docs/chatinterface
45
- """
46
- chatbot = gr.ChatInterface(
47
- respond,
48
- additional_inputs=[
49
- gr.Textbox(value="You are a friendly Chatbot.", label="System message"),
50
- gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="Max new tokens"),
51
- gr.Slider(minimum=0.1, maximum=4.0, value=0.7, step=0.1, label="Temperature"),
52
- gr.Slider(
53
- minimum=0.1,
54
- maximum=1.0,
55
- value=0.95,
56
- step=0.05,
57
- label="Top-p (nucleus sampling)",
58
- ),
59
- ],
60
  )
61
 
62
  with gr.Blocks() as demo:
63
  with gr.Sidebar():
64
  gr.LoginButton()
65
- chatbot.render()
66
 
67
 
68
  if __name__ == "__main__":
69
- demo.launch()
 
1
+ from __future__ import annotations
2
+
3
+ import html
4
+ import json
5
+ import os
6
+ import re
7
+ from dataclasses import dataclass, field
8
+ from typing import Any
9
+ from urllib.parse import urlparse
10
+ from urllib.request import Request, urlopen
11
+
12
  import gradio as gr
13
  from huggingface_hub import InferenceClient
14
 
15
 
16
+ DEFAULT_MODEL = os.getenv("DIFFSENSE_MODEL", "JetBrains/Mellum-2-12B-instruct")
17
+ FETCH_TIMEOUT_SECONDS = 10
18
 
 
19
 
20
+ CSS = """
21
+ :root {
22
+ --ink: #111827;
23
+ --muted: #64748b;
24
+ --paper: #f8fafc;
25
+ --line: #d8dee9;
26
+ --add-bg: #ecfdf3;
27
+ --add-ink: #166534;
28
+ --del-bg: #fff1f2;
29
+ --del-ink: #9f1239;
30
+ --warn: #b45309;
31
+ --crit: #be123c;
32
+ --nit: #475569;
33
+ }
34
 
35
+ .gradio-container {
36
+ max-width: 1280px !important;
37
+ }
38
 
39
+ #hero {
40
+ border-bottom: 1px solid var(--line);
41
+ padding: 18px 0 14px;
42
+ margin-bottom: 18px;
43
+ }
44
 
45
+ #hero h1 {
46
+ color: var(--ink);
47
+ font-size: 36px;
48
+ line-height: 1.05;
49
+ margin: 0;
50
+ letter-spacing: 0;
51
+ }
52
 
53
+ #hero p {
54
+ color: var(--muted);
55
+ margin: 8px 0 0;
56
+ font-size: 15px;
57
+ }
58
 
59
+ .score-grid {
60
+ display: grid;
61
+ grid-template-columns: repeat(4, minmax(0, 1fr));
62
+ gap: 10px;
63
+ margin: 12px 0 18px;
64
+ }
65
 
66
+ .score-card {
67
+ background: white;
68
+ border: 1px solid var(--line);
69
+ border-radius: 8px;
70
+ padding: 12px;
71
+ }
72
+
73
+ .score-label {
74
+ color: var(--muted);
75
+ font-size: 12px;
76
+ text-transform: uppercase;
77
+ }
78
+
79
+ .score-value {
80
+ color: var(--ink);
81
+ font-size: 24px;
82
+ font-weight: 700;
83
+ margin-top: 2px;
84
+ }
85
+
86
+ .diff-wrap {
87
+ background: white;
88
+ border: 1px solid var(--line);
89
+ border-radius: 8px;
90
+ overflow: hidden;
91
+ }
92
+
93
+ .file-title {
94
+ background: #0f172a;
95
+ color: white;
96
+ font: 700 13px ui-monospace, SFMono-Regular, Menlo, monospace;
97
+ padding: 10px 12px;
98
+ }
99
+
100
+ .hunk-title {
101
+ background: #e0f2fe;
102
+ color: #075985;
103
+ font: 700 12px ui-monospace, SFMono-Regular, Menlo, monospace;
104
+ padding: 7px 12px;
105
+ border-top: 1px solid var(--line);
106
+ }
107
+
108
+ .line {
109
+ display: grid;
110
+ grid-template-columns: 54px 1fr;
111
+ min-height: 26px;
112
+ border-top: 1px solid #eef2f7;
113
+ font: 13px/1.55 ui-monospace, SFMono-Regular, Menlo, monospace;
114
+ }
115
+
116
+ .line-no {
117
+ color: #94a3b8;
118
+ background: #f8fafc;
119
+ border-right: 1px solid #eef2f7;
120
+ padding: 3px 8px;
121
+ text-align: right;
122
+ user-select: none;
123
+ }
124
+
125
+ .line-code {
126
+ white-space: pre-wrap;
127
+ overflow-wrap: anywhere;
128
+ padding: 3px 10px;
129
+ }
130
+
131
+ .line.add .line-code {
132
+ background: var(--add-bg);
133
+ color: var(--add-ink);
134
+ }
135
+
136
+ .line.del .line-code {
137
+ background: var(--del-bg);
138
+ color: var(--del-ink);
139
+ }
140
+
141
+ .finding {
142
+ border-top: 1px solid var(--line);
143
+ padding: 10px 12px 12px 66px;
144
+ background: #fff7ed;
145
+ }
146
+
147
+ .finding.critical {
148
+ background: #fff1f2;
149
+ }
150
+
151
+ .finding.nitpick {
152
+ background: #f8fafc;
153
+ }
154
+
155
+ .badge {
156
+ border-radius: 999px;
157
+ color: white;
158
+ display: inline-block;
159
+ font-size: 11px;
160
+ font-weight: 700;
161
+ margin-right: 6px;
162
+ padding: 2px 8px;
163
+ text-transform: uppercase;
164
+ }
165
+
166
+ .badge.critical { background: var(--crit); }
167
+ .badge.warning { background: var(--warn); }
168
+ .badge.nitpick { background: var(--nit); }
169
+ .category {
170
+ color: var(--muted);
171
+ font-size: 12px;
172
+ font-weight: 700;
173
+ text-transform: uppercase;
174
+ }
175
+
176
+ .finding-body {
177
+ color: var(--ink);
178
+ margin-top: 6px;
179
+ }
180
+
181
+ .suggestion {
182
+ color: #334155;
183
+ margin-top: 5px;
184
+ }
185
+
186
+ .empty-state {
187
+ background: white;
188
+ border: 1px dashed var(--line);
189
+ border-radius: 8px;
190
+ color: var(--muted);
191
+ padding: 18px;
192
+ }
193
+
194
+ @media (max-width: 760px) {
195
+ .score-grid { grid-template-columns: repeat(2, minmax(0, 1fr)); }
196
+ #hero h1 { font-size: 28px; }
197
+ .line { grid-template-columns: 42px 1fr; font-size: 12px; }
198
+ .finding { padding-left: 52px; }
199
+ }
200
  """
201
+
202
+
203
+ SAMPLE_DIFF = "\n".join(
204
+ [
205
+ "diff --git a/src/auth.py b/src/auth.py",
206
+ "index 54d88cd..b2a1772 100644",
207
+ "--- a/src/auth.py",
208
+ "+++ b/src/auth.py",
209
+ "@@ -1,9 +1,13 @@",
210
+ " import jwt",
211
+ "+import pickle",
212
+ " import requests",
213
+ '+SECRET = "dev-secret-token"',
214
+ " ",
215
+ " def load_user(raw):",
216
+ "+ user = pickle.loads(raw)",
217
+ "+ return user",
218
+ "+",
219
+ " def verify(token):",
220
+ '- return jwt.decode(token, SECRET, algorithms=["HS256"])',
221
+ '+ return jwt.decode(token, SECRET, algorithms=["HS256"], options={"verify_signature": False})',
222
+ " ",
223
+ " def fetch_profile(url):",
224
+ "- return requests.get(url).json()",
225
+ "+ return requests.get(url, verify=False).json()",
226
+ "diff --git a/src/report.py b/src/report.py",
227
+ "index 7471fee..db2ab78 100644",
228
+ "--- a/src/report.py",
229
+ "+++ b/src/report.py",
230
+ "@@ -8,8 +8,10 @@ def build_query(user_id):",
231
+ '- return "select * from events where user_id = " + user_id',
232
+ '+ return f"select * from events where user_id = {user_id}"',
233
+ " ",
234
+ " def summarize(items):",
235
+ "+ if len(items) == 0:",
236
+ "+ return None",
237
+ ' total = 0',
238
+ ' for item in items:',
239
+ ' total += item["amount"]',
240
+ " return total / len(items)",
241
+ ]
242
  )
243
 
244
+
245
+ @dataclass
246
+ class DiffLine:
247
+ kind: str
248
+ text: str
249
+ old_no: int | None = None
250
+ new_no: int | None = None
251
+
252
+
253
+ @dataclass
254
+ class Hunk:
255
+ header: str
256
+ old_start: int
257
+ new_start: int
258
+ lines: list[DiffLine] = field(default_factory=list)
259
+
260
+
261
+ @dataclass
262
+ class FileDiff:
263
+ path: str
264
+ hunks: list[Hunk] = field(default_factory=list)
265
+
266
+
267
+ @dataclass
268
+ class Finding:
269
+ file: str
270
+ hunk: str
271
+ line: int | None
272
+ severity: str
273
+ category: str
274
+ comment: str
275
+ suggestion: str
276
+ source: str = "deterministic"
277
+
278
+
279
+ RULES: list[dict[str, Any]] = [
280
+ {
281
+ "pattern": re.compile(r"(password|passwd|secret|token|api[_-]?key)\s*=\s*['\"][^'\"]{6,}", re.I),
282
+ "severity": "critical",
283
+ "category": "security",
284
+ "comment": "A credential-like value is being committed in the diff.",
285
+ "suggestion": "Move the value to a secret manager or environment variable and rotate the exposed secret.",
286
+ },
287
+ {
288
+ "pattern": re.compile(r"verify_signature['\"]?\s*:\s*False|verify\s*=\s*False", re.I),
289
+ "severity": "critical",
290
+ "category": "security",
291
+ "comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
292
+ "suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
293
+ },
294
+ {
295
+ "pattern": re.compile(r"\bpickle\.loads?\s*\(", re.I),
296
+ "severity": "critical",
297
+ "category": "security",
298
+ "comment": "Deserializing pickle data from an untrusted source can execute arbitrary code.",
299
+ "suggestion": "Use a safe format such as JSON or validate and sign the payload before deserialization.",
300
+ },
301
+ {
302
+ "pattern": re.compile(r"\beval\s*\(|\bexec\s*\(", re.I),
303
+ "severity": "critical",
304
+ "category": "security",
305
+ "comment": "Dynamic code execution appears in a changed line.",
306
+ "suggestion": "Replace dynamic execution with an explicit parser or allowlisted dispatch table.",
307
+ },
308
+ {
309
+ "pattern": re.compile(r"shell\s*=\s*True", re.I),
310
+ "severity": "critical",
311
+ "category": "security",
312
+ "comment": "Launching a shell with user-influenced input is command-injection prone.",
313
+ "suggestion": "Pass arguments as a list with shell disabled and validate each user-controlled argument.",
314
+ },
315
+ {
316
+ "pattern": re.compile(r"(f['\"].*(select|insert|update|delete)|(select|insert|update|delete).*(\+|format\s*\())", re.I),
317
+ "severity": "warning",
318
+ "category": "security",
319
+ "comment": "The SQL statement appears to be built with string interpolation.",
320
+ "suggestion": "Use parameterized queries so the database driver handles escaping and typing.",
321
+ },
322
+ {
323
+ "pattern": re.compile(r"except\s*:", re.I),
324
+ "severity": "warning",
325
+ "category": "logic",
326
+ "comment": "A bare except can hide interrupts and unrelated failures.",
327
+ "suggestion": "Catch the specific exception type and preserve the original error context.",
328
+ },
329
+ {
330
+ "pattern": re.compile(r"\b(TODO|FIXME|HACK)\b", re.I),
331
+ "severity": "nitpick",
332
+ "category": "maintainability",
333
+ "comment": "A temporary marker landed in changed code.",
334
+ "suggestion": "Link it to an issue or resolve it before merging.",
335
+ },
336
+ ]
337
+
338
+
339
+ def normalize_diff(raw_input: str) -> str:
340
+ value = (raw_input or "").strip()
341
+ if not value:
342
+ return ""
343
+
344
+ parsed = urlparse(value)
345
+ if parsed.netloc == "github.com" and "/pull/" in parsed.path:
346
+ return fetch_public_diff(value)
347
+
348
+ if parsed.scheme in {"http", "https"} and value.endswith(".diff"):
349
+ return fetch_public_diff(value)
350
+
351
+ return value
352
+
353
+
354
+ def fetch_public_diff(url: str) -> str:
355
+ diff_url = url if url.endswith(".diff") else f"{url.rstrip('/')}.diff"
356
+ request = Request(diff_url, headers={"User-Agent": "DiffSense/1.0"})
357
+ try:
358
+ with urlopen(request, timeout=FETCH_TIMEOUT_SECONDS) as response:
359
+ content_type = response.headers.get("content-type", "")
360
+ body = response.read(1_500_000).decode("utf-8", errors="replace")
361
+ except Exception as exc:
362
+ raise gr.Error(f"Could not fetch the public diff from {diff_url}: {exc}") from exc
363
+
364
+ if "@@ " not in body:
365
+ raise gr.Error(
366
+ f"Fetched {diff_url}, but it did not look like a unified diff "
367
+ f"(content-type: {content_type or 'unknown'})."
368
+ )
369
+
370
+ return body
371
+
372
+
373
+ def parse_hunk_header(header: str) -> tuple[int, int]:
374
+ match = re.search(r"@@ -(?P<old>\d+)(?:,\d+)? \+(?P<new>\d+)(?:,\d+)? @@", header)
375
+ if not match:
376
+ return 0, 0
377
+ return int(match.group("old")), int(match.group("new"))
378
+
379
+
380
+ def parse_unified_diff(diff_text: str) -> list[FileDiff]:
381
+ files: list[FileDiff] = []
382
+ current_file: FileDiff | None = None
383
+ current_hunk: Hunk | None = None
384
+ old_no = 0
385
+ new_no = 0
386
+
387
+ for raw_line in diff_text.splitlines():
388
+ if raw_line.startswith("diff --git "):
389
+ current_file = None
390
+ current_hunk = None
391
+ continue
392
+
393
+ if raw_line.startswith("+++ "):
394
+ path = raw_line[4:].strip()
395
+ if path.startswith("b/"):
396
+ path = path[2:]
397
+ current_file = FileDiff(path=path)
398
+ files.append(current_file)
399
+ current_hunk = None
400
+ continue
401
+
402
+ if raw_line.startswith("@@ "):
403
+ if current_file is None:
404
+ current_file = FileDiff(path="pasted.diff")
405
+ files.append(current_file)
406
+ old_start, new_start = parse_hunk_header(raw_line)
407
+ old_no = old_start
408
+ new_no = new_start
409
+ current_hunk = Hunk(header=raw_line, old_start=old_start, new_start=new_start)
410
+ current_file.hunks.append(current_hunk)
411
+ continue
412
+
413
+ if current_hunk is None:
414
+ continue
415
+
416
+ if raw_line.startswith("+") and not raw_line.startswith("+++"):
417
+ current_hunk.lines.append(DiffLine("add", raw_line[1:], new_no=new_no))
418
+ new_no += 1
419
+ elif raw_line.startswith("-") and not raw_line.startswith("---"):
420
+ current_hunk.lines.append(DiffLine("del", raw_line[1:], old_no=old_no))
421
+ old_no += 1
422
+ elif raw_line.startswith("\\"):
423
+ continue
424
+ else:
425
+ text = raw_line[1:] if raw_line.startswith(" ") else raw_line
426
+ current_hunk.lines.append(DiffLine("ctx", text, old_no=old_no, new_no=new_no))
427
+ old_no += 1
428
+ new_no += 1
429
+
430
+ return files
431
+
432
+
433
+ def review_diff(files: list[FileDiff]) -> list[Finding]:
434
+ findings: list[Finding] = []
435
+
436
+ for file_diff in files:
437
+ for hunk in file_diff.hunks:
438
+ added_lines = [line for line in hunk.lines if line.kind == "add"]
439
+ removed_lines = [line for line in hunk.lines if line.kind == "del"]
440
+
441
+ for line in added_lines:
442
+ for rule in RULES:
443
+ if rule["pattern"].search(line.text):
444
+ findings.append(
445
+ Finding(
446
+ file=file_diff.path,
447
+ hunk=hunk.header,
448
+ line=line.new_no,
449
+ severity=rule["severity"],
450
+ category=rule["category"],
451
+ comment=rule["comment"],
452
+ suggestion=rule["suggestion"],
453
+ )
454
+ )
455
+
456
+ added_text = "\n".join(line.text for line in added_lines)
457
+ removed_text = "\n".join(line.text for line in removed_lines)
458
+
459
+ if re.search(r"return\s+None", added_text) and "Optional" not in added_text:
460
+ findings.append(
461
+ Finding(
462
+ file=file_diff.path,
463
+ hunk=hunk.header,
464
+ line=added_lines[0].new_no if added_lines else None,
465
+ severity="warning",
466
+ category="logic",
467
+ comment="The new branch returns None, which may change the function's return contract.",
468
+ suggestion="Return a neutral value of the same type or update callers and tests to handle None explicitly.",
469
+ )
470
+ )
471
+
472
+ if "len(" in added_text and "/ len(" in removed_text:
473
+ findings.append(
474
+ Finding(
475
+ file=file_diff.path,
476
+ hunk=hunk.header,
477
+ line=added_lines[0].new_no if added_lines else None,
478
+ severity="warning",
479
+ category="test",
480
+ comment="This change appears to address an empty collection path; make sure the regression is locked down.",
481
+ suggestion="Add a test covering an empty input and a non-empty input for the same function.",
482
+ )
483
+ )
484
+
485
+ if len(added_lines) >= 25 and "test" not in file_diff.path.lower():
486
+ findings.append(
487
+ Finding(
488
+ file=file_diff.path,
489
+ hunk=hunk.header,
490
+ line=added_lines[0].new_no if added_lines else None,
491
+ severity="nitpick",
492
+ category="test",
493
+ comment="This hunk adds a substantial amount of behavior outside a test file.",
494
+ suggestion="Add or update a focused test that exercises the new branch.",
495
+ )
496
+ )
497
+
498
+ return dedupe_findings(findings)
499
+
500
+
501
+ def dedupe_findings(findings: list[Finding]) -> list[Finding]:
502
+ seen: set[tuple[str, str, int | None, str]] = set()
503
+ unique: list[Finding] = []
504
+ for finding in findings:
505
+ key = (finding.file, finding.category, finding.line, finding.comment)
506
+ if key not in seen:
507
+ seen.add(key)
508
+ unique.append(finding)
509
+
510
+ severity_order = {"critical": 0, "warning": 1, "nitpick": 2}
511
+ unique.sort(key=lambda item: (severity_order.get(item.severity, 9), item.file, item.line or 0))
512
+ return unique
513
+
514
+
+def summarize_with_model(
+    files: list[FileDiff],
+    findings: list[Finding],
+    enabled: bool,
+    hf_token: gr.OAuthToken | None = None,
+) -> str:
+    if not enabled:
+        return "Model summary disabled. Deterministic review completed locally in the app process."
+
+    token = hf_token.token if hf_token else os.getenv("HF_TOKEN")
+    if not token:
+        return "Model summary skipped: sign in with Hugging Face OAuth or set HF_TOKEN."
+
+    compact_diff = "\n".join(
+        f"{file.path}\n"
+        + "\n".join(
+            f"{hunk.header}\n"
+            + "\n".join(
+                f"{'+' if line.kind == 'add' else '-' if line.kind == 'del' else ' '} {line.text}"
+                for line in hunk.lines[:80]
+            )
+            for hunk in file.hunks[:4]
+        )
+        for file in files[:6]
+    )
+    deterministic = json.dumps([finding_to_dict(item) for item in findings[:12]], indent=2)
+
+    messages = [
+        {
+            "role": "system",
+            "content": (
+                "You are DiffSense, a terse senior code reviewer. Summarize the review risk in "
+                "four bullets. Do not invent findings beyond the provided deterministic findings."
+            ),
+        },
+        {
+            "role": "user",
+            "content": (
+                f"Deterministic findings:\n{deterministic}\n\n"
+                f"Diff excerpt:\n{compact_diff[:12000]}"
+            ),
+        },
+    ]
+
+    try:
+        client = InferenceClient(token=token, model=DEFAULT_MODEL)
+        response = client.chat_completion(
+            messages=messages,
+            max_tokens=320,
+            temperature=0.2,
+            top_p=0.9,
+        )
+        return response.choices[0].message.content or "Model returned an empty summary."
+    except Exception as exc:  # The app must stay demoable when endpoints are unavailable.
+        return f"Model summary unavailable from {DEFAULT_MODEL}: {exc}"
+
+
+def finding_to_dict(finding: Finding) -> dict[str, Any]:
+    return {
+        "file": finding.file,
+        "hunk": finding.hunk,
+        "line": finding.line,
+        "severity": finding.severity,
+        "category": finding.category,
+        "comment": finding.comment,
+        "suggestion": finding.suggestion,
+        "source": finding.source,
+    }
+
+
+def render_scoreboard(files: list[FileDiff], findings: list[Finding]) -> str:
+    hunk_count = sum(len(file.hunks) for file in files)
+    counts = {
+        "critical": sum(item.severity == "critical" for item in findings),
+        "warning": sum(item.severity == "warning" for item in findings),
+        "nitpick": sum(item.severity == "nitpick" for item in findings),
+    }
+    return f"""
+    <div class="score-grid">
+      <div class="score-card"><div class="score-label">Files</div><div class="score-value">{len(files)}</div></div>
+      <div class="score-card"><div class="score-label">Hunks</div><div class="score-value">{hunk_count}</div></div>
+      <div class="score-card"><div class="score-label">Critical</div><div class="score-value">{counts["critical"]}</div></div>
+      <div class="score-card"><div class="score-label">Warnings</div><div class="score-value">{counts["warning"]}</div></div>
+    </div>
+    """
+
+
+def render_review(files: list[FileDiff], findings: list[Finding]) -> str:
+    if not files:
+        return '<div class="empty-state">Paste a unified diff to see inline review findings.</div>'
+
+    findings_by_location: dict[tuple[str, str, int | None], list[Finding]] = {}
+    for finding in findings:
+        findings_by_location.setdefault((finding.file, finding.hunk, finding.line), []).append(finding)
+
+    chunks = [render_scoreboard(files, findings), '<div class="diff-wrap">']
+
+    for file_diff in files:
+        chunks.append(f'<div class="file-title">{html.escape(file_diff.path)}</div>')
+        for hunk in file_diff.hunks:
+            chunks.append(f'<div class="hunk-title">{html.escape(hunk.header)}</div>')
+            for line in hunk.lines:
+                number = line.new_no if line.kind == "add" else line.old_no
+                sign = "+" if line.kind == "add" else "-" if line.kind == "del" else " "
+                chunks.append(
+                    f'<div class="line {line.kind}">'
+                    f'<div class="line-no">{number if number is not None else ""}</div>'
+                    f'<div class="line-code">{html.escape(sign + line.text)}</div>'
+                    f"</div>"
+                )
+                # Deleted lines may carry no new-file number; skip them so hunk-level
+                # findings (line=None) render once after the hunk, not after each deletion.
+                if line.new_no is not None:
+                    for finding in findings_by_location.get((file_diff.path, hunk.header, line.new_no), []):
+                        chunks.append(render_finding(finding))
+
+            for finding in findings_by_location.get((file_diff.path, hunk.header, None), []):
+                chunks.append(render_finding(finding))
+
+    chunks.append("</div>")
+    return "\n".join(chunks)
+
+
+def render_finding(finding: Finding) -> str:
+    return f"""
+    <div class="finding {html.escape(finding.severity)}">
+      <span class="badge {html.escape(finding.severity)}">{html.escape(finding.severity)}</span>
+      <span class="category">{html.escape(finding.category)}</span>
+      <div class="finding-body">{html.escape(finding.comment)}</div>
+      <div class="suggestion"><strong>Fix:</strong> {html.escape(finding.suggestion)}</div>
+    </div>
+    """
+
+
+def run_review(
+    diff_input: str,
+    use_model_summary: bool,
+    hf_token: gr.OAuthToken | None = None,
+) -> tuple[str, list[dict[str, Any]], str]:
+    diff_text = normalize_diff(diff_input)
+    if not diff_text:
+        raise gr.Error("Paste a unified diff first, or load the sample diff.")
+
+    files = parse_unified_diff(diff_text)
+    if not files or not any(file.hunks for file in files):
+        raise gr.Error("I could not find unified diff hunks. Look for lines starting with @@.")
+
+    findings = review_diff(files)
+    summary = summarize_with_model(files, findings, use_model_summary, hf_token)
+    return render_review(files, findings), [finding_to_dict(item) for item in findings], summary
+
+
+def load_sample() -> str:
+    return SAMPLE_DIFF
+
+
+APP_THEME = gr.themes.Soft(primary_hue="slate", neutral_hue="slate")
+
+
 with gr.Blocks() as demo:
+    gr.HTML(
+        """
+        <div id="hero">
+          <h1>DiffSense</h1>
+          <p>Private, offline-first PR review for the Build Small hackathon. Paste a diff or public GitHub PR URL, get severity-tagged findings, keep your code out of SaaS review tools.</p>
+        </div>
+        """
+    )
+
     with gr.Sidebar():
         gr.LoginButton()
+        use_model_summary = gr.Checkbox(
+            value=False,
+            label="Add optional Mellum model summary",
+            info="Deterministic review works without network or GPU. OAuth/HF_TOKEN enables the sponsor-model summary.",
+        )
+        sample_btn = gr.Button("Load sample diff")
+
+    with gr.Row(equal_height=False):
+        with gr.Column(scale=5):
+            diff_input = gr.Textbox(
+                value="",
+                lines=24,
+                max_lines=32,
+                label="Unified diff or public GitHub PR URL",
+                placeholder="Paste a unified diff, paste https://github.com/org/repo/pull/123, or click Load sample diff.",
+                interactive=True,
+            )
+            run_btn = gr.Button("Review diff", variant="primary")
+        with gr.Column(scale=4):
+            summary_output = gr.Markdown(
+                value="Run a review to get the risk summary.",
+                label="Reviewer summary",
+            )
+            json_output = gr.JSON(label="Structured findings")
+
+    review_output = gr.HTML(
+        value='<div class="empty-state">Paste a unified diff or public GitHub PR URL, then click Review diff.</div>',
+        label="Inline diff review",
+    )
+
+    sample_btn.click(fn=load_sample, outputs=diff_input)
+    run_btn.click(
+        fn=run_review,
+        inputs=[diff_input, use_model_summary],
+        outputs=[review_output, json_output, summary_output],
+    )
 
 
 if __name__ == "__main__":
+    demo.launch(css=CSS, theme=APP_THEME)
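
The de-duplication and severity ordering earlier in app.py (the `key` / `severity_order` lines) can be exercised in isolation. The sketch below is a stand-in rather than the app's code: the `Finding` dataclass is a hypothetical reduction that carries only the fields the key and the sort use, and `dedupe_and_sort` is an assumed name for the helper whose tail appears in the diff.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical reduction of the app's Finding: only the fields used below.
    file: str
    category: str
    line: int | None
    comment: str
    severity: str


SEVERITY_ORDER = {"critical": 0, "warning": 1, "nitpick": 2}


def dedupe_and_sort(findings: list[Finding]) -> list[Finding]:
    """Drop repeated findings, then order by severity, file, and line."""
    seen: set[tuple[str, str, int | None, str]] = set()
    unique: list[Finding] = []
    for finding in findings:
        key = (finding.file, finding.category, finding.line, finding.comment)
        if key not in seen:
            seen.add(key)
            unique.append(finding)
    # Unknown severities get rank 9; a missing line number sorts as 0.
    unique.sort(key=lambda f: (SEVERITY_ORDER.get(f.severity, 9), f.file, f.line or 0))
    return unique


sample = [
    Finding("b.py", "style", 3, "Long line", "nitpick"),
    Finding("a.py", "security", 10, "Hardcoded secret", "critical"),
    Finding("a.py", "security", 10, "Hardcoded secret", "critical"),  # exact duplicate
    Finding("a.py", "tests", None, "No tests changed", "warning"),
]
result = dedupe_and_sort(sample)
```

Because the key omits `severity`, two findings that differ only in severity collapse into whichever was seen first; whether that is intended depends on how the review rules assign severities.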
requirements.txt ADDED
@@ -0,0 +1,2 @@
+gradio[oauth]==6.5.1
+huggingface_hub>=0.22.2