---
title: DiffSense
emoji: 🔎
colorFrom: gray
colorTo: yellow
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_scopes:
  - inference-api
license: mit
short_description: Private PR review for local AI teams.
tags:
  - build-small
  - gradio
  - code-review
  - local-ai
  - backyard-ai
  - best-use-of-codex
  - best-agent
  - off-brand
  - best-demo
  - best-minicpm-build
  - nemotron-hardware-prize
  - best-use-of-modal
  - tiny-titan
models:
  - JetBrains/Mellum2-12B-A2.5B-Instruct
  - nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
  - nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
  - openbmb/MiniCPM-V-4.6
---

# DiffSense

Private, offline-first pull request review for teams that cannot send proprietary code to cloud review bots.

Paste a unified diff or a public GitHub PR URL and DiffSense returns severity-tagged findings, inline comments, and structured JSON that can be copied into a PR review. The prototype works without a GPU by using deterministic review rules, then optionally adds Mellum, Nemotron, MiniCPM-V, and Modal provider passes when credentials or endpoints are available.

## Why We Built It

Code review is one of the highest-leverage daily engineering workflows, but most AI reviewers require sending private code to a hosted SaaS. That is a deal-breaker for teams working with customer data, internal APIs, security-sensitive systems, or unreleased products.

DiffSense is the small-model version of that workflow: useful immediately, inspectable, and designed so the core review loop can run locally.

## What Works Now

- Unified diff parser with file and hunk awareness.
- Inline custom diff viewer built in Gradio.
- Deterministic review findings for security, logic, maintainability, and test risks.
- Public GitHub PR URL fetching through the PR `.diff` endpoint.
- Optional Nemotron 3 Nano routing/triage pass.
- Optional Nemotron 3 Nano 4B checker pass (the Tiny Titan entry).
- Optional MiniCPM-V 4.6 vision pass for PR screenshots, architecture diagrams, and UI diffs.
- Optional Modal bridge through `DIFFSENSE_MODAL_ENDPOINT`.
- Structured JSON output with file, hunk, line, severity, category, comment, and suggestion.
- Optional model-assisted summary using `JetBrains/Mellum2-12B-A2.5B-Instruct` through the Hugging Face Inference API when OAuth is available, or a local checkpoint when mounted under `/data`.
- ZeroGPU/bucket-aware model runtime status for local checkpoints mounted from the `build-small-hackathon/DiffSense` bucket.
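
As an illustration of what "file and hunk awareness" can mean in practice, here is a minimal sketch of a unified diff parser. It is not the code in `app.py`; the names `FileDiff`, `Hunk`, and `parse_unified_diff` are hypothetical, and the sketch only tracks added lines with their new-file line numbers.

```python
import re
from dataclasses import dataclass, field

# Matches hunk headers such as "@@ -1,9 +1,13 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


@dataclass
class Hunk:
    header: str
    added: list = field(default_factory=list)  # (new_line_number, text) pairs


@dataclass
class FileDiff:
    path: str
    hunks: list = field(default_factory=list)


def parse_unified_diff(text):
    """Hypothetical helper: group added lines by file and hunk."""
    files = []
    current_file = None
    current_hunk = None
    new_line = 0
    for raw in text.splitlines():
        if raw.startswith("+++ "):
            # "+++ b/path" names the new version of the file.
            path = raw[4:].strip()
            if path.startswith("b/"):
                path = path[2:]
            current_file = FileDiff(path=path)
            files.append(current_file)
            current_hunk = None
            continue
        match = HUNK_RE.match(raw)
        if match and current_file is not None:
            current_hunk = Hunk(header=match.group(0))
            current_file.hunks.append(current_hunk)
            new_line = int(match.group(3))  # first line number in the new file
            continue
        if current_hunk is None:
            continue  # skip "diff --git", "index", and "---" header lines
        if raw.startswith("+"):
            current_hunk.added.append((new_line, raw[1:]))
            new_line += 1
        elif raw.startswith("-") or raw.startswith("\\"):
            continue  # removed lines and "\ No newline" markers do not advance
        else:
            new_line += 1  # context line
    return files
```

A review rule can then report `file`, `hunk`, and `line` for each added line without re-reading the raw diff.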

## Hackathon Track

DiffSense is entered in the Backyard AI track: a practical tool for developers that solves a real daily problem.

Prize/badge targets:

- Best Use of Codex: Codex is being used as an active build partner and will be credited in commits.
- Best Agent: the product is structured as a review pipeline: parse, classify, review, summarize, render.
- Off Brand: the app uses a custom Gradio interface instead of the default chat UI.
- Best Demo: the workflow is easy to show in under two minutes with a real risky diff.
- Best MiniCPM Build: MiniCPM-V 4.6 is integrated for optional image/diagram context.
- Nemotron Hardware Prize: Nemotron 3 Nano is integrated for optional agentic routing.
- Best Use of Modal: the app includes a provider bridge for a Modal-hosted review endpoint via `DIFFSENSE_MODAL_ENDPOINT`.
- Tiny Titan: a <=4B Nemotron 3 Nano checker is integrated as a separate optional pass.

## Planned Model Stack

All planned models are under the Build Small 32B parameter cap.

| Role | Model | Status |
| --- | --- | --- |
| Code review summary | JetBrains Mellum 2 12B Instruct | Optional HF inference hook + `/data` local checkpoint path implemented |
| Provider | Hugging Face Inference API | Optional OAuth-backed summary provider |
| Agentic routing | NVIDIA Nemotron 3 Nano 30B-A3B | Optional HF inference hook + `/data` local checkpoint path implemented |
| Tiny checker | NVIDIA Nemotron 3 Nano 4B | Optional HF inference hook + `/data` local checkpoint path implemented |
| Visual PR context | OpenBMB MiniCPM-V 4.6 | Optional image upload + provider/local checkpoint readiness implemented |
| Runtime | Modal | Optional provider bridge via `DIFFSENSE_MODAL_ENDPOINT` implemented |

The current app intentionally keeps a deterministic fallback so the demo remains reliable even if a hosted model endpoint is cold, rate-limited, or unavailable.
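
The fallback behaviour described above can be sketched roughly as follows. This is an assumption about the shape of the logic, not the app's actual code: `review_with_fallback`, `deterministic_review`, and `model_review` are hypothetical names, and findings are treated as plain dictionaries.

```python
def review_with_fallback(diff_text, deterministic_review, model_review=None):
    """Hypothetical wrapper: always return deterministic findings,
    and add model findings only if the optional pass succeeds."""
    findings = list(deterministic_review(diff_text))
    if model_review is None:
        return findings, "deterministic only"
    try:
        extra = list(model_review(diff_text, findings))
    except Exception as exc:  # cold, rate-limited, or unavailable endpoint
        return findings, f"model pass skipped: {exc}"
    return findings + extra, "deterministic + model"
```

The design choice here is that a failing model pass degrades the status message, never the review itself.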

## Local Checkpoint Layout

The Space is configured with a read/write bucket mounted at `/data`, so model files can be staged without committing checkpoints to the app repo. DiffSense checks these paths at runtime:

```text
/data/models/mellum2-instruct
/data/models/nemotron-3-nano-30b-a3b
/data/models/nemotron-3-nano-4b
/data/models/minicpm-v-4.6
```

Each directory is considered ready when it contains a `config.json`. If a Hugging Face provider does not serve a sponsor model, the app reports the provider limitation cleanly and keeps the deterministic review running.
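
A readiness check along these lines could look like the sketch below. The directory names come from the layout above; the function name `checkpoint_status` and the `root` parameter are hypothetical and only illustrate the `config.json` rule.

```python
from pathlib import Path

# Directory names taken from the checkpoint layout listed above.
CHECKPOINT_NAMES = (
    "mellum2-instruct",
    "nemotron-3-nano-30b-a3b",
    "nemotron-3-nano-4b",
    "minicpm-v-4.6",
)


def checkpoint_status(root="/data/models"):
    """Hypothetical helper: a checkpoint counts as ready when its
    directory contains a config.json file."""
    base = Path(root)
    return {
        name: (base / name / "config.json").is_file()
        for name in CHECKPOINT_NAMES
    }
```

Passing a different `root` makes the same check usable against a local folder when no bucket is mounted.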

## Usage

1. Open the Space.
2. Paste a unified diff, paste a public GitHub PR URL, or click **Load sample diff**.
3. Click **Review diff**.
4. Read the inline comments and copy the structured JSON into your PR workflow.

For public GitHub PRs, paste the PR URL directly. DiffSense fetches the `.diff` version with a short timeout.
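
For readers who want to reproduce the fetch outside the app, a minimal sketch using only the standard library is shown below. The function names, the URL pattern, and the 10-second default timeout are assumptions for illustration; the app's own values may differ.

```python
import re
import urllib.request

# Accepts URLs such as https://github.com/<owner>/<repo>/pull/<number>,
# optionally followed by a path, query string, or fragment.
PR_URL_RE = re.compile(
    r"^https://github\.com/([^/\s]+)/([^/\s]+)/pull/(\d+)(?:[/?#].*)?$"
)


def pr_diff_url(pr_url):
    """Hypothetical helper: map a public PR URL to its `.diff` URL."""
    match = PR_URL_RE.match(pr_url.strip())
    if not match:
        raise ValueError(f"not a GitHub pull request URL: {pr_url!r}")
    owner, repo, number = match.groups()
    return f"https://github.com/{owner}/{repo}/pull/{number}.diff"


def fetch_pr_diff(pr_url, timeout=10.0):
    """Fetch the diff text; the timeout value is an assumed default."""
    with urllib.request.urlopen(pr_diff_url(pr_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Validating the URL before any network call keeps malformed input from turning into an arbitrary outbound request.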

## Output Shape

```json
{
  "file": "src/auth.py",
  "hunk": "@@ -1,9 +1,13 @@",
  "line": 11,
  "severity": "critical",
  "category": "security",
  "comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
  "suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
  "source": "deterministic"
}
```
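
When consuming this JSON in PR automation, a small typed wrapper can catch missing or unexpected fields early. The sketch below mirrors the fields shown above; the `Finding` class and the helper names are hypothetical and not part of DiffSense itself.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    file: str
    hunk: str
    line: int
    severity: str
    category: str
    comment: str
    suggestion: str
    source: str


def parse_finding(raw):
    """Hypothetical helper: raises TypeError if a field is missing or unknown."""
    return Finding(**json.loads(raw))


def format_comment(finding):
    """Render one finding as a single review-comment line."""
    return (
        f"{finding.file}:{finding.line} "
        f"[{finding.severity}/{finding.category}] {finding.comment}"
    )
```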

## Privacy

The deterministic review path runs inside the app process and does not send the pasted diff to any external model. If a public PR URL is pasted, the app fetches its public `.diff` over the network. If an optional hosted model pass is enabled, the diff excerpt and deterministic findings are sent to the selected Hugging Face Inference model using the signed-in user's OAuth token. If a local checkpoint is mounted under `/data/models`, that local path is preferred for text-model passes.
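
The preference order described above (local checkpoint first, hosted model only with a signed-in user, deterministic review otherwise) can be summarised in a few lines. This is a sketch under that reading of the paragraph; `choose_text_provider` and its return labels are hypothetical.

```python
def choose_text_provider(local_ready, oauth_token=None):
    """Hypothetical helper: pick where a text-model pass would run."""
    if local_ready:
        return "local"  # the diff stays on the machine
    if oauth_token:
        return "hf-inference"  # diff excerpt is sent to the hosted model
    return "deterministic"  # no model pass at all
```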

## Local Run

```bash
pip install -r requirements.txt
python app.py
```

Then open `http://localhost:7860`.

## Demo Script

1. Start with the privacy pain: cloud review bots are useful, but private code cannot always leave the machine.
2. Load the sample diff.
3. Show critical findings: hardcoded secret, disabled JWT verification, insecure pickle load, disabled TLS verification.
4. Show the JSON output as a practical artifact for PR automation.
5. Toggle the optional model summary to show the small-model enhancement path.
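
To give a feel for how the critical findings in step 3 might be detected without a model, here is a minimal sketch of regex-based rules over added lines. The patterns and the `scan_added_lines` name are assumptions for illustration; they are not the rules shipped in the app and will miss many real-world variants.

```python
import re

# Illustrative patterns only; real rules would need to be broader.
RULES = [
    ("hardcoded secret",
     re.compile(r"(?i)\b\w*(secret|password|api_key|token)\w*\s*=\s*['\"][^'\"]+['\"]")),
    ("disabled JWT verification",
     re.compile(r"verify_signature['\"]?\s*[:=]\s*False")),
    ("insecure pickle load",
     re.compile(r"\bpickle\.loads?\(")),
    ("disabled TLS verification",
     re.compile(r"\bverify\s*=\s*False")),
]


def scan_added_lines(lines):
    """Hypothetical helper: return (index, label) for each rule hit."""
    hits = []
    for index, text in enumerate(lines):
        for label, pattern in RULES:
            if pattern.search(text):
                hits.append((index, label))
    return hits
```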

## Submission Artifacts

- [Demo video](https://drive.google.com/file/d/1PBLGO10Wg94jX4OmYVDh63fxFcK6j_kp/view?usp=sharing)
- [HF technical paper](HF_TECH_PAPER.md)
- [LinkedIn post draft](LINKEDIN_POST.md)
- [Demo video pitch](DEMO_VIDEO_PITCH.md)

## Social Post Draft

DiffSense is our Build Small hackathon project: a private PR reviewer for teams that cannot send proprietary code to cloud bots.

Paste a diff or a public PR URL and get inline, severity-tagged review comments plus structured JSON. The app is offline-first for pasted diffs, with optional small-model summarization through Mellum 2.

Built with Gradio, Codex, and open-weight model targets under 32B.

#BuildSmall #HuggingFace #Gradio #LocalAI #CodeReview