Spaces:
Runtime error
A newer version of the Gradio SDK is available: 6.19.0
DiffSense Technical Design
Goal
Build a useful, demoable, privacy-first pull request reviewer for the Build Small hackathon. The app must work reliably inside a Gradio Space and stay eligible for the under-32B model constraint.
The implementation is intentionally offline-first: deterministic review rules provide the core value, and small-model inference is an optional enhancement rather than a single point of failure.
Current Shipped Prototype
Unified diff input or public GitHub PR URL
-> stdlib diff parser
-> deterministic review engine
-> structured findings
-> custom Gradio HTML diff viewer
-> optional Mellum 2 summary via HF OAuth
-> optional Nemotron 3 Nano routing via HF OAuth
-> optional Nemotron 3 Nano 4B Tiny Titan check via HF OAuth
-> optional MiniCPM-V 4.6 vision notes via HF OAuth
-> optional local checkpoints from /data/models on ZeroGPU
-> optional Modal bridge via DIFFSENSE_MODAL_ENDPOINT
Components
Gradio UI
File: app.py
- Uses
gr.Blocksinstead of the default chatbot scaffold. - Provides a sample risky diff for a one-click demo.
- Accepts pasted unified diffs and public GitHub PR URLs.
- Renders an inline diff view with file headers, hunk headers, line numbers, severity badges, comments, and suggested fixes.
- Shows structured JSON for automation and judge inspection.
- Exposes model/provider toggles for Mellum, Nemotron, Tiny Titan, MiniCPM-V, and Modal.
- Accepts PR screenshots or diagrams for the MiniCPM-V vision pass.
Diff Parser
The input layer fetches public GitHub PR URLs through their .diff endpoint with a short timeout. Pasted diffs are handled entirely in-process.
The parser handles standard unified diffs:
diff --gitfile boundaries.+++ b/pathfile names.@@ -old,+new @@hunk headers.- Added, removed, and context lines with old/new line numbers.
No external parser is required, which keeps startup fast and dependency risk low.
Review Engine
The deterministic engine checks added lines for high-signal review risks:
- Hardcoded credentials.
- Disabled verification such as TLS or JWT signature checks.
- Unsafe deserialization with
pickle. - Dynamic execution through
evalorexec. shell=Truecommand execution.- SQL string interpolation.
- Bare
except:. - Temporary
TODO,FIXME, orHACKmarkers. - Return-contract changes such as newly introduced
return None.
Each finding includes:
{
"file": "src/auth.py",
"hunk": "@@ -1,9 +1,13 @@",
"line": 11,
"severity": "critical",
"category": "security",
"comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
"suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
"source": "deterministic"
}
Optional Model Summary
When enabled, the app uses the signed-in Hugging Face OAuth token or HF_TOKEN through the Hugging Face Inference API to call:
JetBrains/Mellum2-12B-A2.5B-Instruct
The model is asked to summarize the deterministic findings rather than invent new findings. This keeps the model role narrow, fast, and auditable.
If /data/models/mellum2-instruct/config.json exists, the app prefers that local checkpoint path before calling the hosted provider.
Optional Nemotron Router
When enabled, the app calls:
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Nemotron receives deterministic findings plus a compact diff excerpt and returns a triage plan: merge risk, files to inspect first, and follow-up tests. If the endpoint is unavailable, the app shows a deterministic routing fallback.
If /data/models/nemotron-3-nano-30b-a3b/config.json exists, the app treats the local checkpoint as the preferred runtime path.
Optional Tiny Titan Checker
When enabled, the app calls a <=4B model:
nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
This pass returns a compact sanity check: missed-risk hypothesis, test recommendation, and merge decision. It exists as a separate small-model path for the Tiny Titan badge while keeping the main reviewer reliable.
If /data/models/nemotron-3-nano-4b/config.json exists, the app treats the local checkpoint as the preferred runtime path.
Optional MiniCPM-V Vision Pass
When enabled, uploaded PNG, JPEG, or WebP images are converted to data URLs and sent with the diff context to:
openbmb/MiniCPM-V-4.6
This is intended for PR screenshots, architecture diagrams, and UI diffs. The app limits image payload size and reports endpoint failures visibly instead of blocking the review.
If /data/models/minicpm-v-4.6/config.json exists, the app reports the local MiniCPM-V checkpoint as ready and keeps the image ingestion path available for a custom local loader.
ZeroGPU Bucket Mount
The Space has a read/write bucket mounted at /data. DiffSense checks the following model checkpoint locations at runtime and includes their status in the model-agent trace:
/data/models/mellum2-instruct
/data/models/nemotron-3-nano-30b-a3b
/data/models/nemotron-3-nano-4b
/data/models/minicpm-v-4.6
This keeps the app repo small while making the model integration path explicit for the hackathon badges. Hosted provider failures are converted into concise status notes rather than raw request errors.
Optional Modal Bridge
When DIFFSENSE_MODAL_ENDPOINT is configured, the app can POST the deterministic findings and compact diff context to a Modal-hosted review endpoint. Without that secret, the UI reports that the bridge is ready but not configured.
Hackathon Fit
Required criteria:
- Under 32B: Mellum, Nemotron 3 Nano 30B-A3B, Nemotron 3 Nano 4B, and MiniCPM-V 4.6 are all within the hackathon model-size constraint.
- Gradio app: implemented in
app.py. - README tags: included in
README.mdfront matter. - Demo-friendly: built-in sample diff produces multiple clear findings without setup.
Prize positioning:
- Backyard AI: practical developer workflow.
- Best Use of Codex: Codex is actively building and shaping the repo.
- Best Agent: staged review pipeline with parsing, classification, review, and summary.
- Off Brand: custom HTML diff UI instead of stock chat.
- Best Demo: one-click sample with visible before/after review value.
- Best MiniCPM Build: MiniCPM-V 4.6 image/diagram context path is implemented.
- Nemotron Hardware Prize: Nemotron 3 Nano routing path is implemented.
- Best Use of Modal: Modal endpoint bridge is implemented and controlled through a Space secret.
- Tiny Titan: Nemotron 3 Nano 4B checker path is implemented.
Planned Extensions
These should only be added after the current app is deployed and recorded:
- Add a hosted Modal endpoint and set
DIFFSENSE_MODAL_ENDPOINT. - Add downloadable
.patchfiles for suggested fixes. - Add richer multimodal demo assets for the MiniCPM-V path.
Risk Controls
- The app remains useful without model availability.
- Dependencies are limited to Gradio and
huggingface_hub. - No pasted diff is sent externally unless the user explicitly enables the model summary.
- Public PR URLs are fetched as public
.diffdocuments; private code should be pasted only when the model summary is off. - The sample diff demonstrates value even during GPU/API outages.
- Model/provider failures are rendered as agent trace notes rather than hard app failures.