---
title: DiffSense
emoji: 🔎
colorFrom: gray
colorTo: yellow
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_scopes:
  - inference-api
license: mit
short_description: Private PR review for local AI teams.
tags:
  - build-small
  - gradio
  - code-review
  - local-ai
  - backyard-ai
  - best-use-of-codex
  - best-agent
  - off-brand
  - best-demo
  - best-minicpm-build
  - nemotron-hardware-prize
  - best-use-of-modal
  - tiny-titan
models:
  - JetBrains/Mellum2-12B-A2.5B-Instruct
  - nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
  - nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
  - openbmb/MiniCPM-V-4.6
---

# DiffSense

Private, offline-first pull request review for teams that cannot send proprietary code to cloud review bots.

Paste a unified diff or a public GitHub PR URL and DiffSense returns severity-tagged findings, inline comments, and structured JSON that can be copied into a PR review. The prototype works without a GPU by using deterministic review rules, then optionally adds Mellum, Nemotron, MiniCPM-V, and Modal provider passes when credentials or endpoints are available.

## Why We Built It

Code review is one of the highest-leverage daily engineering workflows, but most AI reviewers require sending private code to a hosted SaaS. That is a deal-breaker for teams working with customer data, internal APIs, security-sensitive systems, or unreleased products.

DiffSense is the small-model version of that workflow: useful immediately, inspectable, and designed so the core review loop can run locally.

## What Works Now

- Unified diff parser with file and hunk awareness.
- Inline custom diff viewer built in Gradio.
- Deterministic review findings for security, logic, maintainability, and test risks.
- Public GitHub PR URL fetching through the PR `.diff` endpoint.
- Optional Nemotron 3 Nano routing/triage pass.
- Optional Nemotron 3 Nano 4B checker pass (the Tiny Titan entry).
- Optional MiniCPM-V 4.6 vision pass for PR screenshots, architecture diagrams, and UI diffs.
- Optional Modal bridge through `DIFFSENSE_MODAL_ENDPOINT`.
- Structured JSON output with file, hunk, line, severity, category, comment, and suggestion.
- Optional model-assisted summary using `JetBrains/Mellum2-12B-A2.5B-Instruct` through the Hugging Face Inference API when OAuth is available, or a local checkpoint when mounted under `/data`.
- ZeroGPU/bucket-aware model runtime status for local checkpoints mounted from the `build-small-hackathon/DiffSense` bucket.

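
To illustrate what file and hunk awareness means in practice, here is a minimal sketch of a unified diff parser. It is not the code in `app.py`: the names `AddedLine` and `parse_added_lines` are hypothetical, and the sketch assumes standard `+++ b/<path>` file headers and `@@ -a,b +c,d @@` hunk headers. It uses the line counts in each hunk header to tell hunk body lines apart from headers.

```python
import re
from dataclasses import dataclass

# Hunk header, e.g. "@@ -1,9 +1,13 @@"; a missing count defaults to 1.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


@dataclass
class AddedLine:  # hypothetical record type for this sketch
    file: str   # path taken from the "+++ b/<path>" header
    hunk: str   # hunk header the line belongs to
    line: int   # line number in the new version of the file
    text: str   # line content without the leading "+"


def parse_added_lines(diff_text: str) -> list:
    """Return every added line with its file, hunk, and new-file line number."""
    added = []
    current_file = ""
    hunk = ""
    old_left = new_left = 0  # body lines still expected in the current hunk
    new_line = 0
    for raw in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk body: classify the line by its first character.
            if raw.startswith("+"):
                added.append(AddedLine(current_file, hunk, new_line, raw[1:]))
                new_line += 1
                new_left -= 1
            elif raw.startswith("-"):
                old_left -= 1
            elif raw.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                # Context line: present in both old and new versions.
                new_line += 1
                old_left -= 1
                new_left -= 1
            continue
        if raw.startswith("+++ "):
            path = raw[4:].strip()
            current_file = path[2:] if path.startswith("b/") else path
            continue
        match = HUNK_RE.match(raw)
        if match:
            hunk = match.group(0)
            old_left = int(match.group(2) or 1)
            new_left = int(match.group(4) or 1)
            new_line = int(match.group(3))
    return added
```

Tracking the remaining line counts avoids misreading an added line whose content starts with `++ ` as a new file header.
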
## Hackathon Track

DiffSense is entered in the Backyard AI track: a practical tool for developers that solves a real daily problem.

Prize/badge targets:

- Best Use of Codex: Codex is being used as an active build partner and will be credited in commits.
- Best Agent: the product is structured as a review pipeline: parse, classify, review, summarize, render.
- Off Brand: the app uses a custom Gradio interface instead of the default chat UI.
- Best Demo: the workflow is easy to show in under two minutes with a real risky diff.
- Best MiniCPM Build: MiniCPM-V 4.6 is integrated for optional image/diagram context.
- Nemotron Hardware Prize: Nemotron 3 Nano is integrated for optional agentic routing.
- Best Use of Modal: the app includes a provider bridge for a Modal-hosted review endpoint via `DIFFSENSE_MODAL_ENDPOINT`.
- Tiny Titan: a <=4B Nemotron 3 Nano checker is integrated as a separate optional pass.

## Planned Model Stack

All planned models are under the Build Small 32B parameter cap.

| Role | Model / service | Status |
| --- | --- | --- |
| Code review summary | JetBrains Mellum 2 12B Instruct | Optional HF inference hook + `/data` local checkpoint path implemented |
| Provider | Hugging Face Inference API | Optional OAuth-backed summary provider |
| Agentic routing | NVIDIA Nemotron 3 Nano | Optional HF inference hook + `/data` local checkpoint path implemented |
| Tiny checker | NVIDIA Nemotron 3 Nano 4B | Optional HF inference hook + `/data` local checkpoint path implemented |
| Visual PR context | OpenBMB MiniCPM-V 4.6 | Optional image upload + provider/local checkpoint readiness implemented |
| Runtime | Modal | Optional provider bridge via `DIFFSENSE_MODAL_ENDPOINT` implemented |

The current app intentionally keeps a deterministic fallback so the demo remains reliable even if a hosted model endpoint is cold, rate-limited, or unavailable.

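
The fallback behavior can be pictured with a small sketch. This is an assumption about the general pattern, not the app's actual code: `review_with_fallback`, `deterministic_review`, and `model_review` are hypothetical names, and findings are treated as plain lists.

```python
from typing import Callable, Optional


def review_with_fallback(
    diff_text: str,
    deterministic_review: Callable,
    model_review: Optional[Callable] = None,
):
    """Always run the rule-based pass; add the model pass only if it succeeds."""
    findings = deterministic_review(diff_text)
    status = "deterministic only"
    if model_review is not None:
        try:
            # The optional pass sees the diff and the rule-based findings.
            findings = findings + model_review(diff_text, findings)
            status = "model pass added"
        except Exception as exc:
            # Cold, rate-limited, or unreachable endpoint: keep the
            # deterministic findings and report why the pass was skipped.
            status = f"model pass skipped: {exc}"
    return findings, status
```

Because the deterministic findings are computed first and never discarded, a failing provider changes only the status message, not the baseline review.
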
## Local Checkpoint Layout

The Space is configured with a read/write bucket mounted at `/data`, so model files can be staged without committing checkpoints to the app repo. DiffSense checks these paths at runtime:

```text
/data/models/mellum2-instruct
/data/models/nemotron-3-nano-30b-a3b
/data/models/nemotron-3-nano-4b
/data/models/minicpm-v-4.6
```

Each directory is considered ready when it contains a `config.json`. If a Hugging Face provider does not serve a sponsor model, the app reports the provider limitation cleanly and keeps the deterministic review running.

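
The readiness rule above can be sketched as follows. The directory names and the `config.json` check come from this README; the dictionary keys, `CHECKPOINT_DIRS`, and `checkpoint_status` are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Directories listed in this README, relative to the mounted bucket root.
# The keys are hypothetical labels, not identifiers used by the app.
CHECKPOINT_DIRS = {
    "mellum": "models/mellum2-instruct",
    "nemotron_router": "models/nemotron-3-nano-30b-a3b",
    "nemotron_checker": "models/nemotron-3-nano-4b",
    "minicpm_v": "models/minicpm-v-4.6",
}


def checkpoint_status(root: str = "/data") -> dict:
    """Map each checkpoint label to True when its directory has a config.json."""
    base = Path(root)
    return {
        name: (base / rel / "config.json").is_file()
        for name, rel in CHECKPOINT_DIRS.items()
    }
```

Calling `checkpoint_status()` on a machine without the bucket mounted simply reports every checkpoint as not ready.
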
## Usage

1. Open the Space.
2. Paste a unified diff, paste a public GitHub PR URL, or click **Load sample diff**.
3. Click **Review diff**.
4. Read the inline comments and copy the structured JSON into your PR workflow.

For public GitHub PRs, paste the PR URL directly. DiffSense fetches the `.diff` version with a short timeout.

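
As a rough sketch of the PR fetch, GitHub serves a plain-text diff when `.diff` is appended to a pull request URL. The helper names `pr_diff_url` and `fetch_pr_diff` are hypothetical, and the 5-second timeout is an assumed value standing in for the "short timeout" mentioned above.

```python
import re
import urllib.request

# Accepts PR URLs with optional trailing paths such as "/files" or "/commits".
PR_URL_RE = re.compile(
    r"^https://github\.com/([\w.-]+)/([\w.-]+)/pull/(\d+)(?:\.diff)?(?:[/?#].*)?$"
)


def pr_diff_url(pr_url: str) -> str:
    """Turn a public GitHub PR URL into the URL of its .diff rendering."""
    match = PR_URL_RE.match(pr_url.strip())
    if match is None:
        raise ValueError(f"Not a GitHub pull request URL: {pr_url!r}")
    owner, repo, number = match.groups()
    return f"https://github.com/{owner}/{repo}/pull/{number}.diff"


def fetch_pr_diff(pr_url: str, timeout: float = 5.0) -> str:
    """Download the diff text; the timeout value is an assumption."""
    with urllib.request.urlopen(pr_diff_url(pr_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Validating the URL before any request keeps the fetch limited to GitHub pull request pages rather than arbitrary hosts.
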
## Output Shape

```json
{
  "file": "src/auth.py",
  "hunk": "@@ -1,9 +1,13 @@",
  "line": 11,
  "severity": "critical",
  "category": "security",
  "comment": "The change disables a verification check, which can turn a trusted boundary into a bypass.",
  "suggestion": "Keep verification enabled and add a narrowly scoped test fixture for local development.",
  "source": "deterministic"
}
```

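
For consumers that want typed access to this shape, a minimal sketch is shown below. The field names mirror the JSON example above; the `Finding` class and `findings_to_json` helper are hypothetical and not part of the app's public interface.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:  # hypothetical container mirroring the JSON example
    file: str
    hunk: str
    line: int
    severity: str
    category: str
    comment: str
    suggestion: str
    source: str = "deterministic"


def findings_to_json(findings: list) -> str:
    """Serialize findings to a JSON array, keeping the field order shown above."""
    return json.dumps([asdict(f) for f in findings], indent=2)
```

The serialized array can be parsed back with `json.loads` by any PR automation that expects one object per finding.
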
## Privacy

The deterministic review path runs inside the app process and does not send the pasted diff to any external model. If a public PR URL is pasted, the app fetches its public `.diff` over the network. If an optional hosted model pass is enabled, the diff excerpt and deterministic findings are sent to the selected Hugging Face Inference model using the signed-in user's OAuth token. If a local checkpoint is mounted under `/data/models`, that local path is preferred for text-model passes.

## Local Run

```bash
pip install -r requirements.txt
python app.py
```

Then open `http://localhost:7860`.

## Demo Script

1. Start with the privacy pain: cloud review bots are useful, but private code cannot always leave the machine.
2. Load the sample diff.
3. Show critical findings: hardcoded secret, disabled JWT verification, insecure pickle load, disabled TLS verification.
4. Show the JSON output as a practical artifact for PR automation.
5. Toggle the optional model summary to show the small-model enhancement path.

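
The four critical findings in step 3 are the kind of issue that simple pattern rules can catch without a model. The sketch below is illustrative only: the regular expressions, the `RULES` table, and `scan_added_lines` are assumptions, not the rules shipped in `app.py`, and real rules would need more care to limit false positives.

```python
import re

# Hypothetical rule table: (label, pattern) pairs applied to added lines.
RULES = [
    (
        "hardcoded secret",
        re.compile(
            r"\b\w*(?:secret|password|api_key|token)\w*\s*=\s*[\"'][^\"']{4,}[\"']",
            re.IGNORECASE,
        ),
    ),
    # PyJWT-style options dict with signature verification switched off.
    ("disabled JWT verification", re.compile(r"verify_signature[\"']?\s*:\s*False")),
    ("insecure pickle load", re.compile(r"\bpickle\.loads?\(")),
    # requests-style keyword argument that turns off certificate checks.
    ("disabled TLS verification", re.compile(r"\bverify\s*=\s*False\b")),
]


def scan_added_lines(added_lines: list) -> list:
    """Return (line_number, label) for each rule that matches an added line.

    `added_lines` is a list of (line_number, text) pairs.
    """
    hits = []
    for line_number, text in added_lines:
        for label, pattern in RULES:
            if pattern.search(text):
                hits.append((line_number, label))
    return hits
```

Scanning only added lines keeps findings tied to what the PR introduces rather than to code that already existed.
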
## Submission Artifacts

- [Demo video](https://drive.google.com/file/d/1PBLGO10Wg94jX4OmYVDh63fxFcK6j_kp/view?usp=sharing)
- [HF technical paper](HF_TECH_PAPER.md)
- [LinkedIn post draft](LINKEDIN_POST.md)
- [Demo video pitch](DEMO_VIDEO_PITCH.md)

## Social Post Draft

DiffSense is our Build Small hackathon project: a private PR reviewer for teams that cannot send proprietary code to cloud bots.

Paste a diff or public PR URL, get inline severity-tagged review comments and structured JSON. The app works offline first for pasted diffs, with optional small-model summarization through Mellum 2.

Built with Gradio, Codex, and open-weight model targets under 32B.

#BuildSmall #HuggingFace #Gradio #LocalAI #CodeReview