Spaces:
Running on Zero
Running on Zero
A newer version of the Gradio SDK is available: 6.19.0
metadata
title: CommitLens
emoji: π
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.18.0
python_version: '3.12'
app_file: app.py
pinned: true
license: mit
short_description: urn any Git commit into a human-readable engineering report.
tags:
- track:backyard
- sponsor:nvidia
- achievement:offgrid
- achievement:offbrand
- achievement:sharing
- achievement:fieldnotes
CommitLens β AI-Powered Code Review Pipeline
https://huggingface.co/pkheria
CommitLens is a high-performance information extraction and analysis pipeline that transforms raw GitHub diffs into structured, human-readable engineering reports. It uses a hybrid LLM approach: JetBrains Mellum 2 for deep per-file analysis and Groq-hosted Llama 3.3 for lightning-fast synthesis.
π Resources & Links
- Demo Video: Watch the Product Demo
- Social Post : Linkedin Post
π Key Features
- Automated Diff Extraction: Fetches the two latest commits from any GitHub repository and generates semantic diffs.
- Top-Impact Filtering: Automatically identifies and prioritizes the most significant changes (top 2 files by lines changed) to ensure high-signal reviews.
- Hybrid LLM Pipeline:
- Mellum 2 (12B): Performs surgical, per-file code analysis. Optimized with 6-bit NF4 quantization for efficient GPU utilization.
- Groq (Llama 3.3 70B): Generates a high-level executive summary and key takeaways in milliseconds.
- Cinematic UI: A bespoke, low-latency frontend featuring a custom particle engine, real-time status tracking, and a "git-graph" hero visualization.
π Tech Stack
- Core: Python 3.12, FastAPI, Gradio (Server Mode).
- ML/Inference:
transformers,bitsandbytes(4-bit/6-bit quantization),torch,spaces(ZeroGPU). - APIs: GitHub REST API, Groq Cloud API.
- Frontend: Vanilla JavaScript (ES6+), HTML5 Canvas, CSS3 Grid/Flexbox.
π Project Structure
| File | Purpose |
|---|---|
app.py |
Main application server; manages model lifecycle and GPU/API orchestration. |
commitlens.py |
Data pipeline; handles GitHub API interaction, file filtering, and prompt engineering. |
index.html |
Custom-built, high-fidelity frontend with interactive Git visualizations. |
requirements.txt |
Dependency manifest (requests, gradio, torch, transformers, etc.). |
βοΈ How It Works
- Extraction: The
GitHubClientfetches commit metadata and raw patches. - Filtering: Files are filtered by extension (keeping source code, ignoring binaries/locks) and sorted by impact.
- Mellum Analysis: The pipeline builds structured prompts containing "Before", "After", and "Diff" blocks. Mellum 2 generates concise summaries for each file.
- Groq Synthesis: Per-file summaries are batched and sent to Groq for a final structured Markdown report including a "Commit Overview" and "Key Takeaways".
π Setup & Usage
Local Development
Install dependencies:
pip install -r requirements.txtSet Environment Variables:
export GROQ_API_KEY="your_groq_api_key"Run the application:
python app.py
CLI Mode
You can also run the extraction pipeline directly:
python commitlens.py <github_repo_url> --token <optional_pat> --print-prompts
π License
MIT