Spaces:
Running
Running
File size: 14,870 Bytes
7b4f5dd | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 | ---
title: CodeSentry
emoji: π‘οΈ
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# π‘οΈ CodeSentry
> **CodeSentry** is an enterprise-grade, agentic AI security and performance copilot designed to seamlessly analyze codebases, identify critical vulnerabilities, and generate intelligent, ready-to-merge patches β with built-in CUDA β ROCm migration guidance for AMD hardware.
Built with a strict **Zero Data Retention (ZDR)** architecture, CodeSentry ensures that your proprietary code never leaves your secure environment or gets used for model training, making it perfect for highly sensitive, enterprise-scale environments.
---
## β¨ Key Features
- **π§ Agentic Pipeline:** CodeSentry uses a multi-agent orchestration architecture:
- **Security Agent:** Combines lightning-fast static analysis with deep semantic LLM reasoning to catch complex vulnerabilities (e.g., prompt injections, hardcoded secrets, unsafe deserialization).
- **Performance Agent:** Specifically tailored to analyze ML/AI logic. It detects GPU memory bottlenecks, inefficient loop structures, and suggests hardware-native optimizations (like `bfloat16` for AMD MI300X).
- **Fix Agent:** Automatically generates unified Git-style diffs and line-by-line patch recommendations for every finding.
- **AMD Migration Advisor:** Scans for 10 categories of CUDA-specific patterns (nvidia-smi, CUDA_VISIBLE_DEVICES, BitsAndBytes, cuDNN, FP16 usage, etc.) and provides actionable ROCm/HIP migration guidance with a 0β100 AMD Compatibility Score.
- **β‘ AMD MI300X Live Metrics:** Real-time GPU performance monitoring (utilization, VRAM, temperature, power draw, inference speed) streamed to the dashboard during every scan via SSE. Uses `rocm-smi` on AMD hardware, with simulated fallback for development environments.
- **π Zero Data Retention (ZDR):** Every analysis session generates a unique cryptographic Privacy Certificate. The backend actively blocks outgoing network calls during the scan and wipes all data from memory the millisecond the scan completes.
- **β‘ Real-Time Streaming:** The analysis engine uses Server-Sent Events (SSE) to stream findings to the frontend instantaneously, creating a highly responsive "live" dashboard experience.
- **π One-Click Reporting:** Export full `SECURITY_REPORT.md` documents, structured JSON audit logs, copy-paste ready GitHub Pull Request descriptions, and `AMD_MIGRATION_GUIDE.md` reports.
---
## ποΈ System Architecture
```
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β CODESENTRY FRONTEND β
β React + Vite | Cyberpunk Terminal Aesthetic β
β LandingPage β AnalysisView (SSE Live Feed) β ReportView β
β βββββββββββββββββββββ ββββββββββββββββββββββββββ β
β β AMD MI300X Live β β AMD Migration Advisor β β
β β Metrics Card β β Panel + Score Circle β β
β βββββββββββββββββββββ ββββββββββββββββββββββββββ β
βββββββββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββ
β SSE (Server-Sent Events) + REST
βββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββ
β CODESENTRY BACKEND β
β FastAPI / Python β
β β
β βββββββββββββββ ββββββββββββββββββββ ββββββββββββββββββββββ β
β β Security β β Performance β β Fix Agent β β
β β Agent β β Agent β β (patches + diffs) β β
β ββββββββ¬βββββββ ββββββββββ¬ββββββββββ ββββββββββ¬ββββββββββββ β
β β βββββββββΌβββββββββ β β
β β β AMD Migration β β β
β β β Advisor (10 β β β
β β β CUDA patterns) β β β
β β βββββββββ¬βββββββββ β β
β βββββββββββββββββββΊβββββββββββββββββββββββ β
β ββββββββΌβββββββ β
β β Orchestratorβ β
β ββββββββ¬βββββββ β
β β β
β ββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββ β
β β Privacy Guard β Session Store β AMD Metrics β Code Parser β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β β
β ββββββββΌβββββββ β
β β vLLM Serverβ (Qwen2.5-Coder-32B) β
β βββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
```
The project is divided into two main components:
### 1. The Backend (`/codesentry-backend`)
A high-performance **FastAPI** server that acts as the orchestrator.
- Ingests code via GitHub URLs, Hugging Face Spaces URLs, Zip files, or raw code snippets.
- Manages the stateful analysis session and memory lifecycle.
- Runs **AMD MI300X live metrics polling** via `rocm-smi` (with simulated fallback for dev environments).
- Runs the **AMD Migration Advisor** to detect CUDA-specific patterns and calculate an AMD Compatibility Score.
- Connects to an LLM endpoint (optimized for local deployment via `vLLM` on AMD hardware, using Qwen2.5-Coder-32B) to power the intelligent agents.
### 2. The Frontend (`/codesentry-frontend`)
A modern **React + Vite** dashboard built with a premium, cyberpunk-inspired terminal aesthetic.
- Connects to the backend via SSE for live streaming.
- Features the **AMD MI300X Live Performance Card** in the Analysis View β 6 GPU metrics updated every 2 seconds.
- Features the **AMD ROCm Migration Advisor Panel** in the Report View β animated score circle, collapsible findings, and one-click `AMD_MIGRATION_GUIDE.md` export.
- Dynamic data visualization, animated severity charts, and side-by-side Before/After code diffing for AI-generated fixes.
---
## π΄ AMD-Specific Features
### Live Hardware Metrics (Analysis View)
During every scan, CodeSentry polls the AMD MI300X GPU via `rocm-smi` and streams live metrics to the dashboard:
| Metric | Description |
|--------|-------------|
| GPU Utilization | Current compute load (%) |
| VRAM Used | GB used / 192 GB total with visual bar |
| Memory Bandwidth | TB/s data throughput |
| Temperature | GPU edge temperature (Β°C) |
| Power Draw | Current wattage consumption (W) |
| Inference Speed | LLM tokens per second |
> On development machines without AMD hardware, the card displays realistic simulated values.
### CUDA β ROCm Migration Advisor (Report View)
The Migration Advisor scans code for 10 categories of CUDA-specific patterns:
| ID | Severity | What It Detects |
|----|----------|-----------------|
| AMD_M01 | Low | `torch.cuda.is_available()` β CUDA device check |
| AMD_M02 | **Critical** | `nvidia-smi` β NVIDIA-only CLI tool |
| AMD_M03 | High | `CUDA_VISIBLE_DEVICES` β CUDA env variable |
| AMD_M04 | High | `torch.cuda.amp.autocast/GradScaler` β Legacy CUDA AMP |
| AMD_M05 | Medium | `.half()` / `torch.float16` β FP16 suboptimal on MI300X |
| AMD_M06 | Medium | `torch.backends.cudnn.*` β cuDNN configuration |
| AMD_M07 | High | `import flash_attn` β CUDA-only Flash Attention |
| AMD_M08 | Low | `torch.cuda.memory_allocated()` β CUDA memory profiling |
| AMD_M09 | Low | `device = 'cuda'` β Hardcoded device string |
| AMD_M10 | **Critical** | `BitsAndBytesConfig` β CUDA-only quantization |
**Compatibility Scoring:**
```
β₯ 90% β "Fully ROCm Ready" (green)
β₯ 70% β "Mostly Compatible" (yellow)
β₯ 50% β "Needs Migration Work" (orange)
< 50% β "CUDA-Specific Codebase" (red)
```
---
## π‘ How It Works (An Example Workflow)
To understand CodeSentry, imagine you have a Python scraping script that takes user input and feeds it into an LLM.
1. **Initiate Scan:** You paste the GitHub or Hugging Face Space URL of the script into the CodeSentry dashboard.
2. **Live GPU Monitoring:** The AMD MI300X Live Performance card immediately starts showing real-time GPU utilization, VRAM usage, temperature, and inference speed.
3. **Security Sweep:** The Security Agent immediately flags `cli.py:61` for a **Prompt Injection** (CWE-74) vulnerability because it detects raw user input being passed to the model without sanitization.
4. **Performance Sweep:** The Performance Agent notices the code is loading a large transformer model inside a loop. It flags this and estimates you are wasting significant inference time.
5. **AMD Migration Scan:** The Migration Advisor detects `nvidia-smi` calls and `CUDA_VISIBLE_DEVICES` usage, calculating an AMD Compatibility Score and suggesting `rocm-smi` and `HIP_VISIBLE_DEVICES` replacements.
6. **Fix Generation:** The Fix Agent takes these findings and writes a patch. It refactors the prompt injection to use a parameterized template and hoists the model initialization outside the loop.
7. **Review:** You view the dashboard. The findings are categorized by severity. You click on the Prompt Injection finding, and an AI-Generated Fix panel opens showing exactly what lines to change. The AMD Migration Panel shows your compatibility score with collapsible fix guidance.
8. **Export:** You click "Copy PR Description" and paste a perfectly formatted summary of the fixes directly into your GitHub Pull Request. You also export the `AMD_MIGRATION_GUIDE.md` for your DevOps team.
---
## π Installation & Setup
### Prerequisites
- Node.js (v20.19+ or v22.12+)
- Python (v3.10+)
- An API Key for your LLM provider (e.g., Groq) if not running a completely local vLLM instance.
### 1. Backend Setup
Open a terminal and navigate to the backend directory:
```bash
cd codesentry-backend
# Create and activate a virtual environment
python -m venv venv
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Configure Environment Variables
# Create a .env file based on the example and add your LLM_API_KEY
cp .env.example .env
# Run the backend server
uvicorn main:app --reload --port 8000
```
*The backend will now be running on `http://127.0.0.1:8000`.*
### 2. Frontend Setup
Open a second terminal and navigate to the frontend directory:
```bash
cd codesentry-frontend
# Install dependencies
npm install
# Ensure VITE_MOCK_MODE is set to false to connect to the live backend
echo "VITE_MOCK_MODE=false" > .env
# Run the development server
npm run dev
```
*The dashboard will be available at `http://127.0.0.1:5173`.*
---
## βοΈ Environment Variables
| Variable | Default | Description |
|---|---|---|
| `VLLM_BASE_URL` | `http://localhost:8080/v1` | vLLM OpenAI-compatible endpoint |
| `MODEL_NAME` | `Qwen/Qwen2.5-Coder-32B-Instruct` | Model served by vLLM |
| `USE_LLM` | `true` | Set `false` for static-only mode (CI) |
| `PORT` | `8000` | CodeSentry API port |
| `CORS_ORIGINS` | `*` | Allowed frontend origins |
| `ZDR_SIGNING_KEY` | (dev default) | HMAC key for certificates β **change in production** |
| `GROQ_API_KEY` | β | Groq cloud API key (alternative to local vLLM) |
| `VITE_MOCK_MODE` | `false` | Frontend: use mock data instead of live backend |
| `VITE_API_URL` | `http://localhost:8000` | Frontend: backend base URL |
---
## π SSE Event Types
| Event | Description |
|-------|-------------|
| `scan_started` | Scan session created, ID returned |
| `agent_start` | An agent begins (security / performance / fix) |
| `finding` | A security or performance vulnerability found |
| `fix_ready` | A fix patch generated for a specific finding |
| `amd_metrics` | Live AMD MI300X GPU metrics snapshot (every 2s) |
| `amd_migration_finding` | A CUDA β ROCm migration issue detected |
| `amd_migration_summary` | Compatibility score and summary |
| `complete` | Full analysis finished with summary + certificates |
| `error` | An error occurred during analysis |
---
## π¦ Export Formats
| Format | Description |
|--------|-------------|
| π **JSON Report** | Machine-readable full report with all findings and fixes |
| π **SECURITY_REPORT.md** | Human-readable markdown security report |
| π **Copy PR Description** | GitHub Pull Request description copied to clipboard |
| π΄ **AMD_MIGRATION_GUIDE.md** | AMD ROCm migration guide with score, findings, and fixes |
---
## π Built for the AMD Hackathon
CodeSentry was specifically designed to showcase the power of **Agentic AI** running on high-performance AMD MI300X compute hardware. By combining a suite of specialized agents with real-time GPU monitoring and CUDA β ROCm migration guidance, we shift the paradigm of static code analysis from "reporting problems" to "actively writing solutions."
**Zero Data Retention. 100% Agentic. AMD-Optimized. Enterprise Ready.**
|