File size: 14,870 Bytes
7b4f5dd
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
---
title: CodeSentry
emoji: πŸ›‘οΈ
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---

# πŸ›‘οΈ CodeSentry

> **CodeSentry** is an enterprise-grade, agentic AI security and performance copilot designed to seamlessly analyze codebases, identify critical vulnerabilities, and generate intelligent, ready-to-merge patches β€” with built-in CUDA β†’ ROCm migration guidance for AMD hardware.

Built with a strict **Zero Data Retention (ZDR)** architecture, CodeSentry ensures that your proprietary code never leaves your secure environment or gets used for model training, making it perfect for highly sensitive, enterprise-scale environments.

---

## ✨ Key Features

- **🧠 Agentic Pipeline:** CodeSentry uses a multi-agent orchestration architecture:
  - **Security Agent:** Combines lightning-fast static analysis with deep semantic LLM reasoning to catch complex vulnerabilities (e.g., prompt injections, hardcoded secrets, unsafe deserialization).
  - **Performance Agent:** Specifically tailored to analyze ML/AI logic. It detects GPU memory bottlenecks, inefficient loop structures, and suggests hardware-native optimizations (like `bfloat16` for AMD MI300X).
  - **Fix Agent:** Automatically generates unified Git-style diffs and line-by-line patch recommendations for every finding.
  - **AMD Migration Advisor:** Scans for 10 categories of CUDA-specific patterns (nvidia-smi, CUDA_VISIBLE_DEVICES, BitsAndBytes, cuDNN, FP16 usage, etc.) and provides actionable ROCm/HIP migration guidance with a 0–100 AMD Compatibility Score.
- **⚑ AMD MI300X Live Metrics:** Real-time GPU performance monitoring (utilization, VRAM, temperature, power draw, inference speed) streamed to the dashboard during every scan via SSE. Uses `rocm-smi` on AMD hardware, with simulated fallback for development environments.
- **πŸ”’ Zero Data Retention (ZDR):** Every analysis session generates a unique cryptographic Privacy Certificate. The backend actively blocks outgoing network calls during the scan and wipes all data from memory the millisecond the scan completes.
- **⚑ Real-Time Streaming:** The analysis engine uses Server-Sent Events (SSE) to stream findings to the frontend instantaneously, creating a highly responsive "live" dashboard experience.
- **πŸ“‹ One-Click Reporting:** Export full `SECURITY_REPORT.md` documents, structured JSON audit logs, copy-paste ready GitHub Pull Request descriptions, and `AMD_MIGRATION_GUIDE.md` reports.

---

## πŸ—οΈ System Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                      CODESENTRY FRONTEND                         β”‚
β”‚           React + Vite | Cyberpunk Terminal Aesthetic            β”‚
β”‚  LandingPage β†’ AnalysisView (SSE Live Feed) β†’ ReportView        β”‚
β”‚         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”‚
β”‚         β”‚ AMD MI300X Live   β”‚  β”‚ AMD Migration Advisor  β”‚       β”‚
β”‚         β”‚ Metrics Card      β”‚  β”‚ Panel + Score Circle   β”‚       β”‚
β”‚         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚  SSE (Server-Sent Events) + REST
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                       CODESENTRY BACKEND                         β”‚
β”‚                        FastAPI / Python                          β”‚
β”‚                                                                  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚  Security   β”‚  β”‚  Performance     β”‚  β”‚    Fix Agent       β”‚  β”‚
β”‚  β”‚  Agent      β”‚  β”‚  Agent           β”‚  β”‚ (patches + diffs)  β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚         β”‚          β”Œβ”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”            β”‚              β”‚
β”‚         β”‚          β”‚ AMD Migration  β”‚            β”‚              β”‚
β”‚         β”‚          β”‚ Advisor (10    β”‚            β”‚              β”‚
β”‚         β”‚          β”‚ CUDA patterns) β”‚            β”‚              β”‚
β”‚         β”‚          β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜            β”‚              β”‚
β”‚         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Ίβ”‚β—„β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β”‚
β”‚                     β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”                              β”‚
β”‚                     β”‚ Orchestratorβ”‚                              β”‚
β”‚                     β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜                              β”‚
β”‚                            β”‚                                     β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚ Privacy Guard β”‚ Session Store β”‚ AMD Metrics β”‚ Code Parser β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚                            β”‚                                     β”‚
β”‚                     β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”                              β”‚
β”‚                     β”‚  vLLM Serverβ”‚ (Qwen2.5-Coder-32B)         β”‚
β”‚                     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

The project is divided into two main components:

### 1. The Backend (`/codesentry-backend`)
A high-performance **FastAPI** server that acts as the orchestrator. 
- Ingests code via GitHub URLs, Hugging Face Spaces URLs, Zip files, or raw code snippets.
- Manages the stateful analysis session and memory lifecycle.
- Runs **AMD MI300X live metrics polling** via `rocm-smi` (with simulated fallback for dev environments).
- Runs the **AMD Migration Advisor** to detect CUDA-specific patterns and calculate an AMD Compatibility Score.
- Connects to an LLM endpoint (optimized for local deployment via `vLLM` on AMD hardware, using Qwen2.5-Coder-32B) to power the intelligent agents.

### 2. The Frontend (`/codesentry-frontend`)
A modern **React + Vite** dashboard built with a premium, cyberpunk-inspired terminal aesthetic.
- Connects to the backend via SSE for live streaming.
- Features the **AMD MI300X Live Performance Card** in the Analysis View β€” 6 GPU metrics updated every 2 seconds.
- Features the **AMD ROCm Migration Advisor Panel** in the Report View β€” animated score circle, collapsible findings, and one-click `AMD_MIGRATION_GUIDE.md` export.
- Dynamic data visualization, animated severity charts, and side-by-side Before/After code diffing for AI-generated fixes.

---

## πŸ”΄ AMD-Specific Features

### Live Hardware Metrics (Analysis View)
During every scan, CodeSentry polls the AMD MI300X GPU via `rocm-smi` and streams live metrics to the dashboard:

| Metric | Description |
|--------|-------------|
| GPU Utilization | Current compute load (%) |
| VRAM Used | GB used / 192 GB total with visual bar |
| Memory Bandwidth | TB/s data throughput |
| Temperature | GPU edge temperature (Β°C) |
| Power Draw | Current wattage consumption (W) |
| Inference Speed | LLM tokens per second |

> On development machines without AMD hardware, the card displays realistic simulated values.

### CUDA β†’ ROCm Migration Advisor (Report View)
The Migration Advisor scans code for 10 categories of CUDA-specific patterns:

| ID | Severity | What It Detects |
|----|----------|-----------------|
| AMD_M01 | Low | `torch.cuda.is_available()` β€” CUDA device check |
| AMD_M02 | **Critical** | `nvidia-smi` β€” NVIDIA-only CLI tool |
| AMD_M03 | High | `CUDA_VISIBLE_DEVICES` β€” CUDA env variable |
| AMD_M04 | High | `torch.cuda.amp.autocast/GradScaler` β€” Legacy CUDA AMP |
| AMD_M05 | Medium | `.half()` / `torch.float16` β€” FP16 suboptimal on MI300X |
| AMD_M06 | Medium | `torch.backends.cudnn.*` β€” cuDNN configuration |
| AMD_M07 | High | `import flash_attn` β€” CUDA-only Flash Attention |
| AMD_M08 | Low | `torch.cuda.memory_allocated()` β€” CUDA memory profiling |
| AMD_M09 | Low | `device = 'cuda'` β€” Hardcoded device string |
| AMD_M10 | **Critical** | `BitsAndBytesConfig` β€” CUDA-only quantization |

**Compatibility Scoring:**
```
β‰₯ 90% β†’ "Fully ROCm Ready" (green)
β‰₯ 70% β†’ "Mostly Compatible" (yellow)  
β‰₯ 50% β†’ "Needs Migration Work" (orange)
< 50% β†’ "CUDA-Specific Codebase" (red)
```

---

## πŸ’‘ How It Works (An Example Workflow)

To understand CodeSentry, imagine you have a Python scraping script that takes user input and feeds it into an LLM.

1. **Initiate Scan:** You paste the GitHub or Hugging Face Space URL of the script into the CodeSentry dashboard.
2. **Live GPU Monitoring:** The AMD MI300X Live Performance card immediately starts showing real-time GPU utilization, VRAM usage, temperature, and inference speed.
3. **Security Sweep:** The Security Agent immediately flags `cli.py:61` for a **Prompt Injection** (CWE-74) vulnerability because it detects raw user input being passed to the model without sanitization.
4. **Performance Sweep:** The Performance Agent notices the code is loading a large transformer model inside a loop. It flags this and estimates you are wasting significant inference time.
5. **AMD Migration Scan:** The Migration Advisor detects `nvidia-smi` calls and `CUDA_VISIBLE_DEVICES` usage, calculating an AMD Compatibility Score and suggesting `rocm-smi` and `HIP_VISIBLE_DEVICES` replacements.
6. **Fix Generation:** The Fix Agent takes these findings and writes a patch. It refactors the prompt injection to use a parameterized template and hoists the model initialization outside the loop.
7. **Review:** You view the dashboard. The findings are categorized by severity. You click on the Prompt Injection finding, and an AI-Generated Fix panel opens showing exactly what lines to change. The AMD Migration Panel shows your compatibility score with collapsible fix guidance.
8. **Export:** You click "Copy PR Description" and paste a perfectly formatted summary of the fixes directly into your GitHub Pull Request. You also export the `AMD_MIGRATION_GUIDE.md` for your DevOps team.

---

## πŸš€ Installation & Setup

### Prerequisites
- Node.js (v20.19+ or v22.12+)
- Python (v3.10+)
- An API Key for your LLM provider (e.g., Groq) if not running a completely local vLLM instance.

### 1. Backend Setup

Open a terminal and navigate to the backend directory:

```bash
cd codesentry-backend

# Create and activate a virtual environment
python -m venv venv
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure Environment Variables
# Create a .env file based on the example and add your LLM_API_KEY
cp .env.example .env

# Run the backend server
uvicorn main:app --reload --port 8000
```
*The backend will now be running on `http://127.0.0.1:8000`.*

### 2. Frontend Setup

Open a second terminal and navigate to the frontend directory:

```bash
cd codesentry-frontend

# Install dependencies
npm install

# Ensure VITE_MOCK_MODE is set to false to connect to the live backend
echo "VITE_MOCK_MODE=false" > .env

# Run the development server
npm run dev
```
*The dashboard will be available at `http://127.0.0.1:5173`.*

---

## βš™οΈ Environment Variables

| Variable | Default | Description |
|---|---|---|
| `VLLM_BASE_URL` | `http://localhost:8080/v1` | vLLM OpenAI-compatible endpoint |
| `MODEL_NAME` | `Qwen/Qwen2.5-Coder-32B-Instruct` | Model served by vLLM |
| `USE_LLM` | `true` | Set `false` for static-only mode (CI) |
| `PORT` | `8000` | CodeSentry API port |
| `CORS_ORIGINS` | `*` | Allowed frontend origins |
| `ZDR_SIGNING_KEY` | (dev default) | HMAC key for certificates β€” **change in production** |
| `GROQ_API_KEY` | β€” | Groq cloud API key (alternative to local vLLM) |
| `VITE_MOCK_MODE` | `false` | Frontend: use mock data instead of live backend |
| `VITE_API_URL` | `http://localhost:8000` | Frontend: backend base URL |

---

## πŸ“Š SSE Event Types

| Event | Description |
|-------|-------------|
| `scan_started` | Scan session created, ID returned |
| `agent_start` | An agent begins (security / performance / fix) |
| `finding` | A security or performance vulnerability found |
| `fix_ready` | A fix patch generated for a specific finding |
| `amd_metrics` | Live AMD MI300X GPU metrics snapshot (every 2s) |
| `amd_migration_finding` | A CUDA β†’ ROCm migration issue detected |
| `amd_migration_summary` | Compatibility score and summary |
| `complete` | Full analysis finished with summary + certificates |
| `error` | An error occurred during analysis |

---

## πŸ“¦ Export Formats

| Format | Description |
|--------|-------------|
| πŸ“„ **JSON Report** | Machine-readable full report with all findings and fixes |
| πŸ“ **SECURITY_REPORT.md** | Human-readable markdown security report |
| πŸ“‹ **Copy PR Description** | GitHub Pull Request description copied to clipboard |
| πŸ”΄ **AMD_MIGRATION_GUIDE.md** | AMD ROCm migration guide with score, findings, and fixes |

---

## πŸ” Built for the AMD Hackathon

CodeSentry was specifically designed to showcase the power of **Agentic AI** running on high-performance AMD MI300X compute hardware. By combining a suite of specialized agents with real-time GPU monitoring and CUDA β†’ ROCm migration guidance, we shift the paradigm of static code analysis from "reporting problems" to "actively writing solutions."

**Zero Data Retention. 100% Agentic. AMD-Optimized. Enterprise Ready.**