---
title: Code Security Risk Analyzer
emoji: π
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: 6.13.0
app_file: app.py
pinned: true
license: apache-2.0
tags:
- security
- vulnerability-detection
- owasp
- cwe
- code-analysis
- static-analysis
short_description: AI-powered code vulnerability detection with OWASP mapping
---
# Code Security Risk Analyzer v2
AI-powered multi-label vulnerability detection across **30 CWE categories** mapped to **OWASP Top 10 2021**. Supports Python, JavaScript, Java, C, C++, PHP, and Go.
## v2 Improvements
- **Per-class threshold optimization:** each CWE has its own tuned detection threshold instead of a global 0.3
- **Temperature-calibrated probabilities:** confidence scores are meaningful (a score of 0.8 corresponds to roughly an 80% true positive rate)
- **CWE-aware fix generation:** the fixer model is told *which* vulnerability to fix
- **3.7x larger fixer model:** CodeT5+ 220M (was flan-t5-small 60M)
- **Asymmetric Loss training:** handles the class imbalance in which about 90% of samples are safe
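The first two items can be illustrated with a minimal sketch: logits are divided by a temperature before the sigmoid, and each CWE is then compared against its own threshold. The logits, thresholds, and temperature below are hypothetical values chosen for illustration, not the ones shipped with the classifier.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def calibrated_probs(logits: dict, temperature: float) -> dict:
    """Temperature scaling: divide each logit by T, then apply the sigmoid."""
    return {label: sigmoid(z / temperature) for label, z in logits.items()}


def predict_labels(logits: dict, thresholds: dict, temperature: float = 1.0) -> list:
    """Keep every label whose calibrated probability reaches its own threshold."""
    probs = calibrated_probs(logits, temperature)
    return [label for label, p in probs.items() if p >= thresholds[label]]


# Hypothetical values, for illustration only.
example_logits = {"CWE-79": 2.0, "CWE-89": -0.5, "CWE-22": 0.4}
example_thresholds = {"CWE-79": 0.5, "CWE-89": 0.25, "CWE-22": 0.6}
flagged = predict_labels(example_logits, example_thresholds, temperature=1.5)
```

With a single global threshold of 0.3, all three labels in this toy input would be flagged; the per-class thresholds drop `CWE-22` while keeping the lower-scoring `CWE-89`.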
## Model Performance
| Model | Metric | Score |
|-------|--------|-------|
| **Classifier** (GraphCodeBERT 125M) | Macro F1 | **0.476** (+311% vs baseline) |
| | Weighted F1 | **0.945** |
| | Safe Detection F1 | **0.982** |
| **Fixer** (CodeT5+ 220M) | BLEU | **81.0** |
| | ROUGE-L | **0.788** |
| | Eval Loss | **0.175** (3.1x better than v1) |
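The gap between Macro F1 and Weighted F1 follows from how the two averages are defined: macro F1 weights every class equally, while weighted F1 weights each class by its support, so a dominant safe class pulls it up. A small sketch with made-up counts (not the evaluation data behind the table) shows the effect:

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(counts: dict) -> tuple:
    """counts maps label -> (tp, fp, fn); support of a label is tp + fn."""
    scores = {label: f1(*c) for label, c in counts.items()}
    supports = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
    macro = sum(scores.values()) / len(scores)
    weighted = sum(scores[l] * supports[l] for l in scores) / sum(supports.values())
    return macro, weighted


# Made-up counts: one dominant "safe" class and two rare CWE classes.
toy_counts = {
    "safe": (880, 20, 20),
    "rare_cwe_a": (20, 30, 30),
    "rare_cwe_b": (10, 40, 40),
}
macro_f1, weighted_f1 = macro_and_weighted_f1(toy_counts)
```

In this toy case the weighted score is dominated by the safe class, while the macro score reflects the weaker rare classes.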
## Features
- **Detection Model:** [GraphCodeBERT classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier), 125M params, two-phase training with ASL loss
- **Fix Generator:** [CodeT5+ 220M](https://huggingface.co/ayshajavd/codet5p-vuln-fixer), CWE-aware input format, beam search generation
- **Structured Reports:** CWE ID, OWASP category, severity score, exploit likelihood, plain English explanation
- **Attack Chain Analysis:** Multi-vulnerability chaining analysis
- **REST API:** JSON endpoint for integration into CI/CD pipelines
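For readers unfamiliar with the ASL loss mentioned for the detection model, here is a minimal pure-Python sketch of Asymmetric Loss for multi-label targets. It down-weights easy negatives with a focusing exponent and a probability margin. The hyperparameters (`gamma_pos`, `gamma_neg`, `margin`) are commonly used defaults and are an assumption here; the values used to train this classifier are not stated in this README.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses.

    probs:   predicted probabilities, one per label
    targets: 0/1 ground-truth values, one per label
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: focal-style weighting with gamma_pos.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: shift the probability down by the margin so that
            # very easy negatives contribute nothing, then apply gamma_neg.
            p_shift = max(p - margin, 0.0)
            total += -(p_shift ** gamma_neg) * math.log(max(1.0 - p_shift, eps))
    return total


easy_negative = asymmetric_loss([0.04], [0])   # below the margin
hard_negative = asymmetric_loss([0.90], [0])   # confident false positive
```

Because most labels are negative for most samples, suppressing easy negatives keeps them from swamping the gradient signal from the rare positive labels.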
## API Usage
```python
from gradio_client import Client
client = Client("ayshajavd/code-security-analyzer")
# Get markdown report
report = client.predict(code="your code here", api_name="/analyze")
# Get structured JSON report
json_report = client.predict(code="your code here", api_name="/get_json_report")
```
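For CI/CD use, the JSON report can be turned into a pass/fail gate. The sketch below assumes a hypothetical report shape with a `findings` list whose entries carry `cwe_id` and `severity` keys; these names are illustrative only, so check the actual output of `/get_json_report` before relying on them.

```python
import json


def findings_over_threshold(report, max_severity: float) -> list:
    """Return findings whose severity exceeds max_severity.

    `report` may be a JSON string or an already-parsed dict.
    The "findings" and "severity" keys are hypothetical.
    """
    data = json.loads(report) if isinstance(report, str) else report
    return [
        finding for finding in data.get("findings", [])
        if finding.get("severity", 0) > max_severity
    ]


# Hypothetical report, standing in for the endpoint's real output.
sample_report = json.dumps({
    "findings": [
        {"cwe_id": "CWE-89", "severity": 9.1},
        {"cwe_id": "CWE-327", "severity": 4.2},
    ]
})
blocking = findings_over_threshold(sample_report, max_severity=7.0)
```

A pipeline step could then exit with a non-zero status whenever `blocking` is non-empty.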
## Models & Dataset
- [graphcodebert-vuln-classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier): Multi-label CWE detection
- [codet5p-vuln-fixer](https://huggingface.co/ayshajavd/codet5p-vuln-fixer): Vulnerability fix generation
- [code-security-vulnerability-dataset](https://huggingface.co/datasets/ayshajavd/code-security-vulnerability-dataset): 175K labeled samples
## Training Notebooks
All training code: [vuln-classifier-training-notebooks](https://huggingface.co/ayshajavd/vuln-classifier-training-notebooks)