---
title: Code Security Risk Analyzer
emoji: π
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: 6.13.0
app_file: app.py
pinned: true
license: apache-2.0
tags:
- security
- vulnerability-detection
- owasp
- cwe
- code-analysis
- static-analysis
short_description: AI-powered code vulnerability detection with OWASP mapping
---

# Code Security Risk Analyzer v2

AI-powered multi-label vulnerability detection across **30 CWE categories** mapped to the **OWASP Top 10 2021**. Supports Python, JavaScript, Java, C, C++, PHP, and Go.

## v2 Improvements
- **Per-class threshold optimization**: each CWE has its own optimized detection threshold instead of a single global threshold of 0.3
- **Temperature-calibrated probabilities**: confidence scores are meaningful (a score of 0.8 corresponds to an 80% true positive rate)
- **CWE-aware fix generation**: the fixer model is told *which* vulnerability to fix
- **3.7x larger fixer model**: CodeT5+ 220M (previously flan-t5-small 60M)
- **Asymmetric Loss training**: handles the class imbalance caused by 90% of samples being safe

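
The first two items can be pictured with a small sketch. This is a minimal illustration, not the Space's actual inference code: it assumes a single scalar temperature, one sigmoid output per CWE label, and a plain dict of per-class thresholds that falls back to the old global 0.3. The function names, CWE labels, and all numbers are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def detect(logits, labels, thresholds, temperature=1.0, default_threshold=0.3):
    """Return {label: calibrated probability} for every label that fires.

    Assumption: calibration is plain temperature scaling, i.e. each logit is
    divided by one shared temperature before the sigmoid.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    detections = {}
    for label, logit in zip(labels, logits):
        prob = sigmoid(logit / temperature)
        # Each class is compared against its own threshold, not a global one.
        if prob >= thresholds.get(label, default_threshold):
            detections[label] = prob
    return detections


# Hypothetical values for illustration only.
example = detect(
    logits=[2.0, -0.5, 0.1],
    labels=["CWE-79", "CWE-89", "CWE-22"],
    thresholds={"CWE-79": 0.6, "CWE-89": 0.25, "CWE-22": 0.55},
    temperature=1.5,
)
print(sorted(example))
```

With a global threshold, the second and third labels would be treated identically; with per-class thresholds, a class that the model systematically under-scores can still fire while a noisy class is held to a stricter bar.
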
## Model Performance

| Model | Metric | Score |
|-------|--------|-------|
| **Classifier** (GraphCodeBERT 125M) | Macro F1 | **0.476** (+311% vs baseline) |
| | Weighted F1 | **0.945** |
| | Safe Detection F1 | **0.982** |
| **Fixer** (CodeT5+ 220M) | BLEU | **81.0** |
| | ROUGE-L | **0.788** |
| | Eval Loss | **0.175** (3.1x better than v1) |

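
The gap between macro and weighted F1 in the table is what one would expect when one class dominates: macro F1 averages every class equally, while weighted F1 weights each class by its support. The sketch below shows the arithmetic on made-up counts; the class names and numbers are toy values, not this project's evaluation data.

```python
def f1(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(per_class):
    """per_class maps label -> (tp, fp, fn); support is tp + fn."""
    scores = {label: f1(*counts) for label, counts in per_class.items()}
    support = {label: c[0] + c[2] for label, c in per_class.items()}
    total = sum(support.values())
    macro = sum(scores.values()) / len(scores)
    weighted = sum(scores[label] * support[label] for label in scores) / total
    return macro, weighted


# Toy counts: one dominant, easy class and two rare, harder classes.
toy_counts = {
    "safe": (900, 20, 10),
    "CWE-79": (30, 20, 30),
    "CWE-89": (5, 10, 25),
}
macro, weighted = macro_and_weighted_f1(toy_counts)
print(f"macro={macro:.3f} weighted={weighted:.3f}")
```

Because the rare classes pull the macro average down while barely affecting the weighted one, macro F1 is the more informative number for judging per-CWE detection quality.
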
## Features
- **Detection Model:** [GraphCodeBERT classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier), 125M params, two-phase training with Asymmetric Loss (ASL)
- **Fix Generator:** [CodeT5+ 220M](https://huggingface.co/ayshajavd/codet5p-vuln-fixer), CWE-aware input format, beam search generation
- **Structured Reports:** CWE ID, OWASP category, severity score, exploit likelihood, plain-English explanation
- **Attack Chain Analysis:** analysis of how multiple detected vulnerabilities could be chained together
- **REST API:** JSON endpoint for integration into CI/CD pipelines

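
For readers unfamiliar with the Asymmetric Loss used to train the detection model, here is a dependency-free sketch of the per-sample computation. It follows the published ASL formulation (separate focusing exponents for positives and negatives, plus a probability margin that zeroes out easy negatives). The hyperparameter values and the choice to average over labels are illustrative assumptions, not settings confirmed for this project.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Mean Asymmetric Loss over one sample's per-label probabilities.

    probs:   predicted probability per label (after the sigmoid)
    targets: 1 if the label applies, else 0
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: focal-style weighting with exponent gamma_pos.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: shift the probability down by the margin so
            # very easy negatives contribute exactly zero loss.
            p_shifted = max(p - margin, 0.0)
            total += -(p_shifted ** gamma_neg) * math.log(max(1.0 - p_shifted, eps))
    return total / len(probs)


# A confidently wrong negative costs far more than an easy one.
print(asymmetric_loss([0.9], [0]) > asymmetric_loss([0.1], [0]))
```

The larger negative exponent is what keeps the abundant safe samples from dominating the gradient, which is the imbalance problem the training setup is meant to address.
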
## API Usage

```python
from gradio_client import Client

# Connect to the hosted Space
client = Client("ayshajavd/code-security-analyzer")

# Get markdown report
report = client.predict(code="your code here", api_name="/analyze")

# Get structured JSON report
json_report = client.predict(code="your code here", api_name="/get_json_report")
```

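
For CI/CD use, the same endpoint can be called once per source file. The sketch below is one possible shape for that, under stated assumptions: `collect_reports` and `space_predictor` are hypothetical helper names, and nothing is assumed about the structure of the returned JSON, which is stored as-is. The predictor is passed in as a callable so the file-handling logic can be exercised without network access.

```python
from pathlib import Path


def collect_reports(paths, predict):
    """Return {path: report} by calling predict(source_text) for each file."""
    reports = {}
    for path in paths:
        source = Path(path).read_text(encoding="utf-8")
        reports[str(path)] = predict(source)
    return reports


def space_predictor(space_id="ayshajavd/code-security-analyzer"):
    """Build a predict callable backed by the Space's /get_json_report endpoint."""
    # Imported lazily so collect_reports itself has no third-party dependency.
    from gradio_client import Client

    client = Client(space_id)
    return lambda source: client.predict(code=source, api_name="/get_json_report")
```

A pipeline step could then run `collect_reports(["app.py"], space_predictor())` and decide how to act on the reports once their JSON structure has been inspected.
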
## Models & Dataset
- [graphcodebert-vuln-classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier): multi-label CWE detection
- [codet5p-vuln-fixer](https://huggingface.co/ayshajavd/codet5p-vuln-fixer): vulnerability fix generation
- [code-security-vulnerability-dataset](https://huggingface.co/datasets/ayshajavd/code-security-vulnerability-dataset): 175K labeled samples

## Training Notebooks
All training code is in [vuln-classifier-training-notebooks](https://huggingface.co/ayshajavd/vuln-classifier-training-notebooks).