Spaces: ayshajavd/code-security-analyzer

ayshajavd committed 22 days ago

Commit 4aeba64 (verified) · 1 Parent(s): 7336b37

Update README with v2 features and metrics

Files changed (1)

README.md +33 -7

@@ -18,13 +18,31 @@ tags:
 short_description: AI-powered code vulnerability detection with OWASP mapping
 ---
-# 🔒 Code Security Risk Analyzer
 AI-powered multi-label vulnerability detection across **30 CWE categories** mapped to **OWASP Top 10 2021**. Supports Python, JavaScript, Java, C, C++, PHP, and Go.
 ## Features
-- **Detection Model:** [GraphCodeBERT classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier) trained on 175K+ labeled code samples
-- **Fix Generator:** [CodeT5+](https://huggingface.co/ayshajavd/codet5p-vuln-fixer) fine-tuned to suggest secure code replacements
 - **Structured Reports:** CWE ID, OWASP category, severity score, exploit likelihood, plain English explanation
 - **Attack Chain Analysis:** Multi-vulnerability chaining analysis
 - **REST API:** JSON endpoint for integration into CI/CD pipelines
@@ -35,10 +53,18 @@ AI-powered multi-label vulnerability detection across **30 CWE categories** mapp
 from gradio_client import Client
 client = Client("ayshajavd/code-security-analyzer")
-report = client.predict(code="your code here", api_name="/get_json_report")
 ```
 ## Models & Dataset
-- [graphcodebert-vuln-classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier)
-- [codet5p-vuln-fixer](https://huggingface.co/ayshajavd/codet5p-vuln-fixer)
-- [code-security-vulnerability-dataset](https://huggingface.co/datasets/ayshajavd/code-security-vulnerability-dataset)

 short_description: AI-powered code vulnerability detection with OWASP mapping
 ---
+# 🔒 Code Security Risk Analyzer v2
 AI-powered multi-label vulnerability detection across **30 CWE categories** mapped to **OWASP Top 10 2021**. Supports Python, JavaScript, Java, C, C++, PHP, and Go.
+## v2 Improvements
+- **Per-class threshold optimization** — each CWE has its own optimal detection threshold (not global 0.3)
+- **Temperature-calibrated probabilities** — confidence scores are meaningful (0.8 ≈ 80% true positive rate)
+- **CWE-aware fix generation** — fixer model knows *what* vulnerability to fix
+- **3.7x larger fixer model** — CodeT5+ 220M (was flan-t5-small 60M)
+- **Asymmetric Loss training** — handles 90% safe class imbalance
+## Model Performance
+| Model | Metric | Score |
+|-------|--------|-------|
+| **Classifier** (GraphCodeBERT 125M) | Macro F1 | **0.476** (+311% vs baseline) |
+| | Weighted F1 | **0.945** |
+| | Safe Detection F1 | **0.982** |
+| **Fixer** (CodeT5+ 220M) | BLEU | **81.0** |
+| | ROUGE-L | **0.788** |
+| | Eval Loss | **0.175** (3.1x better than v1) |
 ## Features
+- **Detection Model:** [GraphCodeBERT classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier) — 125M params, two-phase training with ASL loss
+- **Fix Generator:** [CodeT5+ 220M](https://huggingface.co/ayshajavd/codet5p-vuln-fixer) — CWE-aware input format, beam search generation
 - **Structured Reports:** CWE ID, OWASP category, severity score, exploit likelihood, plain English explanation
 - **Attack Chain Analysis:** Multi-vulnerability chaining analysis
 - **REST API:** JSON endpoint for integration into CI/CD pipelines
 from gradio_client import Client
 client = Client("ayshajavd/code-security-analyzer")
+# Get markdown report
+report = client.predict(code="your code here", api_name="/analyze")
+# Get structured JSON report
+json_report = client.predict(code="your code here", api_name="/get_json_report")
 ```
 ## Models & Dataset
+- [graphcodebert-vuln-classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier) — Multi-label CWE detection
+- [codet5p-vuln-fixer](https://huggingface.co/ayshajavd/codet5p-vuln-fixer) — Vulnerability fix generation
+- [code-security-vulnerability-dataset](https://huggingface.co/datasets/ayshajavd/code-security-vulnerability-dataset) — 175K labeled samples
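
The v2 notes mention temperature-calibrated probabilities and per-class thresholds in place of a global 0.3 cutoff. The following is a minimal sketch of how those two steps could combine at inference time for a multi-label classifier. It is not the Space's actual code: the function names, the temperature value, the logits, and the threshold values are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function used for independent (multi-label) class scores."""
    return 1.0 / (1.0 + math.exp(-x))


def calibrated_probabilities(logits, temperature):
    """Temperature scaling: divide each logit by T before applying the sigmoid."""
    return [sigmoid(z / temperature) for z in logits]


def detect(logits, labels, thresholds, temperature=1.0, default_threshold=0.3):
    """Return (label, probability) pairs whose probability clears that label's threshold.

    Labels without an entry in `thresholds` fall back to `default_threshold`.
    """
    probs = calibrated_probabilities(logits, temperature)
    findings = []
    for label, p in zip(labels, probs):
        if p >= thresholds.get(label, default_threshold):
            findings.append((label, p))
    return findings


# Hypothetical inputs: three CWE labels, raw logits, and made-up thresholds.
labels = ["CWE-79", "CWE-89", "CWE-22"]
logits = [2.0, -0.5, 0.4]
thresholds = {"CWE-79": 0.6, "CWE-89": 0.25, "CWE-22": 0.7}

findings = detect(logits, labels, thresholds, temperature=1.5)
for label, p in findings:
    print(f"{label}: {p:.3f}")
```

With these made-up numbers, a single global threshold of 0.3 would flag all three labels, while the per-class thresholds drop CWE-22 because its calibrated score stays below its stricter cutoff.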
+## Training Notebooks
+All training code: [vuln-classifier-training-notebooks](https://huggingface.co/ayshajavd/vuln-classifier-training-notebooks)
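
For readers unfamiliar with the Asymmetric Loss mentioned in the v2 notes, here is a small sketch of the general idea in plain Python: negatives are down-weighted more aggressively than positives, and very easy negatives are ignored through a probability margin. The hyperparameter values (`gamma_pos`, `gamma_neg`, `margin`) are illustrative assumptions, not the settings used in the linked notebooks, and the function is a simplified per-sample version rather than the training code itself.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=1.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Sum of asymmetric loss terms over per-label probabilities.

    probs   -- predicted probabilities in [0, 1], one per label
    targets -- 1 for a present label, 0 for an absent label
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: focal-style weighting with a mild exponent.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: shift the probability down by the margin so that
            # easy negatives (p <= margin) contribute nothing, then apply a
            # stronger focusing exponent to the rest.
            p_shifted = max(p - margin, 0.0)
            total += -(p_shifted ** gamma_neg) * math.log(max(1.0 - p_shifted, eps))
    return total


# Hypothetical example: one confident false alarm, one easy negative, one positive.
print(asymmetric_loss([0.9, 0.03, 0.5], [0, 0, 1]))
```

In a dataset where most samples are safe, the bulk of the labels are easy negatives, so zeroing or shrinking their contribution keeps them from dominating the gradient.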