ayshajavd committed on
Commit
4aeba64
·
verified ·
1 Parent(s): 7336b37

Update README with v2 features and metrics

  1. README.md +33 -7
README.md CHANGED
@@ -18,13 +18,31 @@ tags:
18
  short_description: AI-powered code vulnerability detection with OWASP mapping
19
  ---
20
 
21
- # 🔒 Code Security Risk Analyzer
22
 
23
  AI-powered multi-label vulnerability detection across **30 CWE categories** mapped to **OWASP Top 10 2021**. Supports Python, JavaScript, Java, C, C++, PHP, and Go.
24
 
25
  ## Features
26
- - **Detection Model:** [GraphCodeBERT classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier) trained on 175K+ labeled code samples
27
- - **Fix Generator:** [CodeT5+](https://huggingface.co/ayshajavd/codet5p-vuln-fixer) fine-tuned to suggest secure code replacements
28
  - **Structured Reports:** CWE ID, OWASP category, severity score, exploit likelihood, plain English explanation
29
  - **Attack Chain Analysis:** Multi-vulnerability chaining analysis
30
  - **REST API:** JSON endpoint for integration into CI/CD pipelines
@@ -35,10 +53,18 @@ AI-powered multi-label vulnerability detection across **30 CWE categories** mapp
35
  from gradio_client import Client
36
 
37
  client = Client("ayshajavd/code-security-analyzer")
38
- report = client.predict(code="your code here", api_name="/get_json_report")
39
  ```
40
 
41
  ## Models & Dataset
42
- - [graphcodebert-vuln-classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier)
43
- - [codet5p-vuln-fixer](https://huggingface.co/ayshajavd/codet5p-vuln-fixer)
44
- - [code-security-vulnerability-dataset](https://huggingface.co/datasets/ayshajavd/code-security-vulnerability-dataset)
18
  short_description: AI-powered code vulnerability detection with OWASP mapping
19
  ---
20
 
21
+ # 🔒 Code Security Risk Analyzer v2
22
 
23
  AI-powered multi-label vulnerability detection across **30 CWE categories** mapped to **OWASP Top 10 2021**. Supports Python, JavaScript, Java, C, C++, PHP, and Go.
24
 
25
+ ## v2 Improvements
26
+ - **Per-class threshold optimization:** each CWE has its own tuned detection threshold instead of a single global 0.3
27
+ - **Temperature-calibrated probabilities:** confidence scores are meaningful (a score of 0.8 ≈ 80% of such predictions being correct)
28
+ - **CWE-aware fix generation:** the fixer model is told *which* vulnerability to fix
29
+ - **3.7x larger fixer model:** CodeT5+ 220M (was flan-t5-small 60M)
30
+ - **Asymmetric Loss training:** handles the class imbalance caused by roughly 90% of samples being safe
31
+
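The first two improvements can be illustrated together. The sketch below is not the Space's actual inference code: it assumes a multi-label classifier that emits one logit per CWE, divides each logit by a single temperature before the sigmoid, and then compares each probability against that label's own threshold. The function names, the temperature, the thresholds, and the logits are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def calibrated_probs(logits, temperature):
    """Temperature scaling: divide each logit by T before the sigmoid."""
    return [sigmoid(z / temperature) for z in logits]


def predict_labels(logits, thresholds, temperature=1.0):
    """Keep the labels whose calibrated probability meets that label's threshold."""
    probs = calibrated_probs(logits, temperature)
    return [label for label, p in zip(thresholds, probs) if p >= thresholds[label]]


# Hypothetical per-class thresholds and raw logits, for illustration only.
THRESHOLDS = {"CWE-79": 0.45, "CWE-89": 0.25, "CWE-22": 0.60}
logits = [0.4, -0.8, 0.2]

flagged = predict_labels(logits, THRESHOLDS, temperature=1.5)
print(flagged)  # ['CWE-79', 'CWE-89']
```

With a single global threshold of 0.3, all three labels in this toy input would be flagged; the per-class threshold for CWE-22 suppresses the weakest one.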
32
+ ## Model Performance
33
+
34
+ | Model | Metric | Score |
35
+ |-------|--------|-------|
36
+ | **Classifier** (GraphCodeBERT 125M) | Macro F1 | **0.476** (+311% vs baseline) |
37
+ | | Weighted F1 | **0.945** |
38
+ | | Safe Detection F1 | **0.982** |
39
+ | **Fixer** (CodeT5+ 220M) | BLEU | **81.0** |
40
+ | | ROUGE-L | **0.788** |
41
+ | | Eval Loss | **0.175** (3.1x lower than v1) |
42
+
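The gap between Macro F1 and Weighted F1 in the table is what one would expect when a dominant safe class scores well and rare CWE classes score poorly. The sketch below uses made-up confusion counts (not the model's real evaluation data) to show how the two averages diverge: macro F1 weights every class equally, while weighted F1 weights each class by its support.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(counts):
    """counts maps label -> (tp, fp, fn); a label's support is tp + fn."""
    scores = {label: f1(*c) for label, c in counts.items()}
    supports = {label: c[0] + c[2] for label, c in counts.items()}
    macro = sum(scores.values()) / len(scores)
    weighted = sum(scores[l] * supports[l] for l in scores) / sum(supports.values())
    return macro, weighted


# Hypothetical counts: one large, easy class and two small, hard ones.
COUNTS = {
    "safe": (890, 20, 10),
    "CWE-79": (30, 20, 30),
    "CWE-89": (10, 15, 30),
}

macro, weighted = macro_and_weighted_f1(COUNTS)
print(f"macro={macro:.3f} weighted={weighted:.3f}")
```

In this toy data the weighted score is dominated by the safe class, so it stays high even though the two rare classes pull the macro score down.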
43
  ## Features
44
+ - **Detection Model:** [GraphCodeBERT classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier): 125M parameters, two-phase training with Asymmetric Loss (ASL)
45
+ - **Fix Generator:** [CodeT5+ 220M](https://huggingface.co/ayshajavd/codet5p-vuln-fixer): CWE-aware input format, beam search generation
46
  - **Structured Reports:** CWE ID, OWASP category, severity score, exploit likelihood, plain English explanation
47
  - **Attack Chain Analysis:** Multi-vulnerability chaining analysis
48
  - **REST API:** JSON endpoint for integration into CI/CD pipelines
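
For readers unfamiliar with the Asymmetric Loss named in the Detection Model entry, here is a minimal pure-Python sketch of the general technique for a single sample. It is not the training code from the notebooks, and the hyperparameter values are illustrative assumptions rather than the ones used for this model. The idea is that negatives are down-weighted more aggressively than positives, and very easy negatives are discarded through a probability margin.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Asymmetric loss for one multi-label sample.

    probs:   per-label sigmoid probabilities
    targets: per-label 0/1 ground truth
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: focal-style weighting with a small exponent.
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Shift the probability so negatives below the margin cost nothing.
            p_m = max(p - margin, 0.0)
            total -= p_m ** gamma_neg * math.log(max(1.0 - p_m, eps))
    return total


# An easy negative is ignored, while a confident false positive is penalized.
print(asymmetric_loss([0.04], [0]))
print(asymmetric_loss([0.90], [0]))
```

Because most labels are negative for most samples in an imbalanced multi-label dataset, suppressing the easy negatives keeps them from swamping the gradient contributed by the rare positive labels.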
 
53
  from gradio_client import Client
54
 
55
  client = Client("ayshajavd/code-security-analyzer")
56
+
57
+ # Get markdown report
58
+ report = client.predict(code="your code here", api_name="/analyze")
59
+
60
+ # Get structured JSON report
61
+ json_report = client.predict(code="your code here", api_name="/get_json_report")
62
  ```
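
One way the JSON report could feed a CI/CD gate is sketched below. The report schema is not shown in this README, so the `vulnerabilities` and `severity` keys, the helper name `should_fail_build`, and the 7.0 cutoff are all hypothetical; adjust them to whatever `/get_json_report` actually returns.

```python
import json


def should_fail_build(report, min_severity=7.0):
    """Return True if any finding meets the severity cutoff.

    Accepts either a JSON string or an already-parsed dict.
    The key names used here are assumptions, not a documented schema.
    """
    data = json.loads(report) if isinstance(report, str) else report
    findings = data.get("vulnerabilities", [])  # hypothetical key
    return any(f.get("severity", 0.0) >= min_severity for f in findings)


# Hypothetical report payload, for illustration only.
sample = '{"vulnerabilities": [{"cwe": "CWE-89", "severity": 9.1}]}'
print(should_fail_build(sample))
```

In a pipeline, the result of `client.predict(..., api_name="/get_json_report")` would be passed to the helper and a non-zero exit code returned when it yields `True`.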
63
 
64
  ## Models & Dataset
65
+ - [graphcodebert-vuln-classifier](https://huggingface.co/ayshajavd/graphcodebert-vuln-classifier): multi-label CWE detection
66
+ - [codet5p-vuln-fixer](https://huggingface.co/ayshajavd/codet5p-vuln-fixer): vulnerability fix generation
67
+ - [code-security-vulnerability-dataset](https://huggingface.co/datasets/ayshajavd/code-security-vulnerability-dataset): 175K labeled samples
68
+
69
+ ## Training Notebooks
70
+ All training code: [vuln-classifier-training-notebooks](https://huggingface.co/ayshajavd/vuln-classifier-training-notebooks)