Update README.md

README.md
---
tags:
- reasoning
- llm
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-7B-Instruct
---

# VulnLLM-R-7B: Specialized Reasoning LLM for Vulnerability Detection

**VulnLLM-R** is the first specialized **reasoning** Large Language Model designed specifically for software vulnerability detection.

## 🔗 Quick Links

* **Paper:** [arXiv:2512.07533](https://arxiv.org/abs/2512.07533)
* **Code & Data:** [GitHub](https://github.com/ucsb-mlsec/VulnLLM-R)
* **Demo:** [Web demo](https://huggingface.co/spaces/UCSB-SURFI/VulnLLM-R)

## 💡 Key Features

* **Reasoning-Based Detection:** Does not just classify code; it generates a "Chain-of-Thought" to analyze *why* a vulnerability exists.
* **Superior Accuracy:** Outperforms commercial giants (like Claude-3.7-Sonnet, o3-mini) and industry-standard tools (CodeQL, AFL++) on key benchmarks.
* **Efficiency:** Achieves SOTA performance with only **7B parameters**, making it 30x smaller and significantly faster than general-purpose reasoning models.
* **Broad Coverage:** Trained and tested on C, C++, Python, and Java (zero-shot generalization).
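
To show the shape of a reasoning-based detection query, here is a minimal prompt-building sketch. The template text and the `build_detection_prompt` helper are illustrative assumptions for this card, not the official prompt shipped with the model; see the GitHub repo for the actual format.

```python
# Illustrative only: the real prompt template ships with the model/repo.
# This sketch just shows the shape of a vulnerability-detection query.
DETECTION_TEMPLATE = (
    "Analyze the following {language} code for security vulnerabilities. "
    "Reason step by step, then give a final verdict.\n\n{code}"
)

def build_detection_prompt(code: str, language: str = "python") -> str:
    """Format a source snippet into a vulnerability-detection prompt."""
    return DETECTION_TEMPLATE.format(language=language, code=code.strip())

# A deliberately vulnerable snippet: eval() on untrusted input.
snippet = """
def run(user_input):
    return eval(user_input)
"""

print(build_detection_prompt(snippet))
```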

## 🚀 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "UCSB-SURFI/VulnLLM-R-7B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # typical loading options; see the repo for the exact arguments
    device_map="auto",
)
```
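Once the model responds, downstream tooling usually needs the final verdict separated from the chain-of-thought. A minimal parsing sketch, assuming a hypothetical response format that ends with a `VERDICT:` line — the model's real output format is described in the paper and repo, so adjust the pattern accordingly:

```python
import re

# Hypothetical response shape: a reasoning trace followed by a final
# verdict line. Adjust the pattern to the model's documented format.
SAMPLE_RESPONSE = (
    "The function copies user input into a fixed-size stack buffer without "
    "checking its length, so a long input overflows the buffer.\n"
    "VERDICT: VULNERABLE (CWE-787)"
)

def parse_verdict(response: str) -> tuple[str, str]:
    """Split a response into (reasoning, verdict); verdict is '' if absent."""
    match = re.search(r"^VERDICT:\s*(.+)$", response, flags=re.MULTILINE)
    if not match:
        return response.strip(), ""
    reasoning = response[: match.start()].strip()
    return reasoning, match.group(1).strip()

reasoning, verdict = parse_verdict(SAMPLE_RESPONSE)
print(verdict)  # VULNERABLE (CWE-787)
```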

## 📊 Performance

VulnLLM-R-7B achieves state-of-the-art results on benchmarks including PrimeVul, Juliet 1.3, and ARVO.

<img width="600" alt="model_size_vs_f1_scatter_01" src="https://github.com/user-attachments/assets/fc9e6942-14f8-4f34-8229-74596b05c7c5" />