---
language:
- en
- code
license: apache-2.0
tags:
- security
- vulnerability-detection
- code-analysis
- reasoning
- llm
pipeline_tag: text-generation
base_model: Qwen/Qwen3-8B-Instruct
---

# VulnLLM-R-8B: Specialized Reasoning LLM for Vulnerability Detection

**VulnLLM-R** is the first specialized **reasoning** Large Language Model designed specifically for software vulnerability detection.

Unlike traditional static analysis tools (like CodeQL) or small LLMs that rely on simple pattern matching, VulnLLM-R is trained to **reason step-by-step** about data flow, control flow, and security context. It mimics the thought process of a human security auditor to identify complex logic vulnerabilities with high accuracy.

## Quick Links

* **Paper:** [arXiv:2512.07533](https://arxiv.org/abs/2512.07533)
* **Code & Data:** [GitHub Repository](https://github.com/ucsb-mlsec/VulnLLM-R)
* **Demo:** [HuggingFace Space / Web Demo](https://huggingface.co/spaces/UCSB-SURFI/VulnLLM-R)

## Key Features

* **Reasoning-Based Detection:** Does not just classify code; it generates a "Chain-of-Thought" to analyze *why* a vulnerability exists.
* **Superior Accuracy:** Outperforms commercial models (such as Claude-3.7-Sonnet and o3-mini) and industry-standard tools (CodeQL, AFL++) on key benchmarks.
* **Efficiency:** Achieves SOTA performance with only **8B parameters**, making it 30x smaller and significantly faster than general-purpose reasoning models.
* **Broad Coverage:** Trained and tested on C, C++, Python, and Java (zero-shot generalization).

## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "UCSB-SURFI/VulnLLM-R-8B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example code snippet
code_snippet = """
void vulnerable_function(char *input) {
    char buffer[50];
    strcpy(buffer, input); // Potential buffer overflow
}
"""

# Prompt template (triggers step-by-step reasoning)
prompt = f"""You are an advanced vulnerability detection model.
Please analyze the following code step-by-step to determine if it contains a vulnerability.

Code:
{code_snippet}

Please provide your reasoning followed by the final answer.
"""

messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens, keeping only the newly generated ones
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
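Since the model emits its reasoning followed by a final answer, downstream code usually wants just the verdict. A minimal sketch of such a post-processing helper is below; note that the `Final Answer:` marker is an assumed output format for illustration only — check the paper and repository for the model's actual answer format.

```python
def extract_verdict(response: str) -> str:
    """Return the text after the last 'Final Answer:' marker,
    falling back to the last non-empty line of the response.

    NOTE: the 'Final Answer:' marker is an assumption for illustration;
    verify the real output format against the VulnLLM-R paper/repo.
    """
    marker = "Final Answer:"
    if marker in response:
        return response.rsplit(marker, 1)[1].strip()
    return response.strip().splitlines()[-1].strip()

# Hypothetical response text, for demonstration only
sample = (
    "The call copies `input` into a 50-byte stack buffer without a bounds check.\n"
    "Final Answer: VULNERABLE (stack buffer overflow)"
)
print(extract_verdict(sample))  # VULNERABLE (stack buffer overflow)
```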

## Performance

VulnLLM-R-8B achieves state-of-the-art results on benchmarks including PrimeVul, Juliet 1.3, and ARVO.

<img width="600" alt="model_size_vs_f1_scatter_01" src="https://github.com/user-attachments/assets/fc9e6942-14f8-4f34-8229-74596b05c7c5" />

(Refer to Figure 1 and Table 4 in the paper for detailed metrics.)

## Citation

If you use this model in your research, please cite our paper:

```bibtex
@article{nie2025vulnllmr,
  title={VulnLLM-R: Specialized Reasoning LLM with Agent Scaffold for Vulnerability Detection},
  author={Nie, Yuzhou and Li, Hongwei and Guo, Chengquan and Jiang, Ruizhe and Wang, Zhun and Li, Bo and Song, Dawn and Guo, Wenbo},
  journal={arXiv preprint arXiv:2512.07533},
  year={2025}
}
```