---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
datasets:
- custom
- CyberNative/Code_Vulnerability_Security_DPO
- doss1232/vulnerable-code
metrics:
- ROUGE-L F1
- BLEU
tags:
- llama-3.2-1B-Instruct
---

# Model Card for `merged-vuln-detector`

## Model Details

- **Base Model:** `llama-3.2-1B-Instruct`
- **Fine-tuned Model:** `merged-vuln-detector`
- **Model Type:** Causal Language Model fine-tuned for vulnerability detection in code.

## Model Description

This model is a fine-tuned version of `llama-3.2-1B-Instruct`, trained on code snippets paired with their corresponding vulnerability analyses. It is intended to act as a security assistant that analyzes code and flags potential vulnerabilities.

## Training Data

The model was fine-tuned on the `CyberNative/Code_Vulnerability_Security_DPO` dataset, which can be found on Hugging Face at https://huggingface.co/datasets/CyberNative/Code_Vulnerability_Security_DPO.

The data is formatted as follows, where the model is prompted to analyze the security of a given code snippet:

```
Analyze the security vulnerabilities in the following code.

[CODE SNIPPET]

Analysis:
[VULNERABILITY DESCRIPTION]
```
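
For reference, here is a minimal sketch of how each dataset row could be flattened into this supervised format. The column names `question` (the code snippet) and `chosen` (the preferred analysis from the DPO pair) are assumptions about the dataset's schema, not something confirmed by this repository; adjust them to the actual columns.

```python
from datasets import load_dataset

dataset = load_dataset("CyberNative/Code_Vulnerability_Security_DPO", split="train")

def to_text(example):
    # Assumed columns: `question` holds the code snippet, `chosen` the
    # preferred vulnerability analysis from the DPO pair.
    return {
        "text": (
            "Analyze the security vulnerabilities in the following code.\n\n"
            f"{example['question']}\n\nAnalysis:\n{example['chosen']}"
        )
    }

dataset = dataset.map(to_text)
print(dataset[0]["text"][:300])
```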

## Training Procedure

The model was fine-tuned using QLoRA on a single GPU. The training script uses the `trl` library's `SFTTrainer`; a configuration sketch follows the hyperparameter list below.

### Hyperparameters
- **Quantization:** 4-bit (`nf4`)
- **LoRA `r`:** 16
- **LoRA `alpha`:** 32
- **LoRA `dropout`:** 0.1
- **Target Modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`
- **Batch Size:** 1 (with gradient accumulation steps of 8)
- **Optimizer:** `paged_adamw_8bit`
- **Precision:** `fp16`
- **Max Steps:** 240
- **Learning Rate:** `2e-4`
- **Max Sequence Length:** 1024
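
The training script itself is not included in this repository, so the following is only a sketch of how the hyperparameters above could be wired into `SFTTrainer`. The base-model Hub id, output directory, and the `text` column (built as in the Training Data section) are assumptions, and `trl`/`peft` argument names vary somewhat across versions.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base_model = "meta-llama/Llama-3.2-1B-Instruct"  # assumed Hub id of the base model

# QLoRA: load the base model in 4-bit NF4 and train LoRA adapters on top.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config)

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="vuln-detector-qlora",  # assumed output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    optim="paged_adamw_8bit",
    fp16=True,
    max_steps=240,
    learning_rate=2e-4,
    max_seq_length=1024,
    dataset_text_field="text",
)

# Dataset with a `text` column, built as in the formatting sketch above
# (assumed columns `question` and `chosen`).
dataset = load_dataset("CyberNative/Code_Vulnerability_Security_DPO", split="train")
dataset = dataset.map(lambda ex: {
    "text": "Analyze the security vulnerabilities in the following code.\n\n"
            f"{ex['question']}\n\nAnalysis:\n{ex['chosen']}"
})

trainer = SFTTrainer(model=model, args=args, train_dataset=dataset, peft_config=peft_config)
trainer.train()
```

After training, the LoRA adapters would be merged back into the base weights (e.g. with `peft`'s `merge_and_unload` on a full-precision reload of the base model) to produce the standalone `merged-vuln-detector` checkpoint.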

## Evaluation Results

The model was evaluated on the `doss1232/vulnerable-code` dataset against the base model. The results are as follows:

| Model                   | ROUGE-L F1 | BLEU   |
|-------------------------|------------|--------|
| `llama-3.2-1B-Instruct` | 0.0933     | 0.0061 |
| `merged-vuln-detector`  | 0.1335     | 0.0219 |
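
The evaluation script is likewise not included; a plausible sketch using the Hugging Face `evaluate` library is shown below. In practice, `predictions` would be the model's generated analyses for the `doss1232/vulnerable-code` examples and `references` the ground-truth analyses; the toy strings here are placeholders. Note that the `rouge` metric reports `rougeL` as an F-measure.

```python
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

# Toy stand-ins; real scores come from model.generate outputs paired with
# the reference analyses in doss1232/vulnerable-code.
predictions = ["The code has a buffer overflow due to missing bounds checking."]
references = ["Buffer overflow: the destination buffer is smaller than the source string."]

rouge_scores = rouge.compute(predictions=predictions, references=references)
bleu_scores = bleu.compute(predictions=predictions, references=references)

print(f"ROUGE-L F1: {rouge_scores['rougeL']:.4f}")
print(f"BLEU:       {bleu_scores['bleu']:.4f}")
```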

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "merged-vuln-detector"  # local path or Hub repo id of the merged model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

code = """
#include <cstring>

void copyString(char* dest, const char* src) {
    while (*src != '\0') {
        *dest = *src;
        dest++;
        src++;
    }
}

int main() {
    char source[10] = "Hello!";
    char destination[5];
    copyString(destination, source);
    return 0;
}
"""

prompt = f"Analyze the security vulnerabilities in the following code.\n\n{code}\n\nAnalysis:\n"

inputs = tokenizer(prompt, return_tensors="pt")
# Cap the number of newly generated tokens; max_length would count the prompt too.
outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Example Output

**Input Code:**
```cpp
#include <cstring>

void copyString(char* dest, const char* src) {
    while (*src != '\0') {
        *dest = *src;
        dest++;
        src++;
    }
}

int main() {
    char source[10] = "Hello!";
    char destination[5];
    copyString(destination, source);
    return 0;
}
```

**Model Output:**
> The code has a buffer overflow vulnerability due to the lack of bounds checking on the destination buffer size.

## Model Card Authors

Seokhee Chang

## Model Card Contact

cycloevan97@gmail.com