---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
  - security
  - code-review
  - vulnerability-detection
  - sast
  - false-positive-reduction
  - gguf
  - qwen2
  - ollama
language:
  - en
pipeline_tag: text-generation
model-index:
  - name: kon-security-v5
    results:
      - task:
          type: text-generation
          name: Security Code Review
        metrics:
          - name: Accuracy
            type: accuracy
            value: 98.1
          - name: F1 Score
            type: f1
            value: 0.99
          - name: False Positive Rate
            type: custom
            value: 0.0
          - name: JSON Compliance
            type: custom
            value: 100.0
---

# kon-security-v5

**Expert Security Code Reviewer** - A fine-tune of Qwen2.5-Coder-7B-Instruct specialized for detecting security vulnerabilities and reducing false positives in SAST (Static Application Security Testing) pipelines.

## Model Details

| Property | Value |
|----------|-------|
| Base Model | [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |
| Fine-tuning | QLoRA (LoRA adapters trained on a 4-bit quantized base) |
| Quantization | Q4_K_M (GGUF) |
| Parameters | 7.6B |
| Context Length | 32,768 tokens |
| File Size | ~4.7 GB |
| Format | GGUF (Ollama-compatible) |

## Performance

| Metric | Score |
|--------|-------|
| Overall Accuracy | **98.1%** |
| F1 Score | **0.99** |
| False Positive Rate | **0.0%** |
| JSON Compliance | **100%** |
| Avg Response Time | **2.8s** |
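
For reference, accuracy, F1, and false positive rate are conventionally derived from a confusion matrix, as in the sketch below. The counts are made-up placeholders chosen only to exercise the formulas; they are not the evaluation data behind the table, and treating "vulnerable" as the positive class is an assumption.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, F1, and false positive rate from confusion counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "f1": 2 * precision * recall / denom if denom else 0.0,
        # Share of safe snippets that were wrongly flagged as vulnerable.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=45, fp=0, tn=50, fn=5)
```

A false positive rate of zero only requires that no safe snippet was flagged; missed vulnerabilities (`fn`) still lower accuracy and F1.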

## Capabilities

- Identifies true security vulnerabilities across 20+ vulnerability categories
- Filters out false positives reported by SAST tools (SQL injection, XSS, command injection, etc.)
- Provides structured JSON output with verdict, confidence, CWE IDs, severity, and remediation
- Understands framework-specific safe patterns (React, Django, Express, Rails, etc.)
- Supports taint analysis reasoning (source-to-sink tracking)

## Vulnerability Categories

CRITICAL: SQL Injection (CWE-89), Command Injection (CWE-78), Deserialization (CWE-502), Hardcoded Secrets (CWE-798), Code Injection (CWE-94)

HIGH: XSS (CWE-79), Path Traversal (CWE-22), SSRF (CWE-918), Timing Attacks (CWE-208), Buffer Overflow (CWE-120)

MEDIUM: Weak Crypto (CWE-327), Insecure Random (CWE-330), Information Disclosure (CWE-200), Missing Auth (CWE-306)

## Usage with Ollama

```bash
# Pull the model
ollama pull kon-security/kon-security-v5

# Or create from a local GGUF (the Modelfile needs a FROM line pointing at the .gguf file)
ollama create kon-security-v5 -f Modelfile

# Run
ollama run kon-security-v5
```
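
Once the model is available locally, it can also be queried from code. The sketch below uses Ollama's HTTP chat endpoint (`/api/chat` on the default port 11434) with only the Python standard library. The helper names `build_chat_request` and `review_snippet` are illustrative, and the example assumes the system prompt is supplied by the Modelfile; if it is not, add a `system` message to the list.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local address


def build_chat_request(code: str, model: str = "kon-security-v5") -> dict:
    """Build the JSON body for a non-streaming chat request."""
    return {
        "model": model,
        "stream": False,   # return one complete response object
        "format": "json",  # ask Ollama to constrain the output to valid JSON
        "messages": [
            {
                "role": "user",
                "content": f"Analyze this code for vulnerabilities:\n{code}",
            },
        ],
    }


def review_snippet(code: str, model: str = "kon-security-v5") -> dict:
    """Send the snippet to a running Ollama server and parse the JSON verdict."""
    body = json.dumps(build_chat_request(code, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        reply = json.load(response)
    # In /api/chat responses the assistant text is under message.content.
    return json.loads(reply["message"]["content"])
```

Calling `review_snippet("query = f\"SELECT * FROM users WHERE id = {user_id}\"")` requires an Ollama server running on the local machine.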

### Example Prompt

```
<|im_start|>system
You are an expert security code reviewer...
<|im_end|>
<|im_start|>user
Analyze this code for SQL injection:
query = f"SELECT * FROM users WHERE id = {user_id}"
<|im_end|>
<|im_start|>assistant
```
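
The prompt above uses the ChatML layout of the Qwen2.5 family. When calling a raw completion endpoint that does not apply a chat template, the string can be assembled as in this sketch. `build_chatml_prompt` is a hypothetical helper, not part of any library; it follows the Qwen chat template, which places `<|im_end|>` directly after the message text.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    "You are an expert security code reviewer...",
    'Analyze this code for SQL injection:\n'
    'query = f"SELECT * FROM users WHERE id = {user_id}"',
)
```

Chat-style clients such as `ollama run` normally apply this template themselves, so the helper is only needed for raw-prompt interfaces.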

### Example Response

```json
{
  "verdict": "TRUE_POSITIVE",
  "is_vulnerable": true,
  "confidence": 0.97,
  "cwe_ids": ["CWE-89"],
  "severity": "CRITICAL",
  "reasoning": "f-string interpolates user_id directly into SQL query without parameterization",
  "remediation": "cursor.execute('SELECT * FROM users WHERE id = ?', (user_id,))"
}
```
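
Downstream tooling will usually want to check a response before acting on it. The following is a minimal sketch of such a check, assuming the field names shown in the example response; `parse_verdict` is a hypothetical helper, and the 0 to 1 range for `confidence` is inferred from the example rather than documented.

```python
import json

# Field names taken from the example response; the types are assumptions.
REQUIRED_FIELDS = {
    "verdict": str,
    "is_vulnerable": bool,
    "confidence": (int, float),
    "cwe_ids": list,
    "severity": str,
    "reasoning": str,
    "remediation": str,
}


def parse_verdict(raw: str) -> dict:
    """Parse a model response and verify the expected fields and types."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response is not a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"wrong type for field: {field}")
    if not 0.0 <= data["confidence"] <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return data
```

Rejecting malformed responses early keeps a pipeline from silently treating an unparseable answer as a safe verdict.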

## System Prompt

The model was fine-tuned with the following system prompt, which is baked in:

```
You are an expert security code reviewer specializing in identifying true
vulnerabilities and eliminating false positives. You analyze code with deep
understanding of security patterns across all languages and frameworks.

CRITICAL RULES:
1. Parameterized queries (?, $1, %s, :param) = SAFE from SQL injection
2. textContent, createTextNode = SAFE from XSS
3. React JSX {variable} = SAFE from XSS (React auto-escapes)
4. subprocess.run([list, args]) without shell=True = SAFE from command injection
5. json.loads/JSON.parse = SAFE (cannot execute code)
6. secure_filename() from werkzeug = SAFE from path traversal
7. bcrypt/argon2/scrypt for password hashing = SAFE
8. HMAC.compare_digest/timingSafeEqual = SAFE from timing attacks
9. DOMPurify.sanitize() = SAFE from XSS
10. MD5/SHA1 for non-security purposes (checksums, cache keys) = SAFE
11. Test files testing security scanners = SAFE
12. Environment variables for secrets = SAFE (not hardcoded)
13. ORM methods (Django .filter(), Rails .where(hash), SQLAlchemy) = SAFE from SQLi
14. Content-Security-Policy, helmet(), CORS allowlists = SAFE
```
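
To make rule 1 concrete, here is a small sketch using Python's built-in `sqlite3` module. The table contents and the attacker input are invented for illustration; the point is only to show why a bound parameter is treated as safe while string interpolation is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

user_id = "1 OR 1=1"  # attacker-controlled input

# Vulnerable: the input becomes part of the SQL text, so the OR clause executes
# and every row is returned.
unsafe_rows = conn.execute(
    f"SELECT * FROM users WHERE id = {user_id}"
).fetchall()

# Safe (rule 1): the input is bound as a value and never parsed as SQL, so it
# is compared to the id column as a plain string and matches nothing.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE id = ?", (user_id,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

A SAST tool that flags both queries produces one true positive and one false positive; the rule list above is what steers the model toward telling them apart.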

## Integration with Kon Security Scanner

This model is the default LLM for the [Kon Security Scanner](https://github.com/kon-security/kon), providing:

- SAST finding validation and FP reduction
- CWE ID mapping
- Severity assessment
- Remediation suggestions

```python
from kon.core.ollama_analyzer import OllamaAnalyzer

analyzer = OllamaAnalyzer(model="kon-security-v5:latest")
result = analyzer.analyze_finding_enhanced(
    code_snippet="query = f'SELECT * FROM users WHERE id = {user_id}'",
    vulnerability_type="SQL Injection",
    file_path="app/db.py",
    line_number=42
)
print(result.verdict)  # TRUE_POSITIVE
```

## Training Details

- **Method**: QLoRA (LoRA adapters trained on top of a frozen, 4-bit quantized base model)
- **Base**: Qwen2.5-Coder-7B-Instruct
- **Dataset**: Curated security code review examples covering 20+ CWE categories
- **Hardware**: NVIDIA GPU with CUDA support
- **Quantization**: Q4_K_M via llama.cpp

## License

Apache 2.0 (same as base model)