File size: 5,792 Bytes
a9868ad
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
  - code
  - security
  - cybersecurity
  - vulnerability-detection
  - application-security
  - ai-generated-code
  - qlora
  - peft
  - qwen2.5-coder
---

# Nullsec-S1

Nullsec-S1 is an open-source security model purpose-built to audit AI-generated apps, agents, and vibecoded software before they reach production.

This repository contains the **RC2/v1.1 PEFT / QLoRA adapter** for `Qwen/Qwen2.5-Coder-7B-Instruct`. It is an adapter release, not merged full model weights. Users need the base model plus this adapter.

## Release

- Model name: Nullsec-S1
- Release: RC2/v1.1
- GitHub release tag: [`v1.0.0-rc25`](https://github.com/trynullsec/nullsec-s1/releases/tag/v1.0.0-rc25)
- Release artifact commit: `c29c7f1`
- Base model: `Qwen/Qwen2.5-Coder-7B-Instruct`
- Adapter type: PEFT / QLoRA
- Adapter weights: `adapter_model.safetensors`
- Tokenizer/chat template: included with this adapter repository

## What it is

Nullsec-S1 returns final structured JSON security audit verdicts for application code, AI-generated apps, autonomous agents, MCP tools, Web3/wallet flows, and common application-security failures.

`S1` means `Security-1`. Nullsec-S1 does **not** expose a hidden reasoning-token loop, `<thought>` format, or chain-of-thought parser. It emits a final structured security audit.

## Intended use

- Auditing AI-generated applications before deployment
- Reviewing autonomous-agent and MCP tool risk
- Reviewing Web3/wallet approval and transaction flows
- Generating structured security verdicts for CI, API, or CLI integrations
- Producing secure patch guidance for detected findings

## Out of scope

- Not a general chatbot
- Not trained from scratch
- Not a replacement for human security review
- Not a guarantee of zero vulnerabilities
- Not a universal production-safety guarantee
- No "first", "only", or "best" claim is made

## How to load with Transformers + PEFT

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter_id = "trynullsec/nullsec-s1"

quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=quant,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```

## Prompt format

Use the tokenizer chat template. The recommended user message is:

````text
Audit the following code for security vulnerabilities. Emit only the JSON verdict.

FILE: app/api/admin/route.ts
```typescript
<code here>
```
````

Use a system instruction equivalent to:

```text
You are Nullsec-1, a strict security review model. You are NOT a chatbot and you do not write features. Your only job is to audit code for security risk and emit a single JSON verdict.
```

## Output schema

Nullsec-S1 is trained to emit a single JSON object with:

- `risk_score`
- `production_ready`
- `severity`
- `confidence`
- `reasoning_summary`
- `exploit_scenario`
- `affected_files`
- `checks_performed`
- `findings`

Safe code should return an empty `findings` array:

```json
{
  "risk_score": 0,
  "production_ready": true,
  "severity": "INFO",
  "findings": []
}
```

Unsafe code should include one finding per independent issue. Downstream systems should still run deterministic schema alignment and safety enforcement over the raw model output.

## Evaluation results

On the Nullsec RC2/v1.1 111-case security benchmark:

| Metric | Result |
|---|---:|
| raw outputs | 111/111 |
| detection F1 | 0.9245 |
| precision | 0.9423 |
| recall | 0.9074 |
| false_safe_rate | 0.0 |
| safety probes | passed |

These results are benchmark-scoped and tied to the [`v1.0.0-rc25`](https://github.com/trynullsec/nullsec-s1/releases/tag/v1.0.0-rc25) release artifacts.

## Baseline comparison

On the same Nullsec RC2/v1.1 benchmark:

| System / tool | F1 |
|---|---:|
| Nullsec-S1 RC2/v1.1 | 0.9245 |
| OpenAI/Codex `gpt-5.3-codex` | 0.7252 |
| Claude Opus 4.8 | 0.6550 |
| Semgrep local rules | 0.5535 |
| Qwen2.5-Coder-7B-Instruct base | 0.0180 |

Baseline results are produced by project scripts and should be reproduced from the repository for comparison. They are not universal claims about any provider or tool.

## Limitations

- The benchmark is repo-authored and security-specific.
- Benchmark performance does not guarantee every vulnerability will be detected in arbitrary real-world code.
- Independent security review is recommended for critical systems.
- Patch correctness is structurally measured; compile/run/test verification is future work.
- Hosted-provider baselines can change over time as provider models change.
- This adapter is not merged full weights; users must load the base model.

## Safety and non-claims

Nullsec-S1's `production_ready` field is advisory until deterministic safety enforcement is applied. In the Nullsec repository, the Security Alignment Layer and Safety Layer recompute and enforce production readiness.

This release does **not** claim:

- first, only, or best model status
- guaranteed secure code
- zero vulnerabilities
- replacement for human security review
- universal production safety

## Provenance

- GitHub repo: https://github.com/trynullsec/nullsec-s1
- GitHub release: https://github.com/trynullsec/nullsec-s1/releases/tag/v1.0.0-rc25
- Base model: https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct