VextLabs committed on
Commit
a39fc37
·
verified ·
1 Parent(s): 8a711a2

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +285 -0
README.md ADDED
@@ -0,0 +1,285 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ tags:
+ - security
+ - pentesting
+ - cybersecurity
+ - vulnerability-detection
+ - red-team
+ - bug-bounty
+ - owasp
+ - mitre-attack
+ pipeline_tag: text-generation
+ model-index:
+ - name: vext-pentest-7b
+   results: []
+ ---
+
+ # VEXT Pentest-7B -- The First Open-Source Security AI Model
+
+ **Pentest-7B** is a 7-billion-parameter language model fine-tuned specifically for offensive security, penetration testing, and vulnerability analysis. Built by [VEXT Labs](https://tryvext.com) on top of [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) and trained on **260,000+ curated security examples** drawn from real-world engagements, this is the first open-weight model purpose-built for the security profession.
+
+ Pentest-7B runs on a single consumer GPU (16 GB VRAM), a MacBook with 16 GB RAM via Ollama, or CPU-only with quantized weights. No API keys, no cloud dependency, no data leaves your machine.
+
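As a rough check on the hardware figures above, the memory needed for the weights alone can be estimated from the parameter count and the bits stored per weight. This is a back-of-the-envelope sketch, not a measurement: `weight_memory_gb` is a hypothetical helper, the 7e9 parameter count is the rounded figure from the card, and KV cache, activations, and runtime overhead are ignored.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


# Rounded "7-billion-parameter" figure from the model card (assumption:
# the exact count of the underlying checkpoint is somewhat higher).
PARAMS = 7e9

for label, bits in [("fp16/bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>9}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions, 16-bit weights leave little headroom on a 16 GB GPU, which is why the quantized builds are the practical choice for laptops and CPU-only machines.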
+ ## Key Capabilities
+
+ | Capability | Description |
+ |---|---|
+ | **Vulnerability Explanation** | Given a CVE ID, CWE, or raw scan output, produce a clear technical explanation of the vulnerability, its root cause, and real-world impact. |
+ | **Pentest Report Writing** | Generate executive summaries, technical finding write-ups, risk ratings, and remediation sections in standard pentest report format. |
+ | **Attack Strategy Planning** | Given a target technology stack, suggest prioritized attack paths aligned with MITRE ATT&CK and OWASP Testing Guide methodologies. |
+ | **Remediation Guidance** | Provide specific, actionable fix recommendations with code examples for common vulnerability classes. |
+ | **Compliance Assessment** | Map findings to compliance frameworks (PCI DSS, SOC 2, HIPAA, ISO 27001) and articulate control gaps. |
+ | **Threat Briefing** | Summarize threat intelligence, emerging CVEs, and APT campaign TTPs for stakeholder communication. |
+ | **Security Code Review** | Analyze code snippets for injection flaws, authentication bypasses, insecure deserialization, and other OWASP Top 10 issues. |
+
+ ## Training
+
+ ### Data
+
+ Pentest-7B was trained on **260,000+ curated examples** spanning:
+
+ - **Production pentesting traces** -- Real (anonymized) action-observation pairs from VEXT's autonomous security agents running against authorized bug bounty targets. Includes successful exploitation chains, false positive patterns, and tool output interpretation.
+ - **CTF challenge solutions** -- Structured walkthroughs from capture-the-flag competitions covering web, pwn, crypto, reverse engineering, and forensics categories.
+ - **Bug bounty write-ups** -- Public responsible disclosure reports with structured vulnerability descriptions, reproduction steps, and impact assessments.
+ - **MITRE ATT&CK corpus** -- Technique descriptions, procedure examples, detection guidance, and mitigation strategies across all 14 tactics.
+ - **OWASP materials** -- Testing Guide procedures, ASVS requirements, cheat sheets, and vulnerability classifications.
+ - **CVE analysis** -- Detailed analysis of 50,000+ CVEs including root cause, affected versions, exploit conditions, and patch diffs.
+ - **DPO preference pairs** -- 2,000+ pairs where validated real findings are preferred over false positives, teaching the model to distinguish true vulnerabilities from noise.
+
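To make the role of the DPO preference pairs concrete, the sketch below computes the standard sigmoid DPO loss for a single (chosen, rejected) pair. It is an illustration of the general technique, not VEXT's training code: the function name and the log-probability values are hypothetical, and `beta=0.1` is the value listed in the training pipeline.

```python
import math


def dpo_sigmoid_loss(policy_chosen_logp: float, policy_rejected_logp: float,
                     ref_chosen_logp: float, ref_rejected_logp: float,
                     beta: float = 0.1) -> float:
    """Sigmoid DPO loss for one preference pair: -log(sigmoid(margin))."""
    # Implicit rewards: how much more likely each answer became
    # under the policy relative to the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(m)) == softplus(-m), written in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Illustrative sequence log-probabilities (made up for this example).
# The policy has moved toward the validated finding and away from the
# false positive, so the loss drops below log(2), the value at zero margin.
print(dpo_sigmoid_loss(-40.0, -55.0, -45.0, -50.0))
```

The loss shrinks as the policy assigns relatively more probability to the validated finding than the reference model does, which is the mechanism by which the pairs push the model away from false positives.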
+ **What is NOT in the training data:** Raw exploit code, weaponized payloads, malware source, credentials, PII, or any data that could be directly used for unauthorized access. The model is trained to *reason about* security, not to serve as an exploit toolkit.
+
+ ### Architecture and Training Pipeline
+
+ ```
+ Qwen2.5-7B-Instruct (base)
+         |
+         v
+ QLoRA Fine-Tuning (SFT)
+   - Rank: 16, Alpha: 32
+   - Target modules: q_proj, k_proj, v_proj, o_proj
+   - 3 epochs, effective batch size 32
+   - Max sequence length: 4096 tokens
+   - Learning rate: 2e-4, cosine schedule
+         |
+         v
+ DPO Alignment
+   - Beta: 0.1, sigmoid loss
+   - 1 epoch, learning rate 5e-6
+   - Preference signal: validated findings (chosen) vs false positives (rejected)
+         |
+         v
+ Adapter Merge + AWQ 4-bit Quantization (optional)
+         |
+         v
+ VEXT Pentest-7B
+ ```
+
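The QLoRA settings in the diagram imply a small trainable adapter. The sketch below estimates its size from the rank and target modules. The layer dimensions are an assumption taken from the publicly documented Qwen2.5-7B configuration (hidden size 3584, 28 layers, 28 attention heads, 4 key/value heads); they are not stated in this README, and the helper name is hypothetical.

```python
# LoRA hyperparameters from the pipeline above.
RANK, ALPHA = 16, 32

# Assumed base-model dimensions (public Qwen2.5-7B config, not from this README).
HIDDEN, LAYERS, HEADS, KV_HEADS = 3584, 28, 28, 4
HEAD_DIM = HIDDEN // HEADS      # 128
KV_DIM = KV_HEADS * HEAD_DIM    # 512, because of grouped-query attention

# (in_features, out_features) of each targeted projection.
PROJ_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """A is (rank x in_features), B is (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_param_count(i, o, RANK) for i, o in PROJ_SHAPES.values())
total = per_layer * LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update B @ A

print(f"trainable LoRA parameters: {total:,}")
print(f"LoRA scaling (alpha / rank): {scaling}")
```

Under these assumptions the adapter holds roughly ten million trainable parameters, a small fraction of the base model, which is what makes fine-tuning on a modest GPU budget feasible.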
+ ### Hardware
+
+ - SFT: 8x NVIDIA A100 40GB (SageMaker ml.p4d.24xlarge), ~18 hours
+ - DPO: 8x NVIDIA A100 40GB, ~4 hours
+ - Quantization: Single A10G 24GB (SageMaker ml.g5.2xlarge)
+
+ ## Usage
+
+ ### Transformers (Full Precision)
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "vext-labs/pentest-7b"
+
+ # Load the tokenizer and model; "auto" picks the checkpoint dtype and device placement.
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype="auto",
+     device_map="auto",
+ )
+
+ messages = [
+     {
+         "role": "system",
+         "content": (
+             "You are an expert penetration tester and security analyst. "
+             "Provide detailed, technically accurate security guidance."
+         ),
+     },
+     {
+         "role": "user",
+         "content": (
+             "I found a reflected XSS in a search parameter on an e-commerce site "
+             "during a bug bounty engagement. The input is reflected inside a "
+             "JavaScript string literal in the response. Write the finding for my "
+             "pentest report, including severity rating, impact, and remediation."
+         ),
+     },
+ ]
+
+ # Render the conversation with the model's chat template, then tokenize.
+ text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
+
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=1024,
+     do_sample=True,  # sampling must be on for temperature/top_p to take effect
+     temperature=0.7,
+     top_p=0.9,
+     repetition_penalty=1.1,
+ )
+ # Decode only the newly generated tokens, skipping the prompt.
+ response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
+ print(response)
+ ```
+
+ ### vLLM (Production Serving)
+
+ ```python
+ from vllm import LLM, SamplingParams
+
+ llm = LLM(
+     model="vext-labs/pentest-7b",
+     tensor_parallel_size=1,  # single GPU
+     max_model_len=4096,
+     gpu_memory_utilization=0.90,
+ )
+
+ sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)
+
+ # generate() sends these strings as raw prompts; use llm.chat() with a
+ # messages list if you want the model's chat template applied.
+ prompts = [
+     "Explain CVE-2024-3094 (XZ Utils backdoor): root cause, impact, and detection methods.",
+     "Given an exposed .git directory on a production web server, outline an attack plan.",
+ ]
+
+ outputs = llm.generate(prompts, sampling)
+ for output in outputs:
+     print(output.outputs[0].text)
+ ```
+
+ **OpenAI-compatible API with vLLM:**
+
+ ```bash
+ vllm serve vext-labs/pentest-7b --port 8000
+ ```
+
+ ```python
+ from openai import OpenAI
+
+ # The local vLLM server does not check the API key, but the client requires one.
+ client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
+
+ response = client.chat.completions.create(
+     model="vext-labs/pentest-7b",
+     messages=[
+         {"role": "system", "content": "You are a senior penetration tester."},
+         {"role": "user", "content": "Analyze this Nmap output and suggest next steps:\n\nPORT STATE SERVICE VERSION\n22/tcp open ssh OpenSSH 7.4\n80/tcp open http Apache 2.4.6\n443/tcp open ssl/http Apache 2.4.6\n3306/tcp open mysql MySQL 5.7.38"},
+     ],
+     temperature=0.7,
+     max_tokens=1024,
+ )
+ print(response.choices[0].message.content)
+ ```
+
+ ### Ollama (Local, Quantized)
+
+ ```bash
+ # Pull the model (GGUF Q4_K_M quantization, ~4.5 GB)
+ ollama pull vext-labs/pentest-7b
+
+ # Interactive chat
+ ollama run vext-labs/pentest-7b
+
+ # API
+ curl http://localhost:11434/api/chat -d '{
+   "model": "vext-labs/pentest-7b",
+   "messages": [
+     {"role": "user", "content": "What are the top 5 things to check when auditing a JWT implementation?"}
+   ]
+ }'
+ ```
+
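The same Ollama chat endpoint can be called from Python with only the standard library. This is a minimal sketch, assuming the Ollama server from the commands above is running on its default port; `build_chat_request` and `ask` are hypothetical helper names, and `"stream": False` is set so the server returns one JSON object instead of a stream of chunks.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt: str, model: str = "vext-labs/pentest-7b") -> urllib.request.Request:
    """Build a non-streaming chat request for a local Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response
    }
    return urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=300) as resp:
        return json.load(resp)["message"]["content"]


# Example (requires a running Ollama server with the model pulled):
# print(ask("What are the top 5 things to check when auditing a JWT implementation?"))
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything leaves the process.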
+ ### Docker (Isolated Serving)
+
+ ```bash
+ docker run --gpus all -p 8000:8000 \
+   ghcr.io/vext-labs/pentest-7b:latest \
+   --model vext-labs/pentest-7b --port 8000
+ ```
+
+ ## Telemetry
+
+ Pentest-7B includes an **opt-in** telemetry collector to help us improve the model. It is **off by default** and collects only anonymized aggregate statistics (vulnerability categories, tool success rates, session metadata). It **never** collects URLs, IPs, credentials, vulnerability details, request/response bodies, file paths, or user identity.
+
+ ```bash
+ # Enable (opt-in)
+ export VEXT_TELEMETRY=on
+
+ # Disable (default)
+ export VEXT_TELEMETRY=off
+
+ # See exactly what is collected
+ python -c "from vext_telemetry import what_we_collect; what_we_collect()"
+ ```
+
+ Full telemetry source code is included in the repository for audit: [`telemetry/collector.py`](telemetry/collector.py).
+
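For readers who want a mental model before reading the collector source, an opt-in gate of this kind can be sketched as an environment check plus an allowlist. This is purely illustrative and is not the code in `telemetry/collector.py`: the function names and the field names in `ALLOWED_FIELDS` are hypothetical, and only the `VEXT_TELEMETRY` variable comes from the README.

```python
import os
from typing import Mapping, Optional

# Hypothetical allowlist of aggregate fields; the real collector may differ.
ALLOWED_FIELDS = {"vulnerability_category", "tool_success_rate", "session_duration_s"}


def telemetry_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Telemetry is on only when VEXT_TELEMETRY is explicitly set to 'on'."""
    env = os.environ if env is None else env
    return env.get("VEXT_TELEMETRY", "off").strip().lower() == "on"


def scrub(event: dict) -> dict:
    """Keep only allowlisted aggregate fields; drop everything else."""
    return {key: value for key, value in event.items() if key in ALLOWED_FIELDS}


event = {"vulnerability_category": "xss", "target_url": "https://example.test"}
if telemetry_enabled({"VEXT_TELEMETRY": "on"}):
    print(scrub(event))  # the URL field is dropped before anything is recorded
```

An allowlist fails closed: a new field added elsewhere in the code is not reported unless someone deliberately adds it to the set.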
+ ## Evaluation
+
+ | Benchmark | Pentest-7B | Qwen2.5-7B-Instruct (base) | GPT-4o (API) |
+ |---|---|---|---|
+ | SecBench (vuln classification) | **82.4%** | 61.2% | 79.8% |
+ | CyberMetric (security knowledge) | **74.1%** | 52.7% | 71.3% |
+ | PentestQA (methodology) | **88.6%** | 44.3% | 83.1% |
+ | Finding Quality (human eval, 1-5) | 4.2 | 2.1 | **4.4** |
+ | False Positive Rate (lower is better) | **12.3%** | 41.7% | 15.8% |
+
+ *Benchmarks run with temperature=0, greedy decoding. Human evaluation by 3 senior pentesters on 200 randomly sampled findings.*
+
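The gaps in the table are easier to compare as percentage-point differences. The short sketch below recomputes them from the three accuracy rows reported above; it only rearranges the published numbers and adds no new results, and `point_delta` is a hypothetical helper name.

```python
# (Pentest-7B, Qwen2.5-7B-Instruct base, GPT-4o) accuracy in percent,
# copied from the evaluation table.
SCORES = {
    "SecBench": (82.4, 61.2, 79.8),
    "CyberMetric": (74.1, 52.7, 71.3),
    "PentestQA": (88.6, 44.3, 83.1),
}


def point_delta(a: float, b: float) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round(a - b, 1)


for name, (pentest, base, gpt4o) in SCORES.items():
    print(f"{name:<12} vs base: +{point_delta(pentest, base)} pts, "
          f"vs GPT-4o: +{point_delta(pentest, gpt4o)} pts")
```

Read this way, the largest reported gain over the base model is on PentestQA, while the margins over GPT-4o are a few points on each benchmark.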
+ ## Intended Use
+
+ This model is built for **authorized security professionals**:
+
+ - Penetration testers writing reports and planning engagements
+ - Bug bounty hunters analyzing targets and drafting submissions
+ - Security engineers triaging vulnerabilities and planning remediation
+ - SOC analysts interpreting alerts and assessing threat severity
+ - Compliance teams mapping findings to regulatory frameworks
+ - Security researchers studying vulnerability patterns
+
+ ## Limitations and Responsible Use
+
+ - **Not a replacement for human expertise.** Always validate model outputs with manual testing and professional judgment.
+ - **Authorization required.** Do not use this model's output to test systems without explicit written authorization from the system owner.
+ - **No guarantee of accuracy.** The model can hallucinate CVE details, suggest inapplicable techniques, or miss critical context. Treat outputs as a starting point, not a final answer.
+ - **Scope of training.** The model is strongest on web application security, network infrastructure, and common vulnerability classes. It has limited depth on hardware security, ICS/SCADA, mobile reversing, and cryptographic implementation review.
+ - **Not an exploit generator.** The model is trained to reason about security concepts, not to produce weaponized code. Attempts to extract raw exploit payloads will produce lower-quality outputs by design.
+
+ ## License
+
+ Apache 2.0. Use it, modify it, deploy it commercially. Keep the license and notice files when you redistribute, as Apache 2.0 requires; further attribution is appreciated but not required.
+
+ ## Citation
+
+ ```bibtex
+ @misc{vext-pentest-7b-2026,
+   title = {VEXT Pentest-7B: An Open-Source Language Model for Penetration Testing and Security Analysis},
+   author = {VEXT Labs},
+   year = {2026},
+   url = {https://huggingface.co/vext-labs/pentest-7b},
+   note = {Fine-tuned from Qwen2.5-7B-Instruct on 260K+ curated security examples with QLoRA SFT and DPO alignment},
+ }
+ ```
+
+ ## Links
+
+ - **VEXT Platform:** [https://tryvext.com](https://tryvext.com)
+ - **GitHub:** [https://github.com/vext-labs/pentest-7b](https://github.com/vext-labs/pentest-7b)
+ - **Discord:** [https://discord.gg/vext-security](https://discord.gg/vext-security)
+ - **Paper (coming soon):** Technical report with full training methodology and ablation studies
+
+ ## Built By
+
+ [VEXT Labs, Inc.](https://tryvext.com) -- Building autonomous security testing infrastructure. Pentest-7B is the open-source foundation of our platform's security reasoning capabilities.
+
+ ---
+
+ *If you use Pentest-7B in your research or product, we would love to hear about it. Open an issue or reach out at [oss@tryvext.com](mailto:oss@tryvext.com).*