+---
+license: apache-2.0
+language:
+- en
+tags:
+- security
+- pentesting
+- cybersecurity
+- vulnerability-detection
+- red-team
+- bug-bounty
+- owasp
+- mitre-attack
+pipeline_tag: text-generation
+model-index:
+- name: vext-pentest-7b
+  results: []
+---
+# VEXT Pentest-7B -- The First Open-Source Security AI Model
+**Pentest-7B** is a 7-billion-parameter language model fine-tuned specifically for offensive security, penetration testing, and vulnerability analysis. Built by [VEXT Labs](https://tryvext.com) on top of [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) and trained on **260,000+ curated security examples** drawn from real-world engagements, this is the first open-weight model purpose-built for the security profession.
+Pentest-7B runs on a single consumer GPU (16 GB VRAM), a MacBook with 16 GB RAM via Ollama, or CPU-only with quantized weights. No API keys, no cloud dependency, no data leaves your machine.
+## Key Capabilities
+| Capability | Description |
+|---|---|
+| **Vulnerability Explanation** | Given a CVE ID, CWE, or raw scan output, produce a clear technical explanation of the vulnerability, its root cause, and real-world impact. |
+| **Pentest Report Writing** | Generate executive summaries, technical finding write-ups, risk ratings, and remediation sections in standard pentest report format. |
+| **Attack Strategy Planning** | Given a target technology stack, suggest prioritized attack paths aligned with MITRE ATT&CK and OWASP Testing Guide methodologies. |
+| **Remediation Guidance** | Provide specific, actionable fix recommendations with code examples for common vulnerability classes. |
+| **Compliance Assessment** | Map findings to compliance frameworks (PCI DSS, SOC 2, HIPAA, ISO 27001) and articulate control gaps. |
+| **Threat Briefing** | Summarize threat intelligence, emerging CVEs, and APT campaign TTPs for stakeholder communication. |
+| **Security Code Review** | Analyze code snippets for injection flaws, authentication bypasses, insecure deserialization, and other OWASP Top 10 issues. |
+## Training
+### Data
+Pentest-7B was trained on **260,000+ curated examples** spanning:
+- **Production pentesting traces** -- Real (anonymized) action-observation pairs from VEXT's autonomous security agents running against authorized bug bounty targets. Includes successful exploitation chains, false positive patterns, and tool output interpretation.
+- **CTF challenge solutions** -- Structured walkthroughs from capture-the-flag competitions covering web, pwn, crypto, reverse engineering, and forensics categories.
+- **Bug bounty write-ups** -- Public responsible disclosure reports with structured vulnerability descriptions, reproduction steps, and impact assessments.
+- **MITRE ATT&CK corpus** -- Technique descriptions, procedure examples, detection guidance, and mitigation strategies across all 14 tactics.
+- **OWASP materials** -- Testing Guide procedures, ASVS requirements, cheat sheets, and vulnerability classifications.
+- **CVE analysis** -- Detailed analysis of 50,000+ CVEs including root cause, affected versions, exploit conditions, and patch diffs.
+- **DPO preference pairs** -- 2,000+ pairs where validated real findings are preferred over false positives, teaching the model to distinguish true vulnerabilities from noise.
+**What is NOT in the training data:** Raw exploit code, weaponized payloads, malware source, credentials, PII, or any data that could be directly used for unauthorized access. The model is trained to *reason about* security, not to serve as an exploit toolkit.
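To make the preference signal concrete, the sketch below shows how one chosen/rejected pair could feed the standard sigmoid DPO loss. The record fields (`prompt`, `chosen`, `rejected`) and the log-probability values are hypothetical and for illustration only; `beta=0.1` is the value listed in the training pipeline. This is not VEXT's training code.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """Hypothetical record layout for one DPO example."""
    prompt: str    # scan output or task description
    chosen: str    # write-up of a validated finding
    rejected: str  # write-up of a false positive


def dpo_sigmoid_loss(policy_chosen_logp, policy_rejected_logp,
                     ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for one pair, given summed sequence log-probabilities."""
    # How much more the policy favors each answer than the frozen reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) is the same as log(1 + exp(-x)).
    return math.log1p(math.exp(-logits))


pair = PreferencePair(
    prompt="Scanner flagged possible SQL injection on /search?q=",
    chosen="Confirmed: boolean-based injection, reproduced manually.",
    rejected="Reported as SQL injection, but the response difference was a cache artifact.",
)

# Made-up log-probabilities: the policy already leans toward the validated finding.
loss = dpo_sigmoid_loss(-5.0, -20.0, -10.0, -10.0)
print(f"loss for one pair: {loss:.4f}")
```

When the policy and the reference agree, the loss equals log 2. It shrinks as the policy assigns relatively more probability to the validated finding than to the false positive.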
+### Architecture and Training Pipeline
+```
+Qwen2.5-7B-Instruct (base)
+    |
+    v
+QLoRA Fine-Tuning (SFT)
+  - Rank: 16, Alpha: 32
+  - Target modules: q_proj, k_proj, v_proj, o_proj
+  - 3 epochs, effective batch size 32
+  - Max sequence length: 4096 tokens
+  - Learning rate: 2e-4, cosine schedule
+    |
+    v
+DPO Alignment
+  - Beta: 0.1, sigmoid loss
+  - 1 epoch, learning rate 5e-6
+  - Preference signal: validated findings (chosen) vs false positives (rejected)
+    |
+    v
+Adapter Merge + AWQ 4-bit Quantization (optional)
+    |
+    v
+VEXT Pentest-7B
+```
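As a rough check on what the SFT stage trains, the sketch below counts the LoRA parameters implied by rank 16 on the four attention projections. The layer dimensions come from the published Qwen2.5-7B configuration (hidden size 3584, 28 layers, 4 key/value heads of dimension 128) and are an assumption here, since this card does not list them.

```python
# Dimensions from the public Qwen2.5-7B config (assumed, not stated in this card).
HIDDEN = 3584
LAYERS = 28
HEAD_DIM = 128
KV_HEADS = 4

RANK, ALPHA = 16, 32  # from the pipeline above

# (in_features, out_features) for each targeted projection.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, KV_HEADS * HEAD_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) per wrapped linear layer."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
total = per_layer * LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {total:,}")
print(f"LoRA scaling (alpha / rank): {scaling}")
```

Under these assumptions the adapters come to roughly 10 million trainable parameters, a small fraction of the 7-billion-parameter base model.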
+### Hardware
+- SFT: 8x NVIDIA A100 40GB (SageMaker ml.p4d.24xlarge), ~18 hours
+- DPO: 8x NVIDIA A100 40GB, ~4 hours
+- Quantization: Single A10G 24GB (SageMaker ml.g5.2xlarge)
+## Usage
+### Transformers (Unquantized)
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_id = "vext-labs/pentest-7b"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype="auto",
+    device_map="auto",
+)
+messages = [
+    {
+        "role": "system",
+        "content": (
+            "You are an expert penetration tester and security analyst. "
+            "Provide detailed, technically accurate security guidance."
+        ),
+    },
+    {
+        "role": "user",
+        "content": (
+            "I found a reflected XSS in a search parameter on an e-commerce site "
+            "during a bug bounty engagement. The input is reflected inside a "
+            "JavaScript string literal in the response. Write the finding for my "
+            "pentest report, including severity rating, impact, and remediation."
+        ),
+    },
+]
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=1024,
+    do_sample=True,  # temperature and top_p only take effect when sampling is on
+    temperature=0.7,
+    top_p=0.9,
+    repetition_penalty=1.1,
+)
+response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
+print(response)
+```
+### vLLM (Production Serving)
+```python
+from vllm import LLM, SamplingParams
+llm = LLM(
+    model="vext-labs/pentest-7b",
+    tensor_parallel_size=1,       # single GPU
+    max_model_len=4096,
+    gpu_memory_utilization=0.90,
+)
+sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)
+prompts = [
+    "Explain CVE-2024-3094 (XZ Utils backdoor) — root cause, impact, and detection methods.",
+    "Given an exposed .git directory on a production web server, outline an attack plan.",
+]
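+# Wrap each prompt as a single-turn conversation so the model's chat template is applied
+conversations = [[{"role": "user", "content": prompt}] for prompt in prompts]
+outputs = llm.chat(conversations, sampling_params=sampling)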
+for output in outputs:
+    print(output.outputs[0].text)
+```
+**OpenAI-compatible API with vLLM:**
+```bash
+vllm serve vext-labs/pentest-7b --port 8000
+```
+```python
+from openai import OpenAI
+client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
+response = client.chat.completions.create(
+    model="vext-labs/pentest-7b",
+    messages=[
+        {"role": "system", "content": "You are a senior penetration tester."},
+        {"role": "user", "content": "Analyze this Nmap output and suggest next steps:\n\nPORT    STATE SERVICE  VERSION\n22/tcp  open  ssh      OpenSSH 7.4\n80/tcp  open  http     Apache 2.4.6\n443/tcp open  ssl/http Apache 2.4.6\n3306/tcp open mysql    MySQL 5.7.38"},
+    ],
+    temperature=0.7,
+    max_tokens=1024,
+)
+print(response.choices[0].message.content)
+```
+### Ollama (Local, Quantized)
+```bash
+# Pull the model (GGUF Q4_K_M quantization, ~4.5 GB)
+ollama pull vext-labs/pentest-7b
+# Interactive chat
+ollama run vext-labs/pentest-7b
+# API
+curl http://localhost:11434/api/chat -d '{
+  "model": "vext-labs/pentest-7b",
+  "messages": [
+    {"role": "user", "content": "What are the top 5 things to check when auditing a JWT implementation?"}
+  ]
+}'
+```
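The same chat endpoint can be called from Python. The sketch below uses only the standard library and assumes an Ollama server on its default port with the model already pulled. `build_chat_request` and `ask` are illustrative helper names, not part of any VEXT package. Setting `"stream": false` asks Ollama for a single JSON response instead of a stream of chunks.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "vext-labs/pentest-7b"


def build_chat_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build (but do not send) a non-streaming chat request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one JSON object instead of newline-delimited chunks
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, timeout=120):
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["message"]["content"]


# Example (requires a running Ollama server):
# print(ask("What are the top 5 things to check when auditing a JWT implementation?"))
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything is sent.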
+### Docker (Isolated Serving)
+```bash
+docker run --gpus all -p 8000:8000 \
+  ghcr.io/vext-labs/pentest-7b:latest \
+  --model vext-labs/pentest-7b --port 8000
+```
+## Telemetry
+Pentest-7B includes an **opt-in** telemetry collector to help us improve the model. It is **off by default** and collects only anonymized aggregate statistics (vulnerability categories, tool success rates, session metadata). It **never** collects URLs, IPs, credentials, vulnerability details, request/response bodies, file paths, or user identity.
+```bash
+# Enable (opt-in)
+export VEXT_TELEMETRY=on
+# Disable (default)
+export VEXT_TELEMETRY=off
+# See exactly what is collected
+python -c "from vext_telemetry import what_we_collect; what_we_collect()"
+```
+Full telemetry source code is included in the repository for audit: [`telemetry/collector.py`](telemetry/collector.py).
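The opt-in behavior described above can be pictured with a minimal sketch. This is not the code in `telemetry/collector.py`; the function names and the allow-listed field names are hypothetical, chosen to mirror the categories listed in this section.

```python
import os

# Hypothetical allow-list mirroring the aggregate categories named above.
ALLOWED_FIELDS = {
    "vulnerability_category",
    "tool_name",
    "tool_succeeded",
    "session_duration_s",
}


def telemetry_enabled(environ=os.environ):
    """Telemetry is on only when VEXT_TELEMETRY is explicitly set to 'on'."""
    return environ.get("VEXT_TELEMETRY", "off").strip().lower() == "on"


def scrub(event):
    """Keep allow-listed fields only; everything else is dropped, not redacted."""
    return {key: value for key, value in event.items() if key in ALLOWED_FIELDS}


def maybe_record(event, sink, environ=os.environ):
    """Append a scrubbed event to `sink` only if the user opted in."""
    if not telemetry_enabled(environ):
        return False
    sink.append(scrub(event))
    return True
```

An allow-list fails closed: a field that nobody anticipated, such as a URL added by a new tool, is dropped by default and does not leak.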
+## Evaluation
+| Benchmark | Pentest-7B | Qwen2.5-7B-Instruct (base) | GPT-4o (API) |
+|---|---|---|---|
+| SecBench (vuln classification) | **82.4%** | 61.2% | 79.8% |
+| CyberMetric (security knowledge) | **74.1%** | 52.7% | 71.3% |
+| PentestQA (methodology) | **88.6%** | 44.3% | 83.1% |
+| Finding Quality (human eval, 1-5) | 4.2 | 2.1 | **4.4** |
+| False Positive Rate (lower is better) | **12.3%** | 41.7% | 15.8% |
+*Bold marks the best result in each row. Benchmarks run with temperature=0 (greedy decoding). Human evaluation by 3 senior pentesters on 200 randomly sampled findings.*
+## Intended Use
+This model is built for **authorized security professionals**:
+- Penetration testers writing reports and planning engagements
+- Bug bounty hunters analyzing targets and drafting submissions
+- Security engineers triaging vulnerabilities and planning remediation
+- SOC analysts interpreting alerts and assessing threat severity
+- Compliance teams mapping findings to regulatory frameworks
+- Security researchers studying vulnerability patterns
+## Limitations and Responsible Use
+- **Not a replacement for human expertise.** Always validate model outputs with manual testing and professional judgment.
+- **Authorization required.** Do not use this model's output to test systems without explicit written authorization from the system owner.
+- **No guarantee of accuracy.** The model can hallucinate CVE details, suggest inapplicable techniques, or miss critical context. Treat outputs as a starting point, not a final answer.
+- **Scope of training.** The model is strongest on web application security, network infrastructure, and common vulnerability classes. It has limited depth on hardware security, ICS/SCADA, mobile reversing, and cryptographic implementation review.
+- **Not an exploit generator.** The model is trained to reason about security concepts, not to produce weaponized code. Attempts to extract raw exploit payloads will produce lower-quality outputs by design.
+## License
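+Apache 2.0. Use it, modify it, deploy it commercially. Redistributions must include the license text and retain existing notices, as Apache 2.0 requires; further attribution is appreciated but not required.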
+## Citation
+```bibtex
+@misc{vext-pentest-7b-2026,
+  title   = {VEXT Pentest-7B: An Open-Source Language Model for Penetration Testing and Security Analysis},
+  author  = {VEXT Labs},
+  year    = {2026},
+  url     = {https://huggingface.co/vext-labs/pentest-7b},
+  note    = {Fine-tuned from Qwen2.5-7B-Instruct on 260K+ curated security examples with QLoRA SFT and DPO alignment},
+}
+```
+## Links
+- **VEXT Platform:** [https://tryvext.com](https://tryvext.com)
+- **GitHub:** [https://github.com/vext-labs/pentest-7b](https://github.com/vext-labs/pentest-7b)
+- **Discord:** [https://discord.gg/vext-security](https://discord.gg/vext-security)
+- **Paper (coming soon):** Technical report with full training methodology and ablation studies
+## Built By
+[VEXT Labs, Inc.](https://tryvext.com) -- Building autonomous security testing infrastructure. Pentest-7B is the open-source foundation of our platform's security reasoning capabilities.
+---
+*If you use Pentest-7B in your research or product, we would love to hear about it. Open an issue or reach out at [oss@tryvext.com](mailto:oss@tryvext.com).*