Qwen2.5-Coder-7B-Instruct-Nuclei

Model Details

Model Description

This model is a fine-tuned version of Qwen2.5-Coder-7B-Instruct, specialized in generating Nuclei YAML templates for vulnerability detection. It was created as part of a bachelor’s thesis on automating Nuclei template creation using large language models.

The model receives a structured JSON description of a vulnerability (CVE ID, affected product, protocol details, exploitation summary, detection logic, etc.) and outputs a Nuclei YAML template intended for direct use with the Nuclei scanner. It has been trained to follow the syntax of Nuclei templates, including HTTP request definitions, matchers, extractors, and OAST (Interactsh) integration.
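
As an illustration of the input described above, the sketch below builds one possible vulnerability specification in Python and serializes it to JSON. Only `request_shape.primary_path`, `auth_required`, `template_context.flow`, and `exploitation_summary` are field names that appear in the system prompt on this card; every other key, and all values, are assumptions made for the example and may not match the dataset schema.

```python
import json

# Hypothetical vulnerability specification. Keys other than auth_required,
# request_shape.primary_path, template_context.flow and exploitation_summary
# are illustrative guesses, not the confirmed dataset schema.
vulnerability_spec = {
    "cve_id": "CVE-2023-43654",
    "product": "PyTorch TorchServe",
    "protocol": "http",
    "auth_required": False,
    "request_shape": {
        "method": "POST",
        "primary_path": "/models?url=http://{{interactsh-url}}",
    },
    "template_context": {"flow": "single-step"},
    "exploitation_summary": "SSRF through the url query parameter of the model registration endpoint.",
    "detection_logic": "Out-of-band HTTP interaction received from the target.",
}

# The serialized string is what would be placed in the "Input" part of the prompt.
input_json = json.dumps(vulnerability_spec, indent=2)
print(input_json)
```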

Fine-tuned by: NormanRey;
Model type: Causal language model fine-tuned with instruction tuning;
Language(s): English (prompts), YAML (output);
License: Apache 2.0;
Finetuned from: Qwen/Qwen2.5-Coder-7B-Instruct

Also check out NormanRey/Qwen2.5-Coder-3B-Instruct-Nuclei.

Uses

Direct Use

The model is intended to be used directly for generating Nuclei templates from vulnerability descriptions. It can be integrated into security automation pipelines, threat intelligence platforms, or used by penetration testers to quickly create detection rules for newly disclosed CVEs.

Downstream Use

When further fine-tuned on additional vulnerability classes (e.g., DNS, TCP, JavaScript), the model could cover the full Nuclei template ecosystem. The adapter weights are specific to Qwen2.5-Coder-7B-Instruct, but the same fine-tuning recipe can be reapplied to newer versions of Qwen-Coder for continuous improvement.

Out-of-Scope Use

The model should not be used for:

Generating offensive payloads or exploit code;
Any malicious activity not related to authorized security testing;
Producing templates for CVEs that require complex multi-step authentication flows without proper validation.

Bias, Risks, and Limitations

The model was trained on a dataset of 2350 unique HTTP-based Nuclei templates. As a result, it may:

Perform poorly on non-HTTP protocols (DNS, TCP, etc.);
Occasionally produce syntactically correct but logically flawed matchers;
Generate templates that require minor manual corrections (e.g., quoting strings with colons).
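
The quoting issue mentioned above can be caught before running the scanner. The following is a minimal heuristic sketch, not a YAML parser: it flags unquoted list items containing `: ` under keys that normally hold plain strings. The function name and the set of keys are assumptions chosen for illustration.

```python
import re

# Keys whose list items are expected to be plain strings (illustrative selection).
STRING_LIST_KEYS = {"words", "regex", "dsl"}
_KEY_ONLY = re.compile(r"^([A-Za-z_-]+):\s*$")

def find_unquoted_colon_items(yaml_text):
    """Return (line_number, line) pairs for list items that probably need quotes."""
    suspicious = []
    current_key = None
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        key_match = _KEY_ONLY.match(stripped)
        if key_match:
            # A bare "key:" line opens a block; remember which key it was.
            current_key = key_match.group(1)
        elif stripped.startswith("- "):
            value = stripped[2:].strip()
            unquoted = value[:1] not in ("'", '"')
            if current_key in STRING_LIST_KEYS and unquoted and ": " in value:
                suspicious.append((number, stripped))
        elif stripped:
            # Any other non-empty line ends the current list.
            current_key = None
    return suspicious

sample = 'words:\n  - User-Agent: Java\n  - "Host: example"\n'
print(find_unquoted_colon_items(sample))
```

Flagged lines can then be wrapped in quotes by hand before the template is validated.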

Recommendations

Always validate generated templates with nuclei -validate and, where possible, test them against known vulnerable instances. The model is a powerful assistant but not a replacement for human review.
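
The validation step can be scripted. This is a minimal sketch that calls the `nuclei` binary with the `-validate` and `-t` flags through `subprocess`. Treating a non-zero exit code as a failed validation is an assumption about the CLI's behavior, and the helper names are hypothetical.

```python
import shutil
import subprocess

def build_validate_command(template_path, binary="nuclei"):
    """Build the argument list for validating a single template file."""
    return [binary, "-validate", "-t", str(template_path)]

def validate_template(template_path):
    """Return True or False for pass or fail, or None if nuclei is not installed."""
    if shutil.which("nuclei") is None:
        return None
    result = subprocess.run(
        build_validate_command(template_path),
        capture_output=True,
        text=True,
        timeout=120,
    )
    # Assumption: nuclei exits with a non-zero status when validation fails.
    return result.returncode == 0

print(build_validate_command("generated/CVE-2023-43654.yaml"))
```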

Example of output

Generated template for CVE-2023-43654:

id: CVE-2023-43654
info:
  name: PyTorch TorchServe - Server-Side Request Forgery (SSRF)
  author: nuclei-generator
  severity: high
  description: 'PyTorch TorchServe versions from 0.1.0 to 0.8.1 are vulnerable to server-side
    request forgery (SSRF) via unrestricted model URL loading.

    '
  impact: 'Unauthenticated attackers can exploit SSRF vulnerabilities in PyTorch
    TorchServe to access internal resources, potentially leading to data leakage,
    unauthorized access, or remote code execution.

    '
  remediation: 'Upgrade to PyTorch TorchServe version 0.8.2 or later that restricts
    allowed URLs and validates model definitions before downloading them.

    '
  reference:
  - https://huntr.com/bounties/1695374e-6c7a-4b3d-bf96-000000000000/
  - https://nvd.nist.gov/vuln/detail/CVE-2023-43654
  classification:
    cvss-metrics: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N
    cvss-score: 8.6
    cve-id: CVE-2023-43654
    epss-score: 0.94743
    epss-percentile: 0.99919
    cwe-id: CWE-918
  metadata:
    verified: true
    max-request: 1
    shodan-query: http.title:"TorchServe"
  tags: cve,cve2023,pytorch,torchserve,oast,ssrf,vkev,vuln
http:
- raw:
  - 'POST /models?url=http://{{interactsh-url}} HTTP/1.1

    Host: {{Hostname}}

    Content-Type: application/x-www-form-urlencoded


    '
  matchers-condition: and
  matchers:
  - type: word
    part: interactsh_protocol
    words:
    - http
  - type: word
    part: interactsh_request
    words:
    - "User-Agent: Java"
  - type: word
    part: header
    name: content-type
    words:
    - application/json

How to Get Started with the Model

Using with Hugging Face Transformers

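from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "NormanRey/Qwen2.5-Coder-7B-Instruct-Nuclei"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package and places the weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The "..." placeholders stand for the full system prompt, instruction, and vulnerability JSON.
prompt = "### System:\nYou are a precise Nuclei YAML template generator...\n\n### Instruction:\nGenerate a complete Nuclei template...\n\n### Input:\n{...vulnerability JSON...}\n\n### Response:\n"
# Move the tokenized prompt to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=7300)
# Decode only the newly generated tokens so the prompt is not echoed before the YAML.
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))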

Full system prompt:

You are a Nuclei template generator.
Return only valid YAML for a single Nuclei template.

MANDATORY RULES (MUST FOLLOW)
   - The path MUST be taken EXACTLY from request_shape.primary_path.
   - DO NOT modify, generalize, or replace paths (no variables like {{BaseURL}}/api/ except the original one).
   - DO NOT introduce additional endpoints not present in input.
   - If auth_required is false → DO NOT add any login, session, or credential requests.
   - Do not inject cookies or headers that imply authentication unless explicitly specified.
   - The number of HTTP requests MUST match template_context.flow (e.g., "single-step" → one request).
   - Prefer copying the provided description verbatim instead of rewriting.
   - Keep all metadata (name, severity, classification, etc.) under the "info" block.
   - Do not omit existing query parameters described in exploitation_summary or request_shape.
   - Output YAML only – no explanations, no markdown, no extra text.
    - You MAY use matchers of type "word" or "status" for simple checks (e.g., single word, fixed status code).
    - For complex conditions (OR / AND across multiple indicators, combined checks on status + body + header), you MUST use a SINGLE matcher of type "dsl".
        - Put all required expressions inside the "dsl" array.
        - Use "condition: and" or "condition: or" as needed.
    - Examples of DSL expressions:
        - status_code == 200
        - contains(body, "secret")
        - contains_all(body, "admin", "config")
        - contains(to_lower(body), "error")
        - len(body) > 100
    - Keep matchers deterministic – avoid regex if a simple word/contains works.
    - Use the exact Nuclei schema.
    - Do not invent fields.
    - "http" must be a top-level block.
    - Use "raw" requests only if needed for exploit templates; otherwise prefer "http" with method + path.

Generate the YAML template based on the instruction and input below.
Do not deviate from the rules above.
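
Outside the prompt itself, the "single DSL matcher" rule can be illustrated programmatically. The sketch below builds one matcher of type `dsl` from a list of expressions and prints it as JSON, which is also valid YAML. The helper name `dsl_matcher` is hypothetical and is not part of the training pipeline.

```python
import json

def dsl_matcher(expressions, condition="and"):
    """Combine several DSL expressions into a single Nuclei matcher mapping."""
    if condition not in ("and", "or"):
        raise ValueError("condition must be 'and' or 'or'")
    if not expressions:
        raise ValueError("at least one DSL expression is required")
    return {"type": "dsl", "dsl": list(expressions), "condition": condition}

# Status and body checks merged into one matcher instead of two separate ones.
matcher = dsl_matcher(["status_code == 200", 'contains(body, "secret")'])
print(json.dumps({"matchers": [matcher]}, indent=2))
```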

Using with Ollama (Local Inference)

To run the model locally with Ollama, first convert it to GGUF format and reference the resulting file from a Modelfile.

Create the Ollama model from the Modelfile and run it:

ollama create qwen-7b-nuclei -f Modelfile
ollama run qwen-7b-nuclei
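
The Modelfile referenced above is not included in this card. A minimal sketch of how one could be generated is shown below. The GGUF file name, the context size of 8192, and the temperature of 0.0 are assumptions; only the `FROM` and `PARAMETER` instructions themselves are standard Modelfile syntax.

```python
def render_modelfile(gguf_path, num_ctx=8192, temperature=0.0):
    """Render a minimal Ollama Modelfile for a local GGUF file."""
    lines = [
        f"FROM {gguf_path}",
        # Context window chosen to fit the 7300-token training sequences (assumption).
        f"PARAMETER num_ctx {num_ctx}",
        # Low temperature to keep the generated YAML deterministic (assumption).
        f"PARAMETER temperature {temperature}",
    ]
    return "\n".join(lines) + "\n"

# Hypothetical file name; use the path of your own converted GGUF file.
modelfile_text = render_modelfile("./qwen2.5-coder-7b-nuclei.gguf")
print(modelfile_text)
```

Saving the printed text as a file named Modelfile makes it usable with the `ollama create` command. Depending on the prompt template Ollama applies, the `### System:` / `### Instruction:` / `### Input:` / `### Response:` layout used in training may need to be reproduced in the prompt.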

Training Details

Training Data

The model was trained on the NormanRey/nuclei-template-generation-dataset-2.3K, a custom dataset created from the official Nuclei Templates repository.

It contains 2350 training examples, each consisting of:

instruction: A fixed prompt instructing the model to generate a Nuclei template;
input: A JSON object describing a specific vulnerability (CVE ID, protocol, path, exploitation details, etc.), generated by a summarization LLM from the reference links of the original template;
output: The corresponding valid Nuclei YAML template.

Dataset Construction Pipeline

The dataset was constructed via an automated pipeline that:

1. Parsed approximately 6360 HTTP-based Nuclei templates from five categories:
   - CVEs;
   - Exposures;
   - Misconfigurations;
   - Miscellaneous;
   - Vulnerabilities.
2. Fetched and cleaned the textual content of external references:
   - Security advisories;
   - Exploit-DB pages;
   - NVD entries.
3. Used an LLM summarizer to convert the collected information and original YAML templates into a normalized JSON vulnerability specification.
4. Applied an LLM-as-a-Judge filtering stage, retaining only examples with an average score of 7.0 or higher for:
   - Technical completeness;
   - Alignment with the original template;
   - Reconstruction sufficiency.
5. Deduplicated the resulting dataset and split it into:
   - Train: 80%;
   - Validation: 10%;
   - Test: 10%.
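
The filtering, deduplication, and splitting stages can be sketched as follows. This is an illustration under stated assumptions rather than the actual pipeline code: the `judge_scores` and `output` field names, deduplication by exact template text, and the shuffle seed are all hypothetical.

```python
import random

def average_score(scores):
    """Mean of the judge scores for one example."""
    return sum(scores.values()) / len(scores)

def filter_dedup_split(examples, threshold=7.0, seed=42):
    """Keep well-scored examples, drop exact duplicates, and split 80/10/10."""
    kept = [ex for ex in examples if average_score(ex["judge_scores"]) >= threshold]

    # Assumption: two examples are duplicates when their YAML output is identical.
    seen, unique = set(), []
    for ex in kept:
        if ex["output"] not in seen:
            seen.add(ex["output"])
            unique.append(ex)

    random.Random(seed).shuffle(unique)
    n_train = int(len(unique) * 0.8)
    n_val = int(len(unique) * 0.1)
    return {
        "train": unique[:n_train],
        "validation": unique[n_train:n_train + n_val],
        "test": unique[n_train + n_val:],
    }
```

Each example is expected to be a dictionary with a `judge_scores` mapping (one entry per criterion) and an `output` string.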

Training Procedure

The model was fine-tuned using Weight-Decomposed Low-Rank Adaptation (DoRA) without 4-bit quantization (full bfloat16 precision) on an NVIDIA A100 80 GB GPU.

Preprocessing

Each example was formatted into a chat-style structure:

### System:
{system_prompt}

### Instruction:
{instruction}

### Input:
{input_json}

### Response:
{output_yaml}
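
A minimal sketch of this formatting step is shown below. The function names are hypothetical. At inference time the prompt ends after the `### Response:` marker, while a training example additionally appends the target YAML (and, as an assumption here, an end-of-sequence token).

```python
PROMPT_TEMPLATE = (
    "### System:\n{system_prompt}\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input_json}\n\n"
    "### Response:\n"
)

def build_prompt(system_prompt, instruction, input_json):
    """Format the inference prompt; the model continues after '### Response:'."""
    return PROMPT_TEMPLATE.format(
        system_prompt=system_prompt,
        instruction=instruction,
        input_json=input_json,
    )

def build_training_text(system_prompt, instruction, input_json, output_yaml, eos_token=""):
    """Format a full training example: prompt followed by the target YAML."""
    return build_prompt(system_prompt, instruction, input_json) + output_yaml + eos_token

print(build_prompt("You are a Nuclei template generator.", "Generate a template.", "{}"))
```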

Training Hyperparameters

Parameter	Value
Training regime	bfloat16 mixed precision
LoRA rank (r)	64
LoRA alpha	128
LoRA dropout	0.05
Target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Effective batch size	32 (1 GPU × 32 gradient accumulation steps)
Learning rate	1e-4
LR scheduler	Cosine with 30 warmup steps
Optimizer	paged_adamw_8bit
Max sequence length	7300 tokens
Epochs	3
Seed	42
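
For reference, the adapter settings in the table could be expressed as a PEFT configuration roughly as follows. This is a sketch, not the training script: `use_dora=True` reflects the DoRA method named above, `task_type` is an assumption, and the `LoraConfig` object is only built if the `peft` package is installed.

```python
# Adapter settings taken from the hyperparameter table.
adapter_kwargs = {
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "use_dora": True,           # weight-decomposed LoRA (DoRA)
    "task_type": "CAUSAL_LM",   # assumption: causal language modeling task
}

# Standard LoRA scaling factor applied to the low-rank update: alpha / r.
scaling = adapter_kwargs["lora_alpha"] / adapter_kwargs["r"]
print(f"LoRA scaling factor: {scaling}")

try:
    from peft import LoraConfig
    adapter_config = LoraConfig(**adapter_kwargs)
except ImportError:
    adapter_config = None  # peft is not installed; the plain dict is still usable
```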

Speeds, Sizes, and Resource Usage

Metric	Value
Training duration	1.5 hours
Adapter size (LoRA)	310 MB
GPU memory allocated	15.57 GB
GPU memory reserved	16.01 GB

Evaluation

Testing Data

Evaluation was conducted on a held-out test set of 25 CVEs that were not included during training.

Each test sample was paired with its corresponding original template from the official Nuclei repository.

Evaluation Metrics

Three custom metrics were scored by an LLM-as-a-Judge (GPT-5.4-mini) on a scale from 0 to 10:

Structural Integrity

Measures:

YAML syntactic correctness;
Presence of mandatory sections;
Overall template structure.

Detection Fidelity

Measures:

Accuracy of generated HTTP requests;
Correctness of matchers;
OAST interaction logic;
Similarity to the original template detection workflow.

Metadata Completeness

Measures:

CVE identifiers;
Severity information;
References;
Tags and auxiliary metadata.

Additional Validation

All generated templates were validated using:

nuclei -validate

with pass/fail reporting.

Results

Model	Structural Integrity	Detection Fidelity	Metadata Completeness	Nuclei Validation Rate
Qwen2.5-Coder-7B-Instruct-Nuclei	6.08	2.56	4.84	96% (24/25)
Qwen2.5-Coder-3B-Instruct-Nuclei	3.92	1.56	3.20	16% (4/25)
ChatGPT-5.4	5.76	2.12	4.24	64% (16/25)
Qwen2.5-Coder-7B-Instruct	1.92	1.04	2.60	0% (0/25)
Qwen2.5-Coder-3B-Instruct	2.20	0.96	1.80	0% (0/25)
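
The validation rates and the gain over the base model can be recomputed from the table with a few lines of Python. The dictionary keys are shortened labels chosen for this example.

```python
TEST_SET_SIZE = 25

# Number of templates that passed nuclei -validate, from the results table.
validation_passes = {
    "Qwen2.5-Coder-7B-Instruct-Nuclei": 24,
    "Qwen2.5-Coder-3B-Instruct-Nuclei": 4,
    "ChatGPT-5.4": 16,
    "Qwen2.5-Coder-7B-Instruct": 0,
    "Qwen2.5-Coder-3B-Instruct": 0,
}
rates = {name: passed / TEST_SET_SIZE for name, passed in validation_passes.items()}

# Judge scores (0 to 10) for the fine-tuned 7B model and its base model.
fine_tuned = {"structural_integrity": 6.08, "detection_fidelity": 2.56, "metadata_completeness": 4.84}
base = {"structural_integrity": 1.92, "detection_fidelity": 1.04, "metadata_completeness": 2.60}
gains = {metric: round(fine_tuned[metric] - base[metric], 2) for metric in fine_tuned}

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
print(gains)
```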

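On this 25-CVE test set, the fine-tuned 7B model scores higher than its base model on every metric (for example, structural integrity of 6.08 vs. 1.92 and a Nuclei validation rate of 96% vs. 0%). It also scores higher than ChatGPT-5.4, in particular on:

Detection fidelity (2.56 vs. 2.12);
Structural integrity (6.08 vs. 5.76);
Nuclei validation success rate (96% vs. 64%).

Absolute detection fidelity remains low for all evaluated models (at most 2.56 out of 10), so the generated detection logic still requires manual review.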

Summary

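The model automates the generation of Nuclei templates from structured vulnerability descriptions. In the evaluation above, 96% of its outputs passed nuclei -validate, while the detection fidelity score of 2.56 out of 10 indicates that the detection logic should be reviewed before use.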

It can serve as a practical assistant for:

Security researchers;
Penetration testers;
Vulnerability analysts;
Detection engineers.

Environmental Impact

Metric	Value
Hardware	NVIDIA A100 80 GB
Usage Time	1.5 hours
Cloud Provider	Google Colab

Technical Specifications

Model Architecture and Objective

The model uses the same architecture as Qwen2.5-Coder-7B-Instruct, a decoder-only Transformer.

Training objective:

Causal Language Modeling (CLM);
Loss computed exclusively on the response tokens.
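
Computing the loss only on response tokens is commonly implemented by masking the prompt positions in the label sequence. The sketch below shows the idea on plain Python lists; it assumes the usual convention that label value -100 is ignored by the cross-entropy loss in PyTorch and Transformers, and the function name is hypothetical.

```python
IGNORE_INDEX = -100  # label value skipped by the cross-entropy loss

def mask_prompt_labels(input_ids, prompt_length):
    """Copy input_ids into labels, hiding the first prompt_length positions."""
    if not 0 <= prompt_length <= len(input_ids):
        raise ValueError("prompt_length must be between 0 and len(input_ids)")
    return [IGNORE_INDEX] * prompt_length + list(input_ids[prompt_length:])

# Toy example: three prompt tokens followed by two response tokens.
labels = mask_prompt_labels([11, 12, 13, 21, 22], prompt_length=3)
print(labels)
```

Only the last two positions contribute to the loss in this toy example, which corresponds to the YAML that follows the `### Response:` marker.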

Compute Infrastructure

Hardware

Training

1 × NVIDIA A100 80 GB GPU
40 GB system RAM

Inference

The model can run on:

A GPU with 8-16 GB of memory, using 4-bit GGUF quantization;
A GPU with more than 16 GB of memory, using GGUF conversion without quantization.
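
These thresholds follow from a rough estimate of the weight memory. The sketch below assumes about 7.6 billion parameters for the Qwen2.5 7B architecture and about 4.5 bits per weight for a typical 4-bit GGUF quantization. Both figures are approximations, and the estimate excludes the KV cache and runtime overhead.

```python
def weight_memory_gib(n_params, bits_per_weight):
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

N_PARAMS = 7.6e9  # approximate parameter count (assumption)

fp16_gib = weight_memory_gib(N_PARAMS, 16)   # unquantized 16-bit weights
q4_gib = weight_memory_gib(N_PARAMS, 4.5)    # typical 4-bit GGUF average (assumption)

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f"4-bit weights:  {q4_gib:.1f} GiB")
```

The 16-bit estimate sits close to the 16 GB boundary, which is why unquantized inference is listed only for GPUs with more memory than that.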

Software

Component	Version
Transformers	5.0.0
PEFT	0.19.1
TRL	1.5.0
PyTorch	2.10.0+cu128
Datasets	4.0.0

