Qwen2.5-Coder-7B-Instruct-Nuclei
Model Details
Model Description
This model is a fine-tuned version of Qwen2.5-Coder-7B-Instruct, specialized in generating Nuclei YAML templates for vulnerability detection. It was created as part of a bachelor’s thesis on automating Nuclei template creation using large language models.
The model receives a structured JSON description of a vulnerability (CVE ID, affected product, protocol details, exploitation summary, detection logic, etc.) and outputs a fully valid Nuclei template ready for use with the Nuclei scanner. It has been trained to follow the exact syntax of Nuclei templates, including proper HTTP request definitions, matchers, extractors, and OAST (Interactsh) integration.
- Fine-tuned by: NormanRey;
- Model type: Causal language model, instruction fine-tuned;
- Language(s): English (prompts), YAML (output);
- License: Apache 2.0;
- Finetuned from: Qwen/Qwen2.5-Coder-7B-Instruct
Also check out NormanRey/Qwen2.5-Coder-3B-Instruct-Nuclei.
Uses
Direct Use
The model is intended to be used directly for generating Nuclei templates from vulnerability descriptions. It can be integrated into security automation pipelines, threat intelligence platforms, or used by penetration testers to quickly create detection rules for newly disclosed CVEs.
Downstream Use
When further fine-tuned on additional vulnerability classes (e.g., DNS, TCP, JavaScript), the model could cover the full Nuclei template ecosystem. The adapter weights can be merged with newer versions of Qwen-Coder for continuous improvement.
Out-of-Scope Use
The model should not be used for:
- Generating offensive payloads or exploit code;
- Any malicious activity not related to authorized security testing;
- Producing templates for CVEs that require complex multi-step authentication flows without proper validation.
Bias, Risks, and Limitations
The model was trained on a dataset of 2350 unique HTTP-based Nuclei templates. As a result, it may:
- Perform poorly on non-HTTP protocols (DNS, TCP, etc.);
- Occasionally produce syntactically correct but logically flawed matchers;
- Generate templates that require minor manual corrections (e.g., quoting strings with colons).
Recommendations
Always validate generated templates with nuclei -validate and, where possible, test them against known vulnerable instances. The model is a powerful assistant but not a replacement for human review.
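A minimal sketch of wiring this check into a script. It assumes the nuclei binary is on PATH and that a zero exit code means the template passed validation, which is worth confirming for your Nuclei version. The helper names and the file name are hypothetical.

```python
import shutil
import subprocess


def build_validate_command(template_path: str) -> list[str]:
    """Return the argv used to validate a single template file."""
    return ["nuclei", "-validate", "-t", template_path]


def validate_template(template_path: str) -> bool:
    """Run nuclei -validate; exit code 0 is treated as a pass (assumption)."""
    if shutil.which("nuclei") is None:
        raise RuntimeError("nuclei binary not found on PATH")
    result = subprocess.run(
        build_validate_command(template_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


# Only prints the command; call validate_template() to actually run Nuclei.
print(build_validate_command("CVE-2023-43654.yaml"))
```

Validation only checks syntax and schema, so a passing template should still be tested against a known vulnerable instance.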
Example of output
Generated template for CVE-2023-43654:
id: CVE-2023-43654
info:
  name: PyTorch TorchServe - Server-Side Request Forgery (SSRF)
  author: nuclei-generator
  severity: high
  description: 'PyTorch TorchServe versions from 0.1.0 to 0.8.1 are vulnerable to server-side
    request forgery (SSRF) via unrestricted model URL loading.

    '
  impact: 'Unauthenticated attackers can exploit SSRF vulnerabilities in PyTorch
    TorchServe to access internal resources, potentially leading to data leakage,
    unauthorized access, or remote code execution.

    '
  remediation: 'Upgrade to PyTorch TorchServe version 0.8.2 or later that restricts
    allowed URLs and validates model definitions before downloading them.

    '
  reference:
    - https://huntr.com/bounties/1695374e-6c7a-4b3d-bf96-000000000000/
    - https://nvd.nist.gov/vuln/detail/CVE-2023-43654
  classification:
    cvss-metrics: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N
    cvss-score: 8.6
    cve-id: CVE-2023-43654
    epss-score: 0.94743
    epss-percentile: 0.99919
    cwe-id: CWE-918
  metadata:
    verified: true
    max-request: 1
    shodan-query: http.title:"TorchServe"
  tags: cve,cve2023,pytorch,torchserve,oast,ssrf,vkev,vuln
http:
  - raw:
      - 'POST /models?url=http://{{interactsh-url}} HTTP/1.1

        Host: {{Hostname}}

        Content-Type: application/x-www-form-urlencoded

        '
    matchers-condition: and
    matchers:
      - type: word
        part: interactsh_protocol
        words:
          - http
      - type: word
        part: interactsh_request
        words:
          - "User-Agent: Java"
      - type: word
        part: header
        name: content-type
        words:
          - application/json
How to Get Started with the Model
Using with Hugging Face Transformers
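from transformers import AutoTokenizer, AutoModelForCausalLM
# If the repository ships only adapter weights, the peft package must be installed for loading to work.
tokenizer = AutoTokenizer.from_pretrained("NormanRey/Qwen2.5-Coder-7B-Instruct-Nuclei")
model = AutoModelForCausalLM.from_pretrained("NormanRey/Qwen2.5-Coder-7B-Instruct-Nuclei")
prompt = "### System:\nYou are a precise Nuclei YAML template generator...\n\n### Instruction:\nGenerate a complete Nuclei template...\n\n### Input:\n{...vulnerability JSON...}\n\n### Response:\n"
# Move the tokenized inputs to the same device as the model before generating.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=7300)
# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))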
Full system prompt:
You are a Nuclei template generator.
Return only valid YAML for a single Nuclei template.
MANDATORY RULES (MUST FOLLOW)
- The path MUST be taken EXACTLY from request_shape.primary_path.
- DO NOT modify, generalize, or replace paths (no variables like {{BaseURL}}/api/ except the original one).
- DO NOT introduce additional endpoints not present in input.
- If auth_required is false → DO NOT add any login, session, or credential requests.
- Do not inject cookies or headers that imply authentication unless explicitly specified.
- The number of HTTP requests MUST match template_context.flow (e.g., "single-step" → one request).
- Prefer copying the provided description verbatim instead of rewriting.
- Keep all metadata (name, severity, classification, etc.) under the "info" block.
- Do not omit existing query parameters described in exploitation_summary or request_shape.
- Output YAML only – no explanations, no markdown, no extra text.
- You MAY use matchers of type "word" or "status" for simple checks (e.g., single word, fixed status code).
- For complex conditions (OR / AND across multiple indicators, combined checks on status + body + header), you MUST use a SINGLE matcher of type "dsl".
- Put all required expressions inside the "dsl" array.
- Use "condition: and" or "condition: or" as needed.
- Examples of DSL expressions:
- status_code == 200
- contains(body, "secret")
- contains_all(body, "admin", "config")
- contains(to_lower(body), "error")
- len(body) > 100
- Keep matchers deterministic – avoid regex if a simple word/contains works.
- Use the exact Nuclei schema.
- Do not invent fields.
- "http" must be a top-level block.
- Use "raw" requests only if needed for exploit templates; otherwise prefer "http" with method + path.
Generate the YAML template based on the instruction and input below.
Do not deviate from the rules above.
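Some of the rules in the system prompt above can be spot-checked on the generated text before running the scanner. The sketch below is a heuristic, not part of the original pipeline: the function name is hypothetical, the checks are plain string tests rather than real YAML parsing, and the expectation that a template starts with its id field is an assumption.

```python
FENCE = "`" * 3  # markdown code fence marker


def check_output(raw_output: str, primary_path: str) -> list[str]:
    """Return a list of rule violations found in the model output (empty if none)."""
    problems = []
    text = raw_output.strip()
    if FENCE in text:
        problems.append("output contains a markdown code fence")
    if not text.startswith("id:"):
        problems.append("output does not start with the template id")
    if primary_path not in text:
        problems.append("primary_path is missing or was modified")
    if not any(line.startswith("http:") for line in text.splitlines()):
        problems.append('"http" is not a top-level block')
    return problems


sample = (
    "id: demo\n"
    "info:\n"
    "  name: Demo\n"
    "http:\n"
    "  - method: GET\n"
    "    path:\n"
    "      - '{{BaseURL}}/models'\n"
)
print(check_output(sample, "/models"))  # → []
```

An empty list only means these surface checks passed; it says nothing about whether the matchers are logically correct.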
Using with Ollama (Local Inference)
To use the model with Ollama, convert it to GGUF format and reference the resulting file from a Modelfile.
Create and run the model:
ollama create qwen-7b-nuclei -f Modelfile
ollama run qwen-7b-nuclei
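Once the model is created, it can also be queried programmatically. The following is a minimal sketch that assumes an Ollama server is running locally on its default port and that the model was created under the name qwen-7b-nuclei as above. The helper names are hypothetical, and only the standard library is used.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt: str, model: str = "qwen-7b-nuclei") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt to the server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling generate() with a prompt in the same format used for training should return the YAML template as a string, provided the server is reachable.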
Training Details
Training Data
The model was trained on NormanRey/nuclei-template-generation-dataset-2.3K, a custom dataset built from the official Nuclei Templates repository.
It contains 2350 training examples, each consisting of:
- instruction: A fixed prompt instructing the model to generate a Nuclei template;
- input: A JSON object describing a specific vulnerability (CVE ID, protocol, path, exploitation details, etc.), generated by a summarization LLM from the reference links of the original template;
- output: The corresponding valid Nuclei YAML template.
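To make the input field more concrete, here is an illustrative vulnerability specification. The keys request_shape.primary_path, auth_required, template_context.flow, and exploitation_summary are named in the system prompt; their exact nesting and all other keys are assumptions, and the values are loosely based on the CVE-2023-43654 example rather than taken from the dataset.

```python
import json

# Hypothetical input specification; not an actual dataset record.
vulnerability_spec = {
    "cve_id": "CVE-2023-43654",         # hypothetical key name
    "product": "PyTorch TorchServe",    # hypothetical key name
    "protocol": "http",                 # hypothetical key name
    "auth_required": False,
    "request_shape": {
        "method": "POST",               # hypothetical key name
        "primary_path": "/models",
    },
    "template_context": {"flow": "single-step"},
    "exploitation_summary": (
        "SSRF through the url query parameter used for model loading."
    ),
}

# The model receives the specification as serialized JSON text.
input_json = json.dumps(vulnerability_spec, indent=2)
print(input_json)
```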
Dataset Construction Pipeline
The dataset was constructed via an automated pipeline that:
1. Parsed approximately 6360 HTTP-based Nuclei templates from five categories:
   - CVEs;
   - Exposures;
   - Misconfigurations;
   - Miscellaneous;
   - Vulnerabilities.
2. Fetched and cleaned the textual content of external references:
   - Security advisories;
   - Exploit-DB pages;
   - NVD entries.
3. Used an LLM summarizer to convert the collected information and original YAML templates into a normalized JSON vulnerability specification.
4. Applied an LLM-as-a-Judge filtering stage, retaining only examples with an average score of 7.0 or higher for:
   - Technical completeness;
   - Alignment with the original template;
   - Reconstruction sufficiency.
5. Deduplicated the resulting dataset and split it into:
   - Train: 80%;
   - Validation: 10%;
   - Test: 10%.
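The filtering, deduplication, and splitting stages can be sketched as follows. This is an illustration under stated assumptions, not the original pipeline code: the record layout, the exact-match deduplication key, and the shuffle seed are all hypothetical.

```python
import random


def filter_dedupe_split(examples, min_avg=7.0, seed=42):
    """Filter by average judge score, drop duplicates, and split 80/10/10.

    Each example is a dict with 'output' (YAML text) and 'scores'
    (the three judge scores). This layout is an assumption.
    """
    kept, seen = [], set()
    for ex in examples:
        if sum(ex["scores"]) / len(ex["scores"]) < min_avg:
            continue  # rejected by the judge filter
        key = ex["output"].strip()
        if key in seen:
            continue  # exact duplicate of an earlier template
        seen.add(key)
        kept.append(ex)

    random.Random(seed).shuffle(kept)
    n_train = len(kept) * 8 // 10
    n_val = len(kept) // 10
    train = kept[:n_train]
    val = kept[n_train:n_train + n_val]
    test = kept[n_train + n_val:]
    return train, val, test
```

Integer arithmetic is used for the split sizes so that rounding never drops or duplicates an example; any remainder ends up in the test split.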
Training Procedure
The model was fine-tuned using Weight-Decomposed Low-Rank Adaptation (DoRA) without 4-bit quantization (full bfloat16 precision) on an NVIDIA A100 80 GB GPU.
Preprocessing
Each example was formatted into an instruction-style prompt structure:
### System:
{system_prompt}
### Instruction:
{instruction}
### Input:
{input_json}
### Response:
{output_yaml}
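A small helper can assemble this structure for both training and inference. The sketch below follows the section layout above and the blank-line separators used in the Transformers example; the function names are hypothetical.

```python
PROMPT_TEMPLATE = (
    "### System:\n{system_prompt}\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input_json}\n\n"
    "### Response:\n"
)


def build_prompt(system_prompt: str, instruction: str, input_json: str) -> str:
    """Build the inference prompt, ending right where the response should begin."""
    return PROMPT_TEMPLATE.format(
        system_prompt=system_prompt,
        instruction=instruction,
        input_json=input_json,
    )


def build_training_text(system_prompt: str, instruction: str,
                        input_json: str, output_yaml: str) -> str:
    """Build a full training example: the prompt followed by the target YAML."""
    return build_prompt(system_prompt, instruction, input_json) + output_yaml


print(build_prompt("S", "I", '{"cve_id": "CVE-0000-0000"}'))
```

Braces inside the JSON input are safe here because only the template string, not the substituted values, is parsed by str.format.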
Training Hyperparameters
| Parameter | Value |
|---|---|
| Training regime | bfloat16 mixed precision |
| LoRA rank (r) | 64 |
| LoRA alpha | 128 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Effective batch size | 32 (1 GPU × 32 gradient accumulation steps) |
| Learning rate | 1e-4 |
| LR scheduler | Cosine with 30 warmup steps |
| Optimizer | paged_adamw_8bit |
| Max sequence length | 7300 tokens |
| Epochs | 3 |
| Seed | 42 |
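The adapter settings in the table map onto a PEFT configuration roughly as follows. This is a sketch rather than the original training script: use_dora=True reflects the DoRA method named above, while the task_type value is an assumption. The peft import is deferred so the values can be inspected without the library installed.

```python
# Adapter hyperparameters taken from the table above.
dora_kwargs = {
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "use_dora": True,           # enables weight-decomposed LoRA in PEFT
    "task_type": "CAUSAL_LM",   # assumption, not stated in the table
}


def make_peft_config():
    """Build the PEFT config; requires the peft package."""
    from peft import LoraConfig
    return LoraConfig(**dora_kwargs)


# The low-rank update is scaled by lora_alpha / r.
scaling = dora_kwargs["lora_alpha"] / dora_kwargs["r"]
print(scaling)  # → 2.0
```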
Speeds, Sizes, and Resource Usage
| Metric | Value |
|---|---|
| Training duration | 1.5 hours |
| Adapter size (LoRA) | 310 MB |
| GPU memory allocated | 15.57 GB |
| GPU memory reserved | 16.01 GB |
Evaluation
Testing Data
Evaluation was conducted on a held-out test set of 25 CVEs that were not included during training.
Each test sample was paired with its corresponding original template from the official Nuclei repository.
Evaluation Metrics
Three custom metrics were scored by an LLM-as-a-Judge (GPT-5.4-mini) on a scale from 0 to 10:
Structural Integrity
Measures:
- YAML syntactic correctness;
- Presence of mandatory sections;
- Overall template structure.
Detection Fidelity
Measures:
- Accuracy of generated HTTP requests;
- Correctness of matchers;
- OAST interaction logic;
- Similarity to the original template detection workflow.
Metadata Completeness
Measures:
- CVE identifiers;
- Severity information;
- References;
- Tags and auxiliary metadata.
Additional Validation
All generated templates were validated using nuclei -validate, with pass/fail reporting.
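The per-sample judge scores and validation outcomes can be aggregated into one summary row per model. The sketch below assumes a hypothetical record layout with one dict per test sample; the sample data is synthetic and is not the actual evaluation output.

```python
from statistics import mean

METRICS = ("structural_integrity", "detection_fidelity", "metadata_completeness")


def summarize(results):
    """Average the 0-10 judge scores and compute the share of valid templates."""
    summary = {m: round(mean(r[m] for r in results), 2) for m in METRICS}
    summary["validation_rate"] = sum(r["valid"] for r in results) / len(results)
    return summary


# Synthetic example: 25 samples, one of which fails nuclei -validate.
results = [
    {
        "structural_integrity": 6,
        "detection_fidelity": 2,
        "metadata_completeness": 5,
        "valid": i != 0,
    }
    for i in range(25)
]
print(summarize(results))
```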
Results
| Model | Structural Integrity | Detection Fidelity | Metadata Completeness | Nuclei Validation Rate |
|---|---|---|---|---|
| Qwen2.5-Coder-7B-Instruct-Nuclei | 6.08 | 2.56 | 4.84 | 96% (24/25) |
| Qwen2.5-Coder-3B-Instruct-Nuclei | 3.92 | 1.56 | 3.20 | 16% (4/25) |
| ChatGPT-5.4 | 5.76 | 2.12 | 4.24 | 64% (16/25) |
| Qwen2.5-Coder-7B-Instruct | 1.92 | 1.04 | 2.60 | 0% (0/25) |
| Qwen2.5-Coder-3B-Instruct | 2.20 | 0.96 | 1.80 | 0% (0/25) |
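The fine-tuned 7B model outperforms its base version on every metric and scores higher than the general-purpose ChatGPT-5.4 baseline, particularly in:
- Detection fidelity (2.56 vs. 1.04 for the base model and 2.12 for ChatGPT-5.4);
- Template validity, measured by the Nuclei validation rate (96% vs. 0% for the base model and 64% for ChatGPT-5.4).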
Summary
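The model automates the generation of syntactically valid Nuclei templates from structured vulnerability descriptions: 96% of its test outputs pass nuclei -validate. Detection logic is less reliable (detection fidelity of 2.56 out of 10), so generated templates still need human review.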
It can serve as a practical assistant for:
- Security researchers;
- Penetration testers;
- Vulnerability analysts;
- Detection engineers.
Environmental Impact
| Metric | Value |
|---|---|
| Hardware | NVIDIA A100 80 GB |
| Usage Time | 1.5 hours |
| Cloud Provider | Google Colab |
Technical Specifications
Model Architecture and Objective
The model uses the same architecture as Qwen2.5-Coder-7B-Instruct, a decoder-only Transformer.
Training objective:
- Causal Language Modeling (CLM);
- Loss computed exclusively on the response tokens.
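Computing the loss only on response tokens is commonly implemented by masking the prompt positions in the label sequence. The sketch below illustrates that idea with plain lists. It is an assumption about the mechanism rather than the original training code, and the token IDs are made up.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_prompt_labels(prompt_ids, response_ids):
    """Return (input_ids, labels) where prompt positions do not contribute to the loss."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


input_ids, labels = mask_prompt_labels([11, 12, 13], [21, 22])
print(input_ids)  # → [11, 12, 13, 21, 22]
print(labels)     # → [-100, -100, -100, 21, 22]
```

With this masking, the system prompt, instruction, and input JSON still condition the model but only the YAML response is penalized.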
Compute Infrastructure
Hardware
Training
- 1 × NVIDIA A100 80 GB GPU
- 40 GB system RAM
Inference
The model can run on:
- A GPU with 8-16 GB of memory, using 4-bit GGUF quantization;
- A GPU with more than 16 GB of memory, using GGUF conversion without quantization.
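A rough back-of-the-envelope estimate shows where these thresholds come from. The parameter count below is an assumed approximate figure for the 7B base model, and the result covers weights only. Real GGUF quantization formats use slightly more than their nominal bit width, and the KV cache adds further memory on top.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights, in gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 7.6e9  # assumed approximate parameter count

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions the unquantized weights alone take about 15 GB, which is consistent with recommending more than 16 GB of GPU memory for that case.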
Software
| Component | Version |
|---|---|
| Transformers | 5.0.0 |
| PEFT | 0.19.1 |
| TRL | 1.5.0 |
| PyTorch | 2.10.0+cu128 |
| Datasets | 4.0.0 |