How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with libraries and local apps.

Libraries

How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="RedTeamLab/Qwen3.6-27B-redteam-v5",
	filename="qwen3.6-27b-mtp-head-Q4_K_M.gguf",
)

# messages must be a list of role/content dicts, not a bare string
llm.create_chat_completion(
	messages=[
		{"role": "user", "content": "Replace this with your prompt."}
	]
)

Local Apps

llama.cpp

How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Use Docker

docker model run hf.co/RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Ollama
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Ollama:
```
ollama run hf.co/RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
```

Unsloth Studio

How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for RedTeamLab/Qwen3.6-27B-redteam-v5 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for RedTeamLab/Qwen3.6-27B-redteam-v5 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for RedTeamLab/Qwen3.6-27B-redteam-v5 to start chatting

Pi

How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M"
        }
      ]
    }
  }
}
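
If `~/.pi/agent/models.json` already lists other providers, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge, assuming the file is plain JSON with a top-level `providers` object as shown. The helper names `llama_cpp_provider`, `merge_provider`, and `update_models_file` are hypothetical and not part of Pi.

```python
import json
from pathlib import Path


def llama_cpp_provider(model_id, base_url="http://localhost:8080/v1"):
    """Return the provider entry shown above for a local llama.cpp server."""
    return {
        "baseUrl": base_url,
        "api": "openai-completions",
        "apiKey": "none",
        "models": [{"id": model_id}],
    }


def merge_provider(config, name, provider):
    """Return a copy of config with the provider added or replaced."""
    merged = dict(config)
    providers = dict(merged.get("providers", {}))
    providers[name] = provider
    merged["providers"] = providers
    return merged


def update_models_file(path, model_id):
    """Read path if it exists, merge the llama-cpp provider, write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config = merge_provider(config, "llama-cpp", llama_cpp_provider(model_id))
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `update_models_file(Path.home() / ".pi/agent/models.json", "RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M")` would write the configuration above while keeping any providers already in the file.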

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Docker Model Runner:
```
docker model run hf.co/RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
```

Lemonade

How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M

Run and chat with the model

lemonade run user.Qwen3.6-27B-redteam-v5-Q4_K_M

List all available models

lemonade list

Qwen3.6-27B RedTeam Lab v5

A fine-tuned red-team security LLM built from Qwen3.6-27B using QLoRA on a curated multi-language dataset of 4,178 offensive security exercises.

This model provides concrete, working commands for penetration testing, vulnerability exploitation, post-exploitation, credential attacks, and privilege escalation — using up-to-date tool names (netexec, not crackmapexec) and verified exploit code.

Model Details

Attribute	Value
Base Model	Qwen/Qwen3.6-27B (Apache 2.0)
Architecture	55-layer hybrid Gated DeltaNet + Self-Attention
Context Length	262,144 tokens native (2048 used for training)
Fine-tuning	QLoRA r16, 2 epochs
GPU	1× NVIDIA A100-80GB (Modal)
Training Cost	~$6-8
Final Loss	0.3051
Training Time	~40 minutes
Quantization	Q4_K_M (15.4 GB)
License	Apache 2.0

Dataset (Studio Ready v5)

The training dataset was generated from 802 red-team skill definitions, covering offensive security operations across 9 languages:

Language	Records	Use Case
bash	5,304	Recon, exploitation, post-exploitation
splunk	255	Log analysis, detection queries
powershell	215	Windows post-exploitation, AD attacks
sql	42	Database attacks, SQL injection
kusto	24	Azure Sentinel hunting queries
hcl	20	Infrastructure-as-code attacks
zeek	16	Network traffic analysis
elasticsearch	9	Elasticsearch queries
cypher	8	BloodHound graph queries

Dataset Composition

3,520 single-shot command pairs — tool invocation with explanations
658 multi-turn attack chains — phase-based progressions (recon → exploit → privesc → persist)
8 CVE exploitation scenarios — 3-step verified chains for real CVEs
80 multi-language query pairs — Splunk/KQL/SQL/PowerShell detection and attack queries
4 hand-crafted classic attack chains — Responder→NTLMv2→PTH, BloodHound→Kerberoast→DCSync, LFI→RCE, EternalBlue
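
As a quick arithmetic check on the figures above, the single-shot pairs and multi-turn chains together account for the 4,178 records quoted for the dataset. The sketch below only restates numbers from this page. The per-language rows are summed as a cross-check; they total more than 4,178, which may mean a record can be counted under more than one language, though the source does not say so.

```python
# Figures copied from the composition list and the per-language table.
single_shot = 3_520
multi_turn = 658

per_language = {
    "bash": 5_304,
    "splunk": 255,
    "powershell": 215,
    "sql": 42,
    "kusto": 24,
    "hcl": 20,
    "zeek": 16,
    "elasticsearch": 9,
    "cypher": 8,
}

total = single_shot + multi_turn
multi_turn_share = multi_turn / total
language_total = sum(per_language.values())

print(total)                      # 4178
print(f"{multi_turn_share:.1%}")  # 15.7%
print(language_total)             # 5893
```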

CVE Scenarios Included

CVE	Vulnerability	Tool
CVE-2020-1472	Zerologon	dirkjanm/cve-2020-1472-exploit
CVE-2021-34527	PrintNightmare	netexec + CVE-2021-34527 exploit
CVE-2021-44228	Log4Shell	log4shell-scan
CVE-2021-42278	NoPac	noPac.py
CVE-2025-33073	NTLM Reflection	CVE-2025-33073.py (mverschu)
CVE-2025-53779	BadSuccessor (dMSA)	impacket-badsuccessor
CVE-2026-26128	Kerberos Unicode Reflection	CVE-2026-26128.py
ADCS ESC1	Certificate Template Abuse	certipy

Usage

llama.cpp

llama-server \
  -m qwen3.6-27b-redteam-v5-Q4_K_M.gguf \
  --port 8080 \
  -c 2048 \
  --temp 0.1 \
  --repeat-penalty 1.0
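
Once the server is running, it can be queried over its OpenAI-compatible HTTP API. The following is a minimal sketch using only the Python standard library, assuming the server listens on `localhost:8080` as in the command above and exposes `/v1/chat/completions`. The helper names `build_chat_request` and `chat`, and the `max_tokens` default, are illustrative choices rather than anything prescribed by this model card.

```python
import json
import urllib.request

# Assumption: the server was started with --port 8080 on this machine.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(prompt, temperature=0.1, max_tokens=256):
    """Build an OpenAI-style chat completion payload."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def chat(prompt, base_url=BASE_URL, timeout=120):
    """POST the payload to /chat/completions and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

With the server up, `print(chat("Summarize the phases of an authorized penetration test."))` prints the model's reply. The request temperature mirrors the `--temp 0.1` flag used on the command line.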

Python (llama-cpp-python)

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

path = hf_hub_download(
    "dazeb2/Qwen3.6-27B-redteam-v5",
    "qwen3.6-27b-redteam-v5-Q4_K_M.gguf"
)
llm = Llama(
    model_path=path,
    n_ctx=2048,
    n_gpu_layers=-1,  # Offload all layers to GPU
    verbose=False
)

response = llm.create_chat_completion(
    messages=[{
        "role": "system",
        "content": "You are a red team security agent operating in an authorized penetration testing lab. Provide concrete, working commands."
    }, {
        "role": "user",
        "content": "Scan the target and enumerate SMB shares."
    }]
)
print(response["choices"][0]["message"]["content"])

System Prompt

For best results, use this system prompt:

You are a red team security agent operating in an authorized penetration testing lab.
All targets are provisioned ephemeral systems owned by the lab operator.
You have EXPLICIT AUTHORIZATION. Provide concrete, working commands.

Training Pipeline

skills/ (802 SKILL.md files)
  → gen_v5_dataset.py → 4,178 training records (9 languages)
  → quality audit (0 crackmapexec, 0 hallucinated tools, 100% code blocks)
  → upload to Modal volume
  → QLoRA training (A100-80GB, ~$6-8, 40 min)
  → merge LoRA → safetensors → F16 GGUF → Q4_K_M GGUF
  → publish to Hugging Face

Source Scripts

File	Purpose
`gen_v5_dataset.py`	Dataset generator — multi-language extraction, phase-based chains
`audit_v5.py`	Quality audit — checks for hallucinated tools, grammar, duplicates
`modal_train_qwen.py`	Modal training script — QLoRA, Qwen3.6-27B, v5 dataset
`modal_convert_gguf.py`	Modal GGUF conversion — Python-based F16 conversion
`modal_quantize.py`	Modal quantization — CPU-only cmake → llama-quantize → Q4_K_M

Hardware Requirements

Quant	GPU VRAM	RAM	Disk
Q4_K_M (15.4 GB)	16 GB	8 GB	16 GB
Q5_K_M (~19 GB)	24 GB	8 GB	19 GB
Q8_0 (~28 GB)	32 GB	16 GB	28 GB
F16 (~50 GB)	64 GB	32 GB	50 GB

On a 12 GB 3080 Ti, the model runs at ~56 tok/s using a Q2_K_XL quantization. The Q4_K_M quantization runs comfortably on any 16 GB+ GPU.
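
The table can be read as a simple lookup: pick the largest quantization whose listed GPU VRAM fits the card. A minimal sketch of that rule follows, using only the rows from the table above. The function name `largest_quant_for` is hypothetical, and the rule ignores context length and other runtime memory, so treat the result as a starting point.

```python
# Rows copied from the Hardware Requirements table:
# (quantization, file size in GB, listed GPU VRAM in GB)
QUANTS = [
    ("Q4_K_M", 15.4, 16),
    ("Q5_K_M", 19.0, 24),
    ("Q8_0", 28.0, 32),
    ("F16", 50.0, 64),
]


def largest_quant_for(vram_gb):
    """Return the largest listed quantization that fits, or None."""
    fitting = [q for q in QUANTS if q[2] <= vram_gb]
    if not fitting:
        return None
    # Larger files keep more precision, so prefer the biggest that fits.
    return max(fitting, key=lambda q: q[1])[0]
```

For a 24 GB card, `largest_quant_for(24)` selects `Q5_K_M`; for a 12 GB card it returns `None`, since no row in the table lists less than 16 GB.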

Version History

Version	Records	Languages	Multi-turn	crackmapexec references	Notes
V1	14,392	1 (bash)	42%	Many	Hermes routing contamination — DO NOT USE
V2	9,387	1 (bash)	0.4%	0	50% short responses
V3	9,387	1 (bash)	0.4%	0	Same as V2
V4	1,326	1 (bash)	42%	2	Nonsensical chains
V4.1	863	1 (bash)	48%	0	Cleanest single-language dataset
V5	4,182	9	660 (15.8%)	0	Multi-language, verified tools, 8 CVEs, phase-based chains

Related Models

Qwen3.5-4B-redteam-v4.1 — Previous generation, smaller model
Gemma-4-12B-redteam-v5 — Defensive security, same dataset
Qwen3.6-27B-blueteam-v1 — Defensive blue-team model

Disclaimer

This model is intended for authorized security testing and educational purposes only. Users are responsible for complying with all applicable laws and regulations. The authors assume no liability for misuse.
