Instructions to use RedTeamLab/Qwen3.6-27B-redteam-v5 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="RedTeamLab/Qwen3.6-27B-redteam-v5", filename="qwen3.6-27b-mtp-head-Q4_K_M.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with llama.cpp:
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh # Start a local OpenAI-compatible server with a web UI: llama serve -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M # Run inference directly in the terminal: llama cli -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama serve -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M # Run inference directly in the terminal: llama cli -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
Use Docker
docker model run hf.co/RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Ollama:
ollama run hf.co/RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
- Unsloth Studio
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for RedTeamLab/Qwen3.6-27B-redteam-v5 to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for RedTeamLab/Qwen3.6-27B-redteam-v5 to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for RedTeamLab/Qwen3.6-27B-redteam-v5 to start chatting
- Pi
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama serve -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama serve -hf RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
Run Hermes
hermes
- Atomic Chat new
- Docker Model Runner
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Docker Model Runner:
docker model run hf.co/RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
- Lemonade
How to use RedTeamLab/Qwen3.6-27B-redteam-v5 with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull RedTeamLab/Qwen3.6-27B-redteam-v5:Q4_K_M
Run and chat with the model
lemonade run user.Qwen3.6-27B-redteam-v5-Q4_K_M
List all available models
lemonade list
llm.create_chat_completion(
messages = "No input example has been defined for this model task."
)Qwen3.6-27B RedTeam v5
Red-team offensive security LLM fine-tuned from Qwen/Qwen3.6-27B using QLoRA on RedTeamLab v5 dataset.
Trained on 3,760 records across 8 languages (bash, python, splunk, kusto, sql, powershell, zeek, cypher) covering real CVE exploitation scenarios (2025-2026).
Loss: 0.3051 | GPU: A100-80GB | Time: 40 min | Quant: Q4_K_M
Files
| File | Size | Description |
|---|---|---|
qwen3.6-27b-redteam-v5-Q4_K_M.gguf |
15.4 GB | Main model -- standalone inference |
qwen3.6-27b-mtp-head-Q4_K_M.gguf |
1.9 GB | MTP draft head -- for speculative decoding |
Inference
Standalone (no MTP)
llama-server -m qwen3.6-27b-redteam-v5-Q4_K_M.gguf -c 2048 -ngl 30
With MTP Speculative Decoding (2-3x faster)
llama-server -m qwen3.6-27b-redteam-v5-Q4_K_M.gguf --model-draft qwen3.6-27b-mtp-head-Q4_K_M.gguf --spec-type draft-mtp -c 2048 -ngl 30
Training Details
- Base model: Qwen/Qwen3.6-27B (Apache 2.0)
- Method: QLoRA (r=16, alpha=16, target modules: q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj)
- Sequence length: 2048
- Epochs: 2
- Batch size: 2 (gradient accumulation 4)
- Learning rate: 2e-4
- Loss: 0.3051 (final), converged from ~2.42
Dataset: RedTeamLab v5
Multi-language chain-of-thought red-team exercises covering:
| Category | Techniques | Languages |
|---|---|---|
| Reconnaissance | Port scanning, service enumeration, OS fingerprinting | bash, python, nmap |
| Exploitation | LFI/RFI, SQLi, command injection, XXE, SSRF | python, sql, bash |
| Privilege Escalation | SUID, capabilities, token manipulation, container escape | bash, powershell |
| Persistence | Cron, SSH keys, backdoors, LD_PRELOAD | bash, python, powershell |
| Credential Access | Dump hashes, LSASS, SAM, keylogging | powershell, python, bash |
| Lateral Movement | SSH hijack, RDP, PSRemoting, pass-the-hash | bash, powershell |
| Exfiltration | DNS tunneling, HTTP exfil, ICMP | bash, python, zeek |
| Active Directory | Kerberos abuse, DCSync, ACL attacks | powershell |
Real CVE scenarios covering 2025-2026 vulnerabilities.
Hardware Requirements
| Config | Min VRAM | Recommended |
|---|---|---|
| Q4_K_M inference | 12 GB | 16 GB GPU (RTX 3080 Ti / 4060 Ti) |
| Q2_K_XL inference | 8 GB | 12 GB GPU |
| Training (LoRA) | 80 GB | A100-80GB ($8/run on Modal) |
Related Models
- RedTeamLab/Qwen3.5-4B-redteam-v4.1 -- smaller 4B variant
- RedTeamLab/Gemma-4-12B-redteam-v5 -- Gemma-4 with same v5 dataset
- RedTeamLab/Qwen3.6-27B-blueteam-v1 -- defensive counterpart
- Downloads last month
- -
4-bit
Model tree for RedTeamLab/Qwen3.6-27B-redteam-v5
Base model
Qwen/Qwen3.6-27B
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="RedTeamLab/Qwen3.6-27B-redteam-v5", filename="", )