
CyberStrike-OffSec-35B

The #1 Ranked Open-Source Model for Cybersecurity & Offensive Security

Outperforms GPT-4-turbo on SecEval | Outperforms GPT-4 on MITRE ATT&CK & CWE benchmarks

Quantized • Benchmarks • Quick Start • Model Details • Training • Architecture • Use Cases • Citation

What is CyberStrike?

CyberStrike-OffSec-35B is a domain-specialized large language model built for offensive security professionals, penetration testers, and security researchers. Fine-tuned from Qwen3.6-35B-A3B with a two-stage pipeline (SFT followed by DPO), it covers the full offensive security lifecycle:

Vulnerability Discovery — SQL injection, XSS, SSRF, deserialization, business logic flaws
MITRE ATT&CK Operations — Technique identification, kill chain analysis, threat mapping
Exploit Development — PoC creation, payload crafting, evasion techniques
Cloud & Infrastructure — AWS/Azure/GCP misconfigurations, container escapes, IAM abuse
Red Team Operations — C2 setup, lateral movement, persistence, EDR evasion
Compliance & Standards — NIST, OWASP ASVS, CIS benchmarks, CVSS scoring

Model Format: This is the unquantized BF16 model (67 GB, 26 safetensors shards). For quantized versions, see Available Versions below.

Available Versions

Repo	Format	Size	Use Case
oyildirim/CyberStrike-OffSec-35B	BF16 (full precision)	67 GB	Transformers, vLLM, fine-tuning
oyildirim/CyberStrike-OffSec-35B-GGUF	GGUF Q8_0	36 GB	llama.cpp, Ollama, LM Studio
oyildirim/CyberStrike-OffSec-35B-GGUF	GGUF Q6_K	27 GB	llama.cpp, Ollama, LM Studio
oyildirim/CyberStrike-OffSec-35B-GGUF	GGUF Q5_K_M	24 GB	llama.cpp, Ollama, LM Studio
oyildirim/CyberStrike-OffSec-35B-GGUF	GGUF Q4_K_M	21 GB	llama.cpp, Ollama, LM Studio
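
As a rough guide to choosing among the versions above, the sketch below picks the largest listed format that fits a given memory budget. The sizes come from the table; the `pick_version` and `bits_per_weight` helpers and the 1.2x headroom factor (for KV cache and runtime buffers) are illustrative assumptions, not measured requirements.

```python
from typing import Optional

# File sizes in GB, copied from the Available Versions table (largest first).
VERSIONS = [
    ("BF16", 67),
    ("Q8_0", 36),
    ("Q6_K", 27),
    ("Q5_K_M", 24),
    ("Q4_K_M", 21),
]


def pick_version(memory_gb: float, headroom: float = 1.2) -> Optional[str]:
    """Return the largest format whose size times `headroom` fits in memory_gb.

    `headroom` is a hypothetical safety factor; real overhead depends on
    context length, batch size, and the inference runtime.
    """
    for name, size_gb in VERSIONS:
        if size_gb * headroom <= memory_gb:
            return name
    return None  # nothing in the table fits this budget


def bits_per_weight(size_gb: float, params_billion: float = 35.0) -> float:
    """Approximate average bits stored per parameter for a given file size."""
    return size_gb * 8 / params_billion


print(pick_version(48))       # budget of 48 GB
print(bits_per_weight(21))    # Q4_K_M file size
```

Under these assumptions a 48 GB budget selects Q8_0, and the Q4_K_M file works out to roughly 4.8 bits per parameter.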

Benchmark Results

CyberStrike posts the top score among the listed models on two domain-specific cybersecurity benchmarks, outperforming GPT-4-turbo on SecEval and GPT-4 on the SECURE MAET and CWET tasks. Results on CyberMetric-10000 and general benchmarks follow.

SecEval — #1 on Leaderboard

Outperforms GPT-4-turbo by 2.32 points overall (81.39% vs. 79.07%) across 9 cybersecurity domains and 2,189 questions.

Rank	Model	Overall	Network Sec	Web Sec	PenTest	Cryptography
#1	CyberStrike-OffSec-35B	81.39%	85.09%	85.34%	82.26%	75.00%
#2	GPT-4-turbo	79.07%	75.65%	82.15%	80.00%	64.29%
#3	GPT-3.5-turbo	62.09%	60.87%	63.00%	72.00%	35.71%
#4	Yi-6B	53.57%	56.52%	54.98%	69.26%	35.71%

Full SecEval Domain Breakdown (9 domains)

Domain	CyberStrike	GPT-4-turbo	Delta
Network Security	85.09%	75.65%	+9.44
Web Security	85.34%	82.15%	+3.19
Vulnerability	83.33%	76.05%	+7.28
Application Security	82.29%	75.25%	+7.04
PenTest	82.26%	80.00%	+2.26
Software Security	79.75%	73.28%	+6.47
System Security	77.82%	73.61%	+4.21
Cryptography	75.00%	64.29%	+10.71
Memory Safety	71.43%	70.83%	+0.60

CyberStrike leads in all 9 domains. Largest improvements: Cryptography (+10.71) and Network Security (+9.44).
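
The Delta column and the summary above can be reproduced from the per-domain scores. The following is a small sketch that recomputes them; the `SECEVAL` dictionary and `deltas` helper are illustrative names, and the numbers are copied from the table.

```python
# (CyberStrike, GPT-4-turbo) accuracy in percent, copied from the table above.
SECEVAL = {
    "Network Security": (85.09, 75.65),
    "Web Security": (85.34, 82.15),
    "Vulnerability": (83.33, 76.05),
    "Application Security": (82.29, 75.25),
    "PenTest": (82.26, 80.00),
    "Software Security": (79.75, 73.28),
    "System Security": (77.82, 73.61),
    "Cryptography": (75.00, 64.29),
    "Memory Safety": (71.43, 70.83),
}


def deltas(scores):
    """Percentage-point difference per domain, rounded to two decimals."""
    return {
        domain: round(ours - theirs, 2)
        for domain, (ours, theirs) in scores.items()
    }


d = deltas(SECEVAL)
leads_everywhere = all(v > 0 for v in d.values())
top_two = sorted(d, key=d.get, reverse=True)[:2]

print(leads_everywhere)
print(top_two)
```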

SECURE — #1 on MITRE ATT&CK & CWE Tasks

Outperforms GPT-4 by 5.34 points on MITRE ATT&CK extraction (MAET) and by 3.45 points on CWE knowledge (CWET). Evaluated on ICS cybersecurity scenarios.

Task	CyberStrike	GPT-4	Llama3-70B	Gemini-Pro
MAET (MITRE ATT&CK)	93.94%	88.6%	86.3%	86.2%
CWET (CWE Knowledge)	93.05%	89.6%	90.4%	87.8%

CyberMetric-10000: #6 out of 25 Models

9,189 expert-validated multiple-choice cybersecurity questions across NIST, RFC, and industry standards.

Rank	Model	Score
#1	GPT-4o	88.89%
#2	GPT-4-turbo	88.50%
#3	GEMINI-pro 1.0	87.50%
#4	Mixtral-8x7B-Instruct	87.00%
#5	Falcon-180B-Chat	87.00%
#6	CyberStrike-OffSec-35B	86.61%
#7	GPT-3.5-turbo	80.30%

General Benchmarks (lm-evaluation-harness, 0-shot)

Benchmark	Score
MMLU (overall)	76.94%
MMLU — Social Sciences	86.81%
MMLU — Computer Security	86.00%
MMLU — Other	81.43%
MMLU — Security Studies	80.00%
MMLU — STEM	73.87%
MMLU — Humanities	69.59%
HellaSwag (acc_norm)	79.61%
ARC Easy	81.86%
ARC Challenge (acc_norm)	59.13%
WinoGrande	72.22%
TruthfulQA MC2	49.64%

Note: General benchmarks were run 0-shot. Few-shot performance is expected to be higher.

Quick Start

Ollama (Easiest)

# Download and run the Q4_K_M quantized version
ollama run hf.co/oyildirim/CyberStrike-OffSec-35B-GGUF:Q4_K_M
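
Once the model has been pulled, it can also be queried from Python through Ollama's local REST API. This is a minimal sketch using only the standard library; it assumes Ollama is listening on its default port (11434) and that the model name matches the tag used in the `ollama run` command above. The `build_chat_request` and `chat` helper names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint
MODEL = "hf.co/oyildirim/CyberStrike-OffSec-35B-GGUF:Q4_K_M"


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming chat request for a single user message."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        return json.loads(resp.read())["message"]["content"]
```

With the Ollama server running, `chat("Explain SSRF exploitation in cloud environments")` returns the reply as a string.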

llama.cpp

# Download the GGUF file from https://huggingface.co/oyildirim/CyberStrike-OffSec-35B-GGUF
./llama-cli -m CyberStrike-OffSec-35B-Q4_K_M.gguf \
  -p "Explain SSRF exploitation in cloud environments" \
  -n 512 --temp 0.7

Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained(
    "oyildirim/CyberStrike-OffSec-35B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "oyildirim/CyberStrike-OffSec-35B",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "Explain SSRF exploitation in cloud environments with AWS metadata service abuse."}
]

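# Render the chat template to a prompt string, then tokenize it
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens so the prompt is not echoed back
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))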

vLLM (Recommended for Production)

pip install vllm

vllm serve oyildirim/CyberStrike-OffSec-35B \
  --dtype bfloat16 \
  --max-model-len 4096 \
  --trust-remote-code \
  --served-model-name CyberStrike-OffSec-35B

Then use the OpenAI-compatible API:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="CyberStrike-OffSec-35B",
    messages=[{"role": "user", "content": "How to exploit deserialization vulnerabilities in Java applications?"}],
    max_tokens=2048,
)
print(response.choices[0].message.content)

Model Details

Property	Value
Base Model	Qwen3.6-35B-A3B
Type	Mixture-of-Experts (MoE)
Total Parameters	35 Billion
Active Parameters	~3 Billion per token
Precision	BF16 (Brain Float 16)
Model Size	67 GB (26 safetensors shards)
Context Length	4,096 tokens (training) / 262,144 max (architecture)
Training Method	SFT + DPO (QLoRA)
Training Hardware	NVIDIA H200 141GB SXM
License	Apache 2.0

Training Pipeline

CyberStrike was trained using a two-stage alignment pipeline:

Stage 1: Supervised Fine-Tuning (SFT)

The base Qwen3.6-35B-A3B model was fine-tuned on a curated dataset of offensive security scenarios covering 10 categories:

web_app, cloud, post_exploitation, edr_evasion, malware_dev, network, social_engineering, full_kill_chain, lateral_movement, persistence

Method: QLoRA (4-bit NF4 quantization)
LoRA Config: r=64, alpha=128, dropout=0
Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
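
To give a sense of what these LoRA settings mean, the sketch below computes the standard LoRA scaling factor (alpha / r) and the number of trainable adapter parameters for one projection matrix. The 2048 x 2048 layer shape is a hypothetical example, not the model's actual dimension, and the helper names are illustrative.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapted matrix.

    LoRA adds A (r x d_in) and B (d_out x r), so the count is r * (d_in + d_out).
    """
    return r * (d_in + d_out)


# Stage 1 settings from the list above: r=64, alpha=128
scale = lora_scaling(r=64, alpha=128)

# Hypothetical 2048 x 2048 projection, used only for illustration
adapter = lora_params(d_in=2048, d_out=2048, r=64)
dense = 2048 * 2048

print(scale)
print(adapter, adapter / dense)
```

For this hypothetical shape the adapter holds 262,144 parameters, 6.25% of the dense matrix, and the update is scaled by 2.0.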

Stage 2: Direct Preference Optimization (DPO)

The SFT model was further aligned using 115,250 preference pairs across 12 axes, training it to prefer expert-level responses over superficial ones:

Axis	Description	Examples
MITRE ATT&CK Depth	Deep technique analysis over surface-level summaries	T1059 sub-technique breakdowns
CVE Analysis	Detailed vulnerability analysis with CVSS scoring	CVE-2024-* exploit chains
OWASP Methodology	Structured testing methodology	ASVS compliance checks
Cloud Security	Provider-specific attack paths	AWS IAM, Azure AD, GCP abuse
Tool Usage	Proper tool invocation patterns	Nmap, Burp, sqlmap workflows
ReAct Reasoning	Step-by-step analytical thinking	Multi-stage attack planning
Multi-turn Engagement	Sustained deep conversation	Progressive pentest engagement
Code-first Approach	Working exploit code over theory	PoC development, payload crafting
Techstack Analysis	Technology-specific vulnerabilities	Framework-specific attacks
Sub-agent Coordination	Orchestrated multi-tool operations	Combined recon + exploit chains
Business Logic	Domain-aware vulnerability assessment	Sector-specific attack scenarios
NIST Compliance	Standards-aligned security assessment	SP 800-53 control mapping

Method: QLoRA, LoRA r=32, alpha=64
DPO Beta: 0.1
Learning Rate: 5e-6 with cosine schedule
Effective Batch Size: 8
Training Steps: 9,142
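
For reference, the per-pair objective in the standard DPO formulation can be written in a few lines. This is a minimal sketch of that published loss with the beta listed above, not the training code used for this model; the function and argument names are illustrative, and the inputs are summed log-probabilities of each response under the policy and the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair: -log(sigmoid(beta * margin)).

    margin = (policy - reference) log-ratio of the chosen response
             minus the same log-ratio for the rejected response.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) == log(1 + exp(-z)); branch keeps exp() from overflowing
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Policy identical to the reference: margin is 0, loss is log(2)
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy raises the chosen response and lowers the rejected one: loss drops
print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

A smaller beta, such as the 0.1 used here, lets the policy move further from the reference before the loss saturates.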

Architecture

Qwen3.6-35B-A3B (Mixture-of-Experts)
├── 35B total parameters
├── ~3B active parameters per token
├── 256 experts, top-k routing
├── Grouped Query Attention (GQA)
├── RoPE positional encoding (theta=10M)
├── Max position embeddings: 262,144
└── BF16 precision (67 GB on disk)

The MoE architecture keeps per-token compute close to that of a ~3B dense model, because only a few experts run for each token, while retaining the knowledge capacity of 35B parameters. All 35B parameters must still be held in memory at inference time.
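
The top-k routing mentioned in the architecture summary can be illustrated with a small pure-Python sketch. It shows the general technique only: the value of k, the renormalization of the selected weights, and the `route_top_k` name are assumptions, since the source does not specify how this model's router is configured.

```python
import math


def route_top_k(gate_logits, k):
    """Pick the k highest-scoring experts for one token.

    Returns a list of (expert_index, weight) pairs whose weights are
    renormalized to sum to 1 over the selected experts.
    """
    # Softmax over all experts (shifted by the max for numerical stability)
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only the k most probable experts
    top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


# Four hypothetical experts, two selected per token
routes = route_top_k([0.0, 2.0, 1.0, -1.0], k=2)
print(routes)

# Share of parameters active per token, from the figures above (~3B of 35B)
active_fraction = 3 / 35
print(round(active_fraction, 3))
```

Only the selected experts' feed-forward blocks run for a token, which is why roughly 3B of the 35B parameters (under 9%) are active per token.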

Use Cases

CyberStrike is designed for professionals conducting authorized security assessments:

Penetration Testing — Web app, network, cloud, and API security testing
Red Team Operations — Full kill chain simulation, C2 operations, evasion
Vulnerability Research — CVE analysis, exploit development, PoC creation
CTF Competitions — Challenge solving, reverse engineering, cryptography
Security Education — Training material generation, exam preparation
Threat Intelligence — MITRE ATT&CK mapping, threat actor TTPs
Compliance Assessment — NIST, OWASP, CIS benchmark evaluation

Ethical Use & Disclaimer

This model is intended exclusively for authorized security testing, education, and research purposes. Users must:

Obtain proper written authorization before testing any systems
Comply with all applicable laws and regulations
Follow responsible disclosure practices
Never use this model for unauthorized access or malicious activities

The authors are not responsible for any misuse of this model.

Citation

@misc{cyberstrike2025,
  title={CyberStrike-OffSec-35B: A Domain-Specialized LLM for Offensive Security},
  author={Orhan Yildirim},
  year={2025},
  url={https://huggingface.co/oyildirim/CyberStrike-OffSec-35B}
}

Built with purpose. Benchmarked with rigor. Designed for professionals.