CyberStrike-OffSec-35B
The #1 Ranked Open-Source Model for Cybersecurity & Offensive Security
Outperforms GPT-4-turbo on SecEval | Outperforms GPT-4 on MITRE ATT&CK & CWE benchmarks
What is CyberStrike?
CyberStrike-OffSec-35B is a domain-specialized large language model built for offensive security professionals, penetration testers, and security researchers. It is fine-tuned from Qwen3.6-35B-A3B with a two-stage pipeline (SFT followed by DPO) and covers the offensive security lifecycle:
- Vulnerability Discovery — SQL injection, XSS, SSRF, deserialization, business logic flaws
- MITRE ATT&CK Operations — Technique identification, kill chain analysis, threat mapping
- Exploit Development — PoC creation, payload crafting, evasion techniques
- Cloud & Infrastructure — AWS/Azure/GCP misconfigurations, container escapes, IAM abuse
- Red Team Operations — C2 setup, lateral movement, persistence, EDR evasion
- Compliance & Standards — NIST, OWASP ASVS, CIS benchmarks, CVSS scoring
Model Format: This is the full-precision BF16 model (67 GB, 26 safetensors shards). For quantized versions, see below.
Available Versions
| Repo | Format | Size | Use Case |
|---|---|---|---|
| oyildirim/CyberStrike-OffSec-35B | BF16 (full precision) | 67 GB | Transformers, vLLM, fine-tuning |
| oyildirim/CyberStrike-OffSec-35B-GGUF | GGUF Q8_0 | 36 GB | llama.cpp, Ollama, LM Studio |
| oyildirim/CyberStrike-OffSec-35B-GGUF | GGUF Q6_K | 27 GB | llama.cpp, Ollama, LM Studio |
| oyildirim/CyberStrike-OffSec-35B-GGUF | GGUF Q5_K_M | 24 GB | llama.cpp, Ollama, LM Studio |
| oyildirim/CyberStrike-OffSec-35B-GGUF | GGUF Q4_K_M | 21 GB | llama.cpp, Ollama, LM Studio |
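Choosing between these variants mostly comes down to available memory. The sketch below picks the largest variant that fits a given budget, using the sizes from the table. The helper name `pick_variant` and the `headroom_gb` allowance for KV cache and runtime overhead are assumptions for illustration, not values from this model card.

```python
from typing import Optional

# Download sizes in GB, copied from the table above.
VARIANT_SIZES_GB = {
    "BF16": 67,
    "Q8_0": 36,
    "Q6_K": 27,
    "Q5_K_M": 24,
    "Q4_K_M": 21,
}


def pick_variant(memory_budget_gb: float, headroom_gb: float = 4.0) -> Optional[str]:
    """Return the largest variant whose file size plus headroom fits the budget.

    headroom_gb is a hypothetical allowance for KV cache and runtime overhead;
    tune it for your context length and runtime.
    """
    usable = memory_budget_gb - headroom_gb
    fitting = {name: size for name, size in VARIANT_SIZES_GB.items() if size <= usable}
    if not fitting:
        return None  # nothing fits; consider offloading or a smaller model
    return max(fitting, key=fitting.get)


print(pick_variant(32))  # 28 GB usable
print(pick_variant(80))  # 76 GB usable
```

Actual memory use also depends on context length and the inference runtime, so treat the result as a starting point.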
Benchmark Results
CyberStrike ranks first among the compared models on SecEval and on the SECURE MAET and CWET tasks, ahead of GPT-4-turbo and GPT-4 respectively. On CyberMetric-10000 it trails GPT-4o, GPT-4-turbo, and Gemini-pro 1.0.
SecEval — #1 on Leaderboard
Outperforms GPT-4-turbo by +2.32 points across 9 cybersecurity domains, 2,189 questions.
| Rank | Model | Overall | Network Sec | Web Sec | PenTest | Cryptography |
|---|---|---|---|---|---|---|
| #1 | CyberStrike-OffSec-35B | 81.39% | 85.09% | 85.34% | 82.26% | 75.00% |
| #2 | GPT-4-turbo | 79.07% | 75.65% | 82.15% | 80.00% | 64.29% |
| #3 | GPT-3.5-turbo | 62.09% | 60.87% | 63.00% | 72.00% | 35.71% |
| #4 | Yi-6B | 53.57% | 56.52% | 54.98% | 69.26% | 35.71% |
Full SecEval Domain Breakdown (9 domains)
| Domain | CyberStrike | GPT-4-turbo | Delta |
|---|---|---|---|
| Network Security | 85.09% | 75.65% | +9.44 |
| Web Security | 85.34% | 82.15% | +3.19 |
| Vulnerability | 83.33% | 76.05% | +7.28 |
| Application Security | 82.29% | 75.25% | +7.04 |
| PenTest | 82.26% | 80.00% | +2.26 |
| Software Security | 79.75% | 73.28% | +6.47 |
| System Security | 77.82% | 73.61% | +4.21 |
| Cryptography | 75.00% | 64.29% | +10.71 |
| Memory Safety | 71.43% | 70.83% | +0.60 |
CyberStrike leads in all 9 domains. The largest improvements are in Cryptography (+10.71) and Network Security (+9.44); the smallest is in Memory Safety (+0.60).
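The Delta column can be reproduced directly from the two score columns. The following sketch recomputes it from the table values; the dictionary and function names are illustrative only.

```python
# (CyberStrike %, GPT-4-turbo %) per SecEval domain, copied from the table above.
SECEVAL_SCORES = {
    "Network Security": (85.09, 75.65),
    "Web Security": (85.34, 82.15),
    "Vulnerability": (83.33, 76.05),
    "Application Security": (82.29, 75.25),
    "PenTest": (82.26, 80.00),
    "Software Security": (79.75, 73.28),
    "System Security": (77.82, 73.61),
    "Cryptography": (75.00, 64.29),
    "Memory Safety": (71.43, 70.83),
}


def domain_deltas(scores):
    """Return CyberStrike minus GPT-4-turbo, in percentage points, per domain."""
    return {domain: round(ours - theirs, 2) for domain, (ours, theirs) in scores.items()}


deltas = domain_deltas(SECEVAL_SCORES)
for domain, delta in sorted(deltas.items(), key=lambda item: item[1], reverse=True):
    print(f"{domain:<22} {delta:+.2f}")
```

The overall +2.32 margin is not the plain average of these per-domain deltas, since the domains presumably contain different numbers of questions.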
SECURE — #1 on MITRE ATT&CK & CWE Tasks
Outperforms GPT-4 by +5.34 points on MITRE ATT&CK extraction. Evaluated on ICS cybersecurity scenarios.
| Task | CyberStrike | GPT-4 | Llama3-70B | Gemini-Pro |
|---|---|---|---|---|
| MAET (MITRE ATT&CK) | 93.94% | 88.6% | 86.3% | 86.2% |
| CWET (CWE Knowledge) | 93.05% | 89.6% | 90.4% | 87.8% |
CyberMetric-10000 — #6 out of 25 Models
9,189 expert-validated multiple-choice cybersecurity questions across NIST, RFC, and industry standards.
| Rank | Model | Score |
|---|---|---|
| #1 | GPT-4o | 88.89% |
| #2 | GPT-4-turbo | 88.50% |
| #3 | GEMINI-pro 1.0 | 87.50% |
| #4 (tie) | Mixtral-8x7B-Instruct | 87.00% |
| #4 (tie) | Falcon-180B-Chat | 87.00% |
| #6 | CyberStrike-OffSec-35B | 86.61% |
| #7 | GPT-3.5-turbo | 80.30% |
General Benchmarks (lm-evaluation-harness, 0-shot)
| Benchmark | Score |
|---|---|
| MMLU (overall) | 76.94% |
| MMLU — Social Sciences | 86.81% |
| MMLU — Computer Security | 86.00% |
| MMLU — Other | 81.43% |
| MMLU — Security Studies | 80.00% |
| MMLU — STEM | 73.87% |
| MMLU — Humanities | 69.59% |
| HellaSwag (acc_norm) | 79.61% |
| ARC Easy | 81.86% |
| ARC Challenge (acc_norm) | 59.13% |
| WinoGrande | 72.22% |
| TruthfulQA MC2 | 49.64% |
Note: General benchmarks were run 0-shot. Few-shot performance is expected to be higher.
Quick Start
Ollama (Easiest)
# Download and run the Q4_K_M quantized version
ollama run hf.co/oyildirim/CyberStrike-OffSec-35B-GGUF:Q4_K_M
llama.cpp
# Download the GGUF file from https://huggingface.co/oyildirim/CyberStrike-OffSec-35B-GGUF
./llama-cli -m CyberStrike-OffSec-35B-Q4_K_M.gguf \
  -p "Explain SSRF exploitation in cloud environments" \
  -n 512 --temp 0.7
Transformers
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the BF16 weights and let Accelerate place them across available devices.
model = AutoModelForCausalLM.from_pretrained(
    "oyildirim/CyberStrike-OffSec-35B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "oyildirim/CyberStrike-OffSec-35B",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "Explain SSRF exploitation in cloud environments with AWS metadata service abuse."}
]

# Render the chat template to text, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
vLLM (Recommended for Production)
pip install vllm
vllm serve oyildirim/CyberStrike-OffSec-35B \
  --dtype bfloat16 \
  --max-model-len 4096 \
  --trust-remote-code \
  --served-model-name CyberStrike-OffSec-35B
Then use the OpenAI-compatible API:
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="CyberStrike-OffSec-35B",
    messages=[{"role": "user", "content": "How to exploit deserialization vulnerabilities in Java applications?"}],
    max_tokens=2048,
)
print(response.choices[0].message.content)
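If the `openai` package is not available, the same endpoint can be called with the Python standard library. This is a minimal sketch assuming the server started above is listening on port 8000 and was launched with `--served-model-name CyberStrike-OffSec-35B`; the helper names `build_chat_request` and `send` are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"   # assumes the vLLM server from the command above
MODEL_NAME = "CyberStrike-OffSec-35B"   # must match --served-model-name


def build_chat_request(prompt: str, max_tokens: int = 512, temperature: float = 0.7) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the assistant message text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


request = build_chat_request("Summarize the CVSS v3.1 base metrics.")
# print(send(request))  # uncomment once the server is running
```

Building the request separately from sending it makes the payload easy to inspect or log before anything goes over the network.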
Model Details
| Property | Value |
|---|---|
| Base Model | Qwen3.6-35B-A3B |
| Type | Mixture-of-Experts (MoE) |
| Total Parameters | 35 Billion |
| Active Parameters | ~3 Billion per token |
| Precision | BF16 (Brain Float 16) |
| Model Size | 67 GB (26 safetensors shards) |
| Context Length | 4,096 tokens (training) / 262,144 max (architecture) |
| Training Method | SFT + DPO (QLoRA) |
| Training Hardware | NVIDIA H200 141GB SXM |
| License | Apache 2.0 |
Training Pipeline
CyberStrike was trained using a two-stage alignment pipeline:
Stage 1: Supervised Fine-Tuning (SFT)
The base Qwen3.6-35B-A3B model was fine-tuned on a curated dataset of offensive security scenarios covering 10 categories:
web_app, cloud, post_exploitation, edr_evasion, malware_dev, network, social_engineering, full_kill_chain, lateral_movement, persistence
- Method: QLoRA (4-bit NF4 quantization)
- LoRA Config: r=64, alpha=128, dropout=0
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
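To make the LoRA settings concrete: in the standard LoRA formulation, each adapted weight receives an update of `(alpha / r) * B @ A`, where `B` and `A` are the low-rank factors. The sketch below computes that scaling for the SFT configuration and illustrates the update on a toy rank-1 example. The 2048x2048 layer shape is a hypothetical value chosen for illustration; the real projection sizes are not stated here.

```python
from typing import List

Matrix = List[List[float]]


def lora_scaling(r: int, alpha: float) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / r


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, sufficient for a toy example."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_delta(B: Matrix, A: Matrix, r: int, alpha: float) -> Matrix:
    """Weight update added to the frozen base weight: (alpha / r) * B @ A."""
    scale = lora_scaling(r, alpha)
    return [[scale * value for value in row] for row in matmul(B, A)]


def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the two low-rank factors for one adapted matrix."""
    return r * (d_in + d_out)


# SFT configuration from the list above: r=64, alpha=128.
print(lora_scaling(64, 128))  # 2.0

# Toy rank-1 update (hypothetical values): B is 2x1, A is 1x2.
print(lora_delta([[1.0], [2.0]], [[0.5, -1.0]], r=1, alpha=2.0))

# Hypothetical 2048x2048 projection: adapter parameters vs. full matrix.
print(lora_trainable_params(2048, 2048, 64), 2048 * 2048)
```

With `dropout=0`, the adapter output is deterministic during training, so the only stochasticity comes from data ordering and sampling.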
Stage 2: Direct Preference Optimization (DPO)
The SFT model was further aligned using 115,250 preference pairs across 12 carefully designed axes, teaching the model to produce expert-level responses over superficial ones:
| Axis | Description | Examples |
|---|---|---|
| MITRE ATT&CK Depth | Deep technique analysis over surface-level summaries | T1059 sub-technique breakdowns |
| CVE Analysis | Detailed vulnerability analysis with CVSS scoring | CVE-2024-* exploit chains |
| OWASP Methodology | Structured testing methodology | ASVS compliance checks |
| Cloud Security | Provider-specific attack paths | AWS IAM, Azure AD, GCP abuse |
| Tool Usage | Proper tool invocation patterns | Nmap, Burp, sqlmap workflows |
| ReAct Reasoning | Step-by-step analytical thinking | Multi-stage attack planning |
| Multi-turn Engagement | Sustained deep conversation | Progressive pentest engagement |
| Code-first Approach | Working exploit code over theory | PoC development, payload crafting |
| Techstack Analysis | Technology-specific vulnerabilities | Framework-specific attacks |
| Sub-agent Coordination | Orchestrated multi-tool operations | Combined recon + exploit chains |
| Business Logic | Domain-aware vulnerability assessment | Sector-specific attack scenarios |
| NIST Compliance | Standards-aligned security assessment | SP 800-53 control mapping |
- Method: QLoRA, LoRA r=32, alpha=64
- DPO Beta: 0.1
- Learning Rate: 5e-6 with cosine schedule
- Effective Batch Size: 8
- Training Steps: 9,142
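For reference, the standard DPO objective for a single preference pair can be written in a few lines. This is a sketch of the published DPO loss with the beta listed above, not the training code used for this model. The second helper estimates how much of the preference set the listed step count covers, under the assumption that every optimizer step consumes one effective batch of distinct pairs.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) equals softplus(-margin); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def fraction_of_pairs_seen(steps: int, effective_batch: int, total_pairs: int) -> float:
    """Share of the preference set covered, assuming one batch of distinct pairs per step."""
    return steps * effective_batch / total_pairs


# Policy identical to the reference: margin is 0, loss is log(2).
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))

# Policy now favors the chosen response more than the reference does: loss drops.
print(dpo_loss(-8.0, -14.0, -10.0, -12.0))

# Figures from the list above: 9,142 steps, batch 8, 115,250 pairs.
print(round(fraction_of_pairs_seen(9142, 8, 115250), 3))
```

A small beta such as 0.1 keeps the implicit reward scale low, so the policy needs a sizable log-probability shift relative to the reference before the loss saturates.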
Architecture
Qwen3.6-35B-A3B (Mixture-of-Experts)
├── 35B total parameters
├── ~3B active parameters per token
├── 256 experts, top-k routing
├── Grouped Query Attention (GQA)
├── RoPE positional encoding (theta=10M)
├── Max position embeddings: 262,144
└── BF16 precision (67 GB on disk)
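Because only about 3B of the 35B parameters are active for each token, per-token compute is closer to that of a 3B dense model, while the model retains the knowledge capacity of its full 35B parameters. All 35B parameters (67 GB in BF16) must still be held in memory at inference time.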
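The "top-k routing" line in the tree above can be illustrated with a toy router: score every expert, keep the k best, normalize their scores with a softmax, and mix only those experts' outputs. The value of k, the normalization over the selected experts, and the scalar "experts" are assumptions for illustration; the model's actual router settings are not given here.

```python
import math
from typing import Callable, List, Sequence, Tuple


def top_k_route(router_logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Return (expert_index, weight) for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    top = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]  # shift for stability
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(
    x: float,
    router_logits: Sequence[float],
    experts: Sequence[Callable[[float], float]],
    k: int,
) -> float:
    """Weighted sum of the outputs of the selected experts only."""
    return sum(weight * experts[i](x) for i, weight in top_k_route(router_logits, k))


def make_scaling_expert(factor: float) -> Callable[[float], float]:
    """Toy expert that multiplies its input by a constant."""
    return lambda value: factor * value


# Four toy experts instead of 256, with k=2 (hypothetical values).
experts = [make_scaling_expert(f) for f in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]

print(top_k_route(logits, k=2))            # experts 1 and 3, equal weight
print(moe_forward(1.0, logits, experts, k=2))
```

Only the selected experts run for a given token, which is why compute scales with the active parameter count rather than the total.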
Use Cases
CyberStrike is designed for professionals conducting authorized security assessments:
- Penetration Testing — Web app, network, cloud, and API security testing
- Red Team Operations — Full kill chain simulation, C2 operations, evasion
- Vulnerability Research — CVE analysis, exploit development, PoC creation
- CTF Competitions — Challenge solving, reverse engineering, cryptography
- Security Education — Training material generation, exam preparation
- Threat Intelligence — MITRE ATT&CK mapping, threat actor TTPs
- Compliance Assessment — NIST, OWASP, CIS benchmark evaluation
Ethical Use & Disclaimer
This model is intended exclusively for authorized security testing, education, and research purposes. Users must:
- Obtain proper written authorization before testing any systems
- Comply with all applicable laws and regulations
- Follow responsible disclosure practices
- Never use this model for unauthorized access or malicious activities
The authors are not responsible for any misuse of this model.
Citation
@misc{cyberstrike2025,
  title={CyberStrike-OffSec-35B: A Domain-Specialized LLM for Offensive Security},
  author={Orhan Yildirim},
  year={2025},
  url={https://huggingface.co/oyildirim/CyberStrike-OffSec-35B}
}
Evaluation results (self-reported)
- SecEval, overall accuracy: 81.39%
- CyberMetric-10000, overall accuracy: 86.61%
- SECURE-MAET, MAET accuracy: 93.94%
- SECURE-CWET, CWET accuracy: 93.05%
- MMLU, overall accuracy: 76.94%
