RedSage-Qwen3-8B-DPO

Cybersecurity DPO

Model Summary

RedSage-Qwen3-8B-DPO is the final, aligned version of the RedSage cybersecurity LLM series developed by RISysLab. It represents the fourth and final stage of the RedSage training pipeline.

This model is fine-tuned from RedSage-Qwen3-8B-Ins using Direct Preference Optimization (DPO) on the AllenAI Tulu 3 Preference Mixture. This alignment stage significantly enhances the model's general reasoning capabilities and safety behaviors while maintaining the deep cybersecurity domain expertise acquired during previous stages.
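DPO optimizes the policy directly on preference pairs, without training a separate reward model: each example contributes a log-sigmoid loss on the scaled margin between the policy-vs-reference log-ratios of the chosen and rejected responses. A minimal numeric sketch (illustrative values only, not from the actual training run):

```python
import math

def dpo_loss(chosen_logratio: float, rejected_logratio: float, beta: float = 0.1) -> float:
    """Per-example DPO loss.

    chosen_logratio   = log pi_theta(y_w|x) - log pi_ref(y_w|x)
    rejected_logratio = log pi_theta(y_l|x) - log pi_ref(y_l|x)
    loss = -log(sigmoid(beta * (chosen_logratio - rejected_logratio)))
    """
    margin = beta * (chosen_logratio - rejected_logratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the policy favors the chosen response more strongly
# than the reference model does, and grows when the preference is inverted.
low = dpo_loss(1.0, -1.0)   # policy already prefers the chosen response
high = dpo_loss(-1.0, 1.0)  # policy prefers the rejected response
```

The `beta` value of 0.1 here is only a common default in DPO implementations; the actual hyperparameters used for this model are not stated in this card.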

Training Lineage

RedSage employs a multi-stage training pipeline. This model represents the output of Stage 4.

  1. Stage 1: Continual Pre-Training (CPT) -> RedSage-Qwen3-8B-CFW
  2. Stage 2: Targeted Pre-Training -> RedSage-Qwen3-8B-Base
  3. Stage 3: Supervised Fine-Tuning (SFT) -> RedSage-Qwen3-8B-Ins
  4. Stage 4: Direct Preference Optimization (DPO) -> RedSage-Qwen3-8B-DPO (Current Model)
    • Data: Tulu 3 Preference Mixture

Dataset: Preference Alignment

The model was aligned using the following high-quality preference dataset to ensure robust instruction following and general reasoning:

  • Dataset: allenai/llama-3.1-tulu-3-8b-preference-mixture
  • Description: A comprehensive collection of preference data used to align the Tulu 3 models, focusing on helpfulness, factuality, and safety.
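Tulu-style preference data is typically stored as prompt/chosen/rejected triples in chat-message format. The record layout and field names below are an assumption based on common Tulu 3 conventions, not verified against this dataset; a sketch of splitting one record into the pair DPO consumes:

```python
# Hypothetical record mirroring a common chosen/rejected preference layout
# (field names are an assumption; inspect the actual dataset to confirm).
record = {
    "chosen": [
        {"role": "user", "content": "What is a CVE?"},
        {"role": "assistant", "content": "A CVE is a publicly disclosed vulnerability identifier."},
    ],
    "rejected": [
        {"role": "user", "content": "What is a CVE?"},
        {"role": "assistant", "content": "CVE means computer virus entry."},
    ],
}

def split_pair(rec):
    """Return the shared prompt turns plus the preferred / dispreferred replies."""
    prompt = [m for m in rec["chosen"] if m["role"] != "assistant"]
    chosen = rec["chosen"][-1]["content"]
    rejected = rec["rejected"][-1]["content"]
    return prompt, chosen, rejected

prompt, chosen, rejected = split_pair(record)
```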

Performance & Evaluation

RedSage-Qwen3-8B-DPO achieves the best balance between specialized domain knowledge and general capability among all RedSage variants.

1. RedSage-Bench (0-shot)

| Category | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
| --- | --- | --- |
| Macro Average | 81.85 | 84.83 |
| Knowledge (General) | 80.46 | 82.48 |
| Knowledge (Frameworks) | 78.82 | 83.80 |
| Skill (Offensive) | 86.16 | 88.54 |
| Tools (CLI) | 83.92 | 86.30 |
| Tools (Kali) | 75.56 | 79.30 |

2. External Cybersecurity Benchmarks (0-shot)

| Benchmark | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
| --- | --- | --- |
| Mean | 75.71 | 81.10 |
| CTI-Bench (MCQ) | 62.76 | 70.84 |
| CTI-Bench (RCM) | 54.00 | 70.60 |
| CyberMetric (500) | 88.60 | 90.00 |
| MMLU (Security) | 76.00 | 79.00 |
| SecBench (En) | 73.26 | 80.06 |
| SecEva (MCQ) | 65.46 | 74.22 |
| SECURE (CWET) | 88.11 | 91.35 |
| SECURE (KCV) | 87.42 | 82.86 |
| SECURE (MEAT) | 85.75 | 91.00 |

3. OpenLLM Leaderboard (General Benchmark)

| Benchmark | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
| --- | --- | --- |
| Mean | 65.92 | 74.33 |
| MMLU | 73.59 | 77.07 |
| ARC-C | 62.54 | 71.76 |
| GSM8K | 75.66 | 82.71 |
| HellaSwag | 56.70 | 79.87 |
| TruthfulQA | 45.23 | 52.47 |
| WinoGrande | 62.51 | 73.01 |
| IFEval | 85.21 | 83.44 |
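As a quick sanity check, the "Mean" rows in tables 2 and 3 can be re-derived from the per-benchmark RedSage-8B-DPO scores listed above:

```python
# RedSage-8B-DPO scores copied from the tables above.
external = [70.84, 70.60, 90.00, 79.00, 80.06, 74.22, 91.35, 82.86, 91.00]  # table 2
openllm = [77.07, 71.76, 82.71, 79.87, 52.47, 73.01, 83.44]                 # table 3

def mean(xs):
    return sum(xs) / len(xs)

print(f"{mean(external):.2f}")  # 81.10
print(f"{mean(openllm):.2f}")   # 74.33
```

Both recomputed means match the reported values to two decimal places.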

Usage

Use the standard chat template for inference.

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "RISys-Lab/RedSage-Qwen3-8B-DPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    torch_dtype=torch.bfloat16, 
    device_map="auto"
)

# Define the chat messages
messages = [
    {"role": "system", "content": "You are RedSage, a helpful cybersecurity assistant."},
    {"role": "user", "content": "Analyze the following log entry for potential indicators of compromise: 'POST /cgi-bin/test-cgi?* HTTP/1.1'"}
]

# Apply chat template
text = tokenizer.apply_chat_template(
    messages, 
    tokenize=False, 
    add_generation_prompt=True
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
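For Qwen-family models, `apply_chat_template` renders the messages in a ChatML-style layout with `<|im_start|>` / `<|im_end|>` markers. The hand-built rendering below is a simplified illustration of that format only; in practice always use the tokenizer's own template, which is authoritative:

```python
def chatml_render(messages, add_generation_prompt=True):
    """Approximate ChatML rendering (illustration only; the tokenizer's
    apply_chat_template is the authoritative source of the real format)."""
    text = ""
    for m in messages:
        text += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so generation continues as the assistant.
        text += "<|im_start|>assistant\n"
    return text

demo = chatml_render([
    {"role": "system", "content": "You are RedSage, a helpful cybersecurity assistant."},
    {"role": "user", "content": "Explain what an indicator of compromise (IoC) is."},
])
```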

Intended Use

  • Primary Use: General-purpose cybersecurity assistance, log analysis, threat intelligence summarization, and educational queries.
  • Benefits: Improved instruction adherence and safer, preference-aligned responses relative to the SFT-only RedSage-Qwen3-8B-Ins.
  • Limitations: While aligned, the model may still produce incorrect information. Always verify outputs in critical security environments.

Citation

If you use this model or dataset, please cite our paper:

@inproceedings{suryanto2026redsage,
  title={RedSage: A Cybersecurity Generalist {LLM}},
  author={Naufal Suryanto and Muzammal Naseer and Pengfei Li and Syed Talal Wasim and Jinhui Yi and Juergen Gall and Paolo Ceravolo and Ernesto Damiani},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=W4FAenIrQ2}
}