# RedSage-Qwen3-8B-DPO

## Model Summary
RedSage-Qwen3-8B-DPO is the final, aligned version of the RedSage cybersecurity LLM series developed by RISysLab. It represents the fourth and final stage of the RedSage training pipeline.
This model is fine-tuned from RedSage-Qwen3-8B-Ins using Direct Preference Optimization (DPO) on the AllenAI Tulu 3 Preference Mixture. This alignment stage significantly enhances the model's general reasoning capabilities and safety behaviors while maintaining the deep cybersecurity domain expertise acquired during previous stages.
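As background, DPO optimizes the policy directly on preference pairs, using the SFT model as a frozen reference, without training a separate reward model. The sketch below shows the standard per-pair DPO loss in plain Python; it is illustrative only and is not the RedSage training code.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Inputs are the summed log-probabilities of the chosen and rejected
    responses under the policy and under the frozen reference (SFT) model.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)): the loss shrinks as the policy prefers the
    # chosen response more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy favors the chosen response relative to the reference -> low loss.
print(dpo_loss(-10.0, -14.0, -12.0, -13.0))
```

Minimizing this loss pushes the policy's relative preference for the chosen response above the reference model's, while `beta` controls how far the policy may drift from the reference.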
- Developed by: RISysLab
- Repository: GitHub
- Base Model: RISys-Lab/RedSage-Qwen3-8B-Ins
## Training Lineage
RedSage employs a multi-stage training pipeline. This model represents the output of Stage 4.
- Stage 1: Continual Pre-Training (CPT) -> RedSage-Qwen3-8B-CFW
- Stage 2: Targeted Pre-Training -> RedSage-Qwen3-8B-Base
- Stage 3: Supervised Fine-Tuning (SFT) -> RedSage-Qwen3-8B-Ins
- Stage 4: Direct Preference Optimization (DPO) -> RedSage-Qwen3-8B-DPO (current model), trained on the Tulu 3 Preference Mixture
## Dataset: Preference Alignment
The model was aligned using the following high-quality preference dataset to ensure robust instruction following and general reasoning:
- Dataset: allenai/llama-3.1-tulu-3-8b-preference-mixture
- Description: A comprehensive collection of preference data used to align the Tulu 3 models, focusing on helpfulness, factuality, and safety.
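For readers unfamiliar with preference data, each record pairs one prompt with a preferred ("chosen") and a dispreferred ("rejected") response. The toy record below mirrors that layout; the exact field names are an assumption about the Tulu 3 schema (the real dataset can be inspected via `datasets.load_dataset("allenai/llama-3.1-tulu-3-8b-preference-mixture")`), and the helper simply extracts the triple a DPO trainer consumes.

```python
# Toy record mirroring a chosen/rejected preference pair (field names
# are an assumption about the Tulu 3 format, not taken from the dataset).
record = {
    "prompt": "What is a CVE?",
    "chosen": [
        {"role": "user", "content": "What is a CVE?"},
        {"role": "assistant",
         "content": "A CVE is a publicly catalogued vulnerability identifier."},
    ],
    "rejected": [
        {"role": "user", "content": "What is a CVE?"},
        {"role": "assistant", "content": "No idea."},
    ],
}

def to_dpo_pair(rec):
    """Extract (prompt, chosen_text, rejected_text) for a DPO trainer."""
    last = lambda msgs: msgs[-1]["content"]
    return rec["prompt"], last(rec["chosen"]), last(rec["rejected"])

print(to_dpo_pair(record))
```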
## Performance & Evaluation
RedSage-Qwen3-8B-DPO achieves the best balance between specialized domain knowledge and general capability among all RedSage variants.
### 1. RedSage-Bench (0-shot)
| Category | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
|---|---|---|
| Macro Average | 81.85 | 84.83 |
| Knowledge (General) | 80.46 | 82.48 |
| Knowledge (Frameworks) | 78.82 | 83.80 |
| Skill (Offensive) | 86.16 | 88.54 |
| Tools (CLI) | 83.92 | 86.30 |
| Tools (Kali) | 75.56 | 79.30 |
### 2. External Cybersecurity Benchmarks (0-shot)
| Benchmark | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
|---|---|---|
| Mean | 75.71 | 81.10 |
| CTI-Bench (MCQ) | 62.76 | 70.84 |
| CTI-Bench (RCM) | 54.00 | 70.60 |
| CyberMetric (500) | 88.60 | 90.00 |
| MMLU (Security) | 76.00 | 79.00 |
| SecBench (En) | 73.26 | 80.06 |
| SecEva (MCQ) | 65.46 | 74.22 |
| SECURE (CWET) | 88.11 | 91.35 |
| SECURE (KCV) | 87.42 | 82.86 |
| SECURE (MEAT) | 85.75 | 91.00 |
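As a sanity check on the table above, the Mean row is the unweighted average of the nine benchmark scores:

```python
# Per-benchmark scores copied from the external cybersecurity table,
# in row order (CTI-Bench MCQ/RCM, CyberMetric, MMLU-Security,
# SecBench, SecEva, SECURE CWET/KCV/MEAT).
qwen = [62.76, 54.00, 88.60, 76.00, 73.26, 65.46, 88.11, 87.42, 85.75]
redsage = [70.84, 70.60, 90.00, 79.00, 80.06, 74.22, 91.35, 82.86, 91.00]

mean = lambda xs: round(sum(xs) / len(xs), 2)
print(mean(qwen), mean(redsage))  # -> 75.71 81.1
```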
### 3. OpenLLM Leaderboard (General Benchmark)
| Benchmark | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
|---|---|---|
| Mean | 65.92 | 74.33 |
| MMLU | 73.59 | 77.07 |
| ARC-C | 62.54 | 71.76 |
| GSM8K | 75.66 | 82.71 |
| HellaSwag | 56.70 | 79.87 |
| TruthfulQA | 45.23 | 52.47 |
| WinoGrande | 62.51 | 73.01 |
| IFEval | 85.21 | 83.44 |
## Usage
Use the standard chat template for inference.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "RISys-Lab/RedSage-Qwen3-8B-DPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Define the chat messages
messages = [
    {"role": "system", "content": "You are RedSage, a helpful cybersecurity assistant."},
    {"role": "user", "content": "Analyze the following log entry for potential indicators of compromise: 'POST /cgi-bin/test-cgi?* HTTP/1.1'"}
]

# Apply chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Intended Use
- Primary Use: General-purpose cybersecurity assistance, log analysis, threat intelligence summarization, and educational queries.
- Benefits: Stronger instruction adherence and general reasoning than the SFT-only version, thanks to preference-based alignment.
- Limitations: While aligned, the model may still produce incorrect information. Always verify outputs in critical security environments.
## Citation
If you use this model or dataset, please cite our paper:
```bibtex
@inproceedings{suryanto2026redsage,
  title={RedSage: A Cybersecurity Generalist {LLM}},
  author={Naufal Suryanto and Muzammal Naseer and Pengfei Li and Syed Talal Wasim and Jinhui Yi and Juergen Gall and Paolo Ceravolo and Ernesto Damiani},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=W4FAenIrQ2}
}
```