---
library_name: transformers
tags:
- generated_from_trainer
- cybersecurity
- continual-pretraining
- targeted-pretraining
- text-generation
- causal-lm
- risys-lab
model-index:
- name: RedSage-Qwen3-8B-Base
  results: []
language:
- en
base_model:
- RISys-Lab/RedSage-Qwen3-8B-CFW
pipeline_tag: text-generation
---

# RedSage-Qwen3-8B-Base

<div align="center">
  <img src="https://img.shields.io/badge/Task-Cybersecurity-red" alt="Cybersecurity">
  <img src="https://img.shields.io/badge/Stage-Targeted_Pretraining-blue" alt="Targeted Pretraining">
</div>

## Model Summary

**RedSage-Qwen3-8B-Base** is a cybersecurity-specialized Large Language Model (LLM) developed by **RISys-Lab**. It represents the **second stage** of the RedSage pre-training pipeline.

This model builds upon **RedSage-Qwen3-8B-CFW** by undergoing **Targeted Pre-Training** on high-quality, curated cybersecurity resources (`RedSage-Seed` and `RedSage-Dump`). While the previous stage focused on breadth using web-scale data, this stage focuses on depth: technical standards and verified skills.

- **Paper:** [RedSage: A Cybersecurity Generalist LLM](https://openreview.net/forum?id=W4FAenIrQ2) ([arXiv](https://arxiv.org/abs/2601.22159))
- **Repository:** [GitHub](https://github.com/RISys-Lab/RedSage)
- **Base Model:** [RISys-Lab/RedSage-Qwen3-8B-CFW](https://huggingface.co/RISys-Lab/RedSage-Qwen3-8B-CFW)
- **Variant:** Base (Final Pre-trained Checkpoint)

## Intended Use

This model is a **base model** intended for:

1. **Fine-tuning:** Serving as a high-quality foundation for downstream cybersecurity tasks (e.g., incident response, malware analysis).
2. **Research:** Investigating the impact of curated versus web-scale data in domain adaptation.
3. **Completion:** Code completion and technical writing in cybersecurity contexts.

**Note:** As a base model, this checkpoint has **not** been instruction-tuned (SFT) or aligned (DPO). It behaves like a completion engine. For a chat-ready assistant, please see `RISys-Lab/RedSage-Qwen3-8B-DPO`.

## Training Lineage

RedSage employs a multi-stage training pipeline. This model represents the output of **Stage 2**.

1. Stage 1: Continual Pre-Training (CPT) -> [RedSage-Qwen3-8B-CFW](https://huggingface.co/RISys-Lab/RedSage-Qwen3-8B-CFW) (CyberFineWeb data)
2. **Stage 2: Targeted Pre-Training** -> **`RedSage-Qwen3-8B-Base`** (Current Model)
   * *Data:* RedSage-Seed (~150M Tokens) + RedSage-Dump (~700M Tokens)
3. Stage 3: Supervised Fine-Tuning (SFT) -> [RedSage-Qwen3-8B-Ins](https://huggingface.co/RISys-Lab/RedSage-Qwen3-8B-Ins)
4. Stage 4: Direct Preference Optimization (DPO) -> [RedSage-Qwen3-8B-DPO](https://huggingface.co/RISys-Lab/RedSage-Qwen3-8B-DPO)

## Training Data: RedSage-Seed & Dump

This model was trained on approximately **850 million tokens** of curated data, split into two collections:

1. **RedSage-Seed (~150M Tokens):** A highly curated collection of 28,637 samples converted to structured Markdown.
   * **Knowledge:** General concepts and frameworks (MITRE ATT&CK, CAPEC, CWE, OWASP).
   * **Skills:** Offensive security resources, including write-ups, hacking techniques, and payload examples.
   * **Tools:** Manuals and cheat sheets for CLI tools and Kali Linux.
2. **RedSage-Dump (~700M Tokens):** A larger aggregation of 459K technical documents.
   * **Sources:** Computer education portals, cybersecurity news, RFC entries, NIST publications, and the National Vulnerability Database (NVD).

## Performance

RedSage-8B-Base achieves state-of-the-art performance among 8B models, showing significant improvements over the general-purpose Qwen3-8B-Base. It achieves the highest mean score on external benchmarks among all 8B base models tested.

### RedSage-Bench (0-shot Accuracy)

| Category | Qwen3-8B-Base | **RedSage-8B-Base** |
| :--- | :---: | :---: |
| **Macro Average** | 84.24 | **85.05** |
| Knowledge (General) | 83.08 | 83.12 |
| Knowledge (Frameworks) | 81.94 | **84.94** |
| Skill (Offensive) | 88.23 | **88.72** |
| Tools (CLI) | 85.08 | **85.44** |
| Tools (Kali) | 78.86 | **79.36** |

### External Cybersecurity Benchmarks (5-shot)

| Benchmark | Qwen3-8B-Base | **RedSage-8B-Base** |
| :--- | :---: | :---: |
| **Mean** | 80.81 | **84.56** |
| CTI-Bench (MCQ) | 68.80 | **71.04** |
| CTI-Bench (RCM) | 63.50 | **78.40** |
| CyberMetric (500) | 92.00 | **92.60** |
| MMLU (Security) | 83.00 | **87.00** |
| SecBench (En) | **82.84** | 81.76 |
| SecEva (MCQ) | 75.60 | **75.83** |
| SECURE (CWET) | 92.70 | **93.22** |
| SECURE (KCV) | 75.05 | **87.20** |
| SECURE (MEAT) | 93.81 | **94.00** |

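As a quick sanity check, the "Mean" row of the external-benchmark table can be reproduced as an unweighted average of the nine per-benchmark scores (assuming no per-benchmark weighting, which the table does not indicate):

```python
# Reproduce the "Mean" row of the 5-shot external-benchmark table
# as a simple unweighted average of the nine scores.
qwen3_scores = [68.80, 63.50, 92.00, 83.00, 82.84, 75.60, 92.70, 75.05, 93.81]
redsage_scores = [71.04, 78.40, 92.60, 87.00, 81.76, 75.83, 93.22, 87.20, 94.00]

qwen3_mean = round(sum(qwen3_scores) / len(qwen3_scores), 2)
redsage_mean = round(sum(redsage_scores) / len(redsage_scores), 2)

print(qwen3_mean, redsage_mean)  # 80.81 84.56
```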
## Training Procedure

The model was trained using the [Axolotl](https://github.com/axolotl-ai-cloud/axolotl) framework.

- **Learning Rate:** 2.5e-6 (constant with linear warmup)
- **Optimizer:** AdamW
- **Epochs:** 1

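For reference, the hyperparameters above could be expressed in an Axolotl config roughly as follows. This is an illustrative sketch, not the released training config; the dataset path and batch-size fields are placeholders.

```yaml
# Illustrative Axolotl config sketch -- NOT the released training config.
# Dataset path and batch-size values are placeholders.
base_model: RISys-Lab/RedSage-Qwen3-8B-CFW

datasets:
  - path: ./data/redsage_corpus.jsonl   # hypothetical path
    type: completion

learning_rate: 2.5e-6
lr_scheduler: constant_with_warmup      # constant LR after a linear warmup
optimizer: adamw_torch
num_epochs: 1

micro_batch_size: 1                     # placeholder
gradient_accumulation_steps: 8          # placeholder
```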
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "RISys-Lab/RedSage-Qwen3-8B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

text = "The primary difference between a firewall and an IDS is"
# Place inputs on the device the model was dispatched to,
# rather than hardcoding "cuda".
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Citation

If you use this model or dataset, please cite our paper:

```bibtex
@inproceedings{suryanto2026redsage,
  title={RedSage: A Cybersecurity Generalist {LLM}},
  author={Naufal Suryanto and Muzammal Naseer and Pengfei Li and Syed Talal Wasim and Jinhui Yi and Juergen Gall and Paolo Ceravolo and Ernesto Damiani},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=W4FAenIrQ2}
}
```
|