---
license: mit
datasets:
- trendmicro-ailab/Primus-FineWeb
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- cybersecurity
- pretraining
extra_gated_fields:
  Affiliation: text
  Country: country
  I want to use this model for:
    type: select
    options:
      - Research
      - Commercial
      - label: Other
        value: other
  Job title:
    type: select
    options:
      - Student
      - Research graduate
      - AI researcher
      - AI developer/engineer
      - Cybersecurity researcher
      - Reporter
      - Other
  geo: ip_location
---

# Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training

<img src="https://i.imgur.com/PtqeTZw.png" alt="Primus Overview" width="60%">

> TL;DR: Llama-Primus-Reasoning is a reasoning model distilled from reasoning-with-reflection data generated by o1-preview on cybersecurity tasks (_Primus-Reasoning_). It achieves a 🚀**10%** improvement on a security-certification benchmark (CISSP).

🔥 For more details, please refer to the paper: [[📄Paper]](https://arxiv.org/abs/2502.11191).

## Introduction

Large Language Models (LLMs) have demonstrated remarkable versatility in recent years, with promising applications in specialized domains such as finance, law, and biomedicine. In cybersecurity, however, we noticed a lack of open-source datasets designed specifically for LLM pre-training, even though much research has shown that LLMs acquire their knowledge during pre-training. To fill this gap, we present a collection of datasets covering multiple stages of cybersecurity LLM training: pre-training (_Primus-Seed_ and _Primus-FineWeb_), instruction fine-tuning (_Primus-Instruct_), and reasoning data for distillation (_Primus-Reasoning_). Based on these datasets and Llama-3.1-8B-Instruct, we developed _Llama-Primus-Base_, _Llama-Primus-Merged_, and _Llama-Primus-Reasoning_. This model card is for **Llama-Primus-Reasoning**.

> **Note:** No Trend Micro customer information is included.
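
Since the card lists `library_name: transformers`, the model can presumably be loaded with the standard `transformers` chat API. The repository id below is an assumption inferred from the `trendmicro-ailab` dataset namespace, not confirmed by this card; treat this as a minimal sketch rather than official usage instructions.

```python
# Hypothetical usage sketch. MODEL_ID is an assumption inferred from the
# `trendmicro-ailab` namespace; verify the actual model id before use.
MODEL_ID = "trendmicro-ailab/Llama-Primus-Reasoning"


def build_messages(question: str) -> list[dict]:
    """Build a chat-format prompt for a cybersecurity question."""
    return [
        {"role": "system", "content": "You are a helpful cybersecurity assistant."},
        {"role": "user", "content": question},
    ]


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the model and generate an answer (requires transformers + torch,
    and access to the gated weights)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)


# Example (downloads weights; gated access may apply):
# print(generate_answer("Which CISSP domain covers risk management?"))
```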

## Cybersecurity Benchmark Results

| Model                              | CISSP             | Avg. Tokens |
|------------------------------------|-------------------|-------------|
| **w/o CoT, 5-shot**                |                   |             |
| Llama-3.1-8B-Instruct              | 0.7073            | 1           |
| Llama-Primus-Merged                | 0.7191 ↑1.67%     | 1           |
| **w/ CoT, 0-shot**                 |                   |             |
| Llama-3.1-8B-Instruct              | 0.7288 ↑3.03%     | 279.69      |
| DeepSeek-R1-Distill-Llama-8B       | 0.7399 ↑4.61%     | 1542.10     |
| Llama-Primus-Merged                | 0.7603 ↑7.49%     | 241.92      |
| **Fine-tuned on Primus-Reasoning** |                   |             |
| Llama-3.1-8B-Reasoning             | 0.7583 ↑7.21%     | 646.94      |
| Llama-Primus-Reasoning             | 0.7780 ↑**10.0%** | 726.96      |
| ---                                |                   |             |
| o1-preview                         | 0.8035            | 1054.91     |

Effect of _Primus-Reasoning_ fine-tuning, evaluated on CISSP. ↑ indicates the percentage improvement over Llama-3.1-8B-Instruct without CoT in the 5-shot setting. The best improvement is highlighted in **bold**.
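
The ↑ figures are relative (not absolute) improvements over the 0.7073 w/o-CoT 5-shot baseline; a quick arithmetic check reproduces the reported percentages:

```python
# Relative improvement over the w/o-CoT, 5-shot Llama-3.1-8B-Instruct baseline.
BASELINE = 0.7073


def improvement_pct(score: float) -> float:
    """Percentage improvement of `score` over BASELINE."""
    return (score - BASELINE) / BASELINE * 100


# Reproduces the table, e.g. Llama-Primus-Reasoning: 0.7780 is ~10.0% above 0.7073.
for name, score in [
    ("Llama-Primus-Merged (5-shot)", 0.7191),
    ("Llama-Primus-Merged (CoT)", 0.7603),
    ("Llama-Primus-Reasoning", 0.7780),
]:
    print(f"{name}: {improvement_pct(score):.2f}%")
```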

## License
This model is released under the MIT License, but because it is built on Llama-3.1-8B-Instruct, you must also comply with the Llama 3.1 Community License Agreement.