---
license: mit
datasets:
- trendmicro-ailab/Primus-FineWeb
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- cybersecurity
- pretraining
extra_gated_fields:
  Affiliation: text
  Country: country
  I want to use this model for:
    type: select
    options:
      - Research
      - Commercial
      - label: Other
        value: other
  Job title:
    type: select
    options:
      - Student
      - Research graduate
      - AI researcher
      - AI developer/engineer
      - Cybersecurity researcher
      - Reporter
      - Other
  geo: ip_location
---

# Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training

<img src="https://i.imgur.com/PtqeTZw.png" alt="Primus Overview" width="60%">

> TL;DR: Llama-Primus-Reasoning is a reasoning model distilled from reasoning-with-reflection data generated by o1-preview on cybersecurity tasks (_Primus-Reasoning_). It achieves a 🚀**10%** improvement on a security-certification benchmark (CISSP).

🔥 For more details, please refer to the paper: [[📄Paper]](https://arxiv.org/abs/2502.11191).

## Introduction

Large Language Models (LLMs) have demonstrated remarkable versatility in recent years, with promising applications in specialized domains such as finance, law, and biomedicine. In cybersecurity, however, we noticed a lack of open-source datasets designed specifically for LLM pre-training, even though much research has shown that LLMs acquire their knowledge during pre-training. To fill this gap, we present a collection of datasets covering multiple stages of cybersecurity LLM training: pre-training (_Primus-Seed_ and _Primus-FineWeb_), instruction fine-tuning (_Primus-Instruct_), and reasoning data for distillation (_Primus-Reasoning_). Based on these datasets and Llama-3.1-8B-Instruct, we developed _Llama-Primus-Base_, _Llama-Primus-Merged_, and _Llama-Primus-Reasoning_. This model card is for **Llama-Primus-Reasoning**.

> **Note:** No Trend Micro customer information is included.
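
Since the card lists `library_name: transformers`, the model can presumably be loaded with the standard `transformers` chat API. The repository id below is an assumption inferred from the `trendmicro-ailab` dataset namespace, not confirmed by this card; treat this as a minimal sketch rather than official usage instructions.

```python
# Hypothetical usage sketch. MODEL_ID is an assumption inferred from the
# `trendmicro-ailab` namespace; verify the actual model id before use.
MODEL_ID = "trendmicro-ailab/Llama-Primus-Reasoning"


def build_messages(question: str) -> list[dict]:
    """Build a chat-format prompt for a cybersecurity question."""
    return [
        {"role": "system", "content": "You are a helpful cybersecurity assistant."},
        {"role": "user", "content": question},
    ]


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the model and generate an answer (requires transformers + torch,
    and access to the gated weights)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)


# Example (downloads weights; gated access may apply):
# print(generate_answer("Which CISSP domain covers risk management?"))
```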

## Cybersecurity Benchmark Results

| Model                              | CISSP             | Avg. Tokens |
|------------------------------------|-------------------|-------------|
| **w/o CoT, 5-shot**                |                   |             |
| Llama-3.1-8B-Instruct              | 0.7073            | 1           |
| Llama-Primus-Merged                | 0.7191 ↑1.67%     | 1           |
| **w/ CoT, 0-shot**                 |                   |             |
| Llama-3.1-8B-Instruct              | 0.7288 ↑3.03%     | 279.69      |
| DeepSeek-R1-Distill-Llama-8B       | 0.7399 ↑4.61%     | 1542.10     |
| Llama-Primus-Merged                | 0.7603 ↑7.49%     | 241.92      |
| **Fine-tuned on Primus-Reasoning** |                   |             |
| Llama-3.1-8B-Reasoning             | 0.7583 ↑7.21%     | 646.94      |
| Llama-Primus-Reasoning             | 0.7780 ↑**10.0%** | 726.96      |
| ---                                |                   |             |
| o1-preview                         | 0.8035            | 1054.91     |

Effect of _Primus-Reasoning_ fine-tuning, evaluated on CISSP. ↑ indicates the percentage improvement over Llama-3.1-8B-Instruct without CoT in the 5-shot setting. The best improvement is highlighted in **bold**.
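
The ↑ figures are relative (not absolute) improvements over the 0.7073 w/o-CoT 5-shot baseline; a quick arithmetic check reproduces the reported percentages:

```python
# Relative improvement over the w/o-CoT, 5-shot Llama-3.1-8B-Instruct baseline.
BASELINE = 0.7073


def improvement_pct(score: float) -> float:
    """Percentage improvement of `score` over BASELINE."""
    return (score - BASELINE) / BASELINE * 100


# Reproduces the table, e.g. Llama-Primus-Reasoning: 0.7780 is ~10.0% above 0.7073.
for name, score in [
    ("Llama-Primus-Merged (5-shot)", 0.7191),
    ("Llama-Primus-Merged (CoT)", 0.7603),
    ("Llama-Primus-Reasoning", 0.7780),
]:
    print(f"{name}: {improvement_pct(score):.2f}%")
```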

## License
This model is released under the MIT License, but because it is built on Llama-3.1-8B-Instruct, you must also comply with the Llama 3.1 Community License Agreement.