---
license: apache-2.0
language:
- en
base_model:
- cisco-ai/SecureBERT2.0-base
pipeline_tag: token-classification
library_name: transformers
tags:
- NER
- SecureBERT2
- CyberNER
- token-classification
- cybersecurity
---
# Model Card for cisco-ai/SecureBERT2.0-NER
The **Secure Modern BERT NER Model** is a fine-tuned transformer based on [**SecureBERT 2.0**](https://huggingface.co/cisco-ai/SecureBERT2.0-base), designed for **Named Entity Recognition (NER)** in cybersecurity text.
It extracts domain-specific entities such as **Indicators, Malware, Organizations, Systems, and Vulnerabilities** from unstructured data sources like threat reports, incident analyses, advisories, and blogs.
NER in cybersecurity enables:
- Automated extraction of indicators of compromise (IOCs)
- Structuring of unstructured threat intelligence text
- Improved situational awareness for analysts
- Faster incident response and vulnerability triage
---
## Model Details
### Model Description
- **Developed by:** Cisco AI
- **Model Type:** ModernBertForTokenClassification
- **Framework:** PyTorch / Transformers
- **Tokenizer Type:** PreTrainedTokenizerFast
- **Number of Labels:** 11
- **Task:** Named Entity Recognition (NER)
- **License:** Apache-2.0
- **Language:** English
- **Base Model:** [cisco-ai/SecureBERT2.0](https://huggingface.co/cisco-ai/SecureBERT2.0-base)
#### Supported Entity Labels
| Entity | Description |
|:--------|:-------------|
| `B-Indicator`, `I-Indicator` | Indicators of Compromise (e.g., IPs, domains, hashes) |
| `B-Malware`, `I-Malware` | Malware or exploit names |
| `B-Organization`, `I-Organization` | Companies or groups mentioned |
| `B-System`, `I-System` | Affected software or platforms |
| `B-Vulnerability`, `I-Vulnerability` | Specific CVEs or flaw descriptions |
| `O` | Outside any entity |
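Under this BIO scheme, a `B-` tag opens an entity and consecutive `I-` tags of the same type extend it. As a minimal illustration (the tokens and tags below are made-up examples, not model output), decoding tag sequences into entity spans works like this:

```python
def decode_bio(tokens, tags):
    """Merge BIO-tagged tokens into (entity_type, text) spans."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = (tag[2:], [token])          # open a new entity
        elif tag.startswith("I-") and current and tag[2:] == current[0]:
            current[1].append(token)              # extend the open entity
        else:                                     # "O" or a type mismatch
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]

tokens = ["Stealc", "malware", "targets", "CVE-2024-1234", "on", "Windows"]
tags   = ["B-Malware", "O", "O", "B-Vulnerability", "O", "B-System"]
print(decode_bio(tokens, tags))
# → [('Malware', 'Stealc'), ('Vulnerability', 'CVE-2024-1234'), ('System', 'Windows')]
```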
#### Model Configuration
| Parameter | Value |
|:-----------|:-------|
| Hidden size | 768 |
| Intermediate size | 1152 |
| Hidden layers | 22 |
| Attention heads | 12 |
| Max sequence length | 8192 |
| Vocabulary size | 50368 |
| Activation | GELU |
| Dropout | 0.0 (embedding, attention, MLP, classifier) |
---
## Uses
### Direct Use
- Named Entity Recognition (NER) on cybersecurity text
- Threat intelligence enrichment
- IOC extraction and normalization
- Incident report analysis
- Vulnerability mention detection
### Downstream Use
This model can be integrated into:
- Threat intelligence platforms (TIPs)
- SOC automation tools
- Cybersecurity knowledge graphs
- Vulnerability management and CVE monitoring systems
### Out-of-Scope Use
- Non-technical or general-domain NER tasks
- Generative or conversational AI applications
---
## Benchmark Cybersecurity NER Corpus
### Dataset Overview
| Aspect | Description |
|:-------|:-------------|
| **Purpose** | Benchmark dataset for extracting cybersecurity entities from unstructured reports |
| **Data Source** | Curated threat intelligence documents emphasizing malware and system analysis |
| **Annotation Methodology** | Fully hand-labeled by domain experts |
| **Entity Types** | Malware, Indicator, System, Organization, Vulnerability |
| **Size** | 3.4k training samples + 717 test samples |
---
## How to Get Started with the Model
### Example Usage (Transformers)
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_name = "cisco-ai/SecureBERT2.0-NER"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# aggregation_strategy="simple" merges B-/I- subword predictions into whole entity spans
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = "Stealc malware targets browser cookies and passwords."
entities = ner_pipeline(text)
print(entities)
```
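When the pipeline is run with `aggregation_strategy="simple"`, each prediction is a dict with an `entity_group` key. A small sketch of downstream post-processing, using a hard-coded sample output rather than a live model call (the scores and offsets below are illustrative, not real predictions):

```python
from collections import defaultdict

def bucket_entities(predictions):
    """Group NER pipeline predictions into {entity_type: [surface strings]}."""
    buckets = defaultdict(list)
    for p in predictions:
        buckets[p["entity_group"]].append(p["word"])
    return dict(buckets)

# Shape of pipeline output under aggregation_strategy="simple" (values illustrative)
sample = [
    {"entity_group": "Malware", "word": "Stealc", "score": 0.99, "start": 0, "end": 6},
    {"entity_group": "System", "word": "Chrome", "score": 0.97, "start": 24, "end": 30},
]
print(bucket_entities(sample))
# → {'Malware': ['Stealc'], 'System': ['Chrome']}
```

This kind of bucketing is a natural first step for the IOC extraction and enrichment use cases listed above.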
## Training Details
### Training Objective and Procedure
The `SecureBERT2.0-NER` model was fine-tuned for **token-level classification** on cybersecurity text using **Cross Entropy Loss**.
Training focused on accurately classifying entity boundaries and types across five cybersecurity-specific categories: *Malware, Indicator, System, Organization,* and *Vulnerability*.
The **AdamW** optimizer was used with a **linear learning rate scheduler**, and gradient clipping ensured stability during fine-tuning.
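A linear schedule decays the learning rate from its peak toward zero over the course of training. The sketch below shows that decay in isolation; it assumes a peak learning rate of 1e-5 (per the configuration here) and no warmup phase, which may differ from the exact schedule used:

```python
def linear_lr(step, total_steps, base_lr=1e-5):
    """Learning rate at a given step under plain linear decay (no warmup)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

total = 1000
print(linear_lr(0, total))     # peak rate at the start of training
print(linear_lr(500, total))   # half the peak rate at the midpoint
print(linear_lr(1000, total))  # decayed to zero at the final step
```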
### Training Configuration
| Setting | Value |
|:---------|:------:|
| Objective | Token-wise Cross Entropy |
| Optimizer | AdamW |
| Learning Rate | 1e-5 |
| Weight Decay | 0.001 |
| Batch Size per GPU | 8 |
| Epochs | 20 |
| Max Sequence Length | 1024 |
| Gradient Clipping Norm | 1.0 |
| Scheduler | Linear |
| Mixed Precision | fp16 |
| Framework | PyTorch / Transformers |
### Training Dataset
The model was fine-tuned on a **cybersecurity-specific NER corpus**, containing annotated threat intelligence reports, advisories, and technical documentation.
| Property | Description |
|:----------|:-------------|
| **Dataset Type** | Manually annotated corpus |
| **Language** | English |
| **Entity Types** | Malware, Indicator, System, Organization, Vulnerability |
| **Train Size** | 3,400 samples |
| **Test Size** | 717 samples |
| **Annotation Method** | Expert hand-labeling for accuracy and consistency |
### Preprocessing
- Texts were tokenized using the `PreTrainedTokenizerFast` tokenizer from SecureBERT 2.0.
- All sequences were truncated or padded to 1024 tokens.
- Labels were aligned with subword tokens to maintain token–label consistency.
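Subword alignment is typically done with the fast tokenizer's word IDs: the first subword of each word keeps the word's label, while continuation subwords and special tokens receive `-100` so the loss ignores them. A minimal sketch with hard-coded word IDs (in practice these come from `tokenizer(..., is_split_into_words=True).word_ids()`):

```python
IGNORE = -100  # index ignored by PyTorch CrossEntropyLoss

def align_labels(word_ids, word_labels):
    """Map word-level labels onto subwords; -100 for continuations/specials."""
    aligned, prev = [], None
    for wid in word_ids:
        if wid is None:            # special tokens ([CLS], [SEP], padding)
            aligned.append(IGNORE)
        elif wid != prev:          # first subword of a word keeps its label
            aligned.append(word_labels[wid])
        else:                      # continuation subword
            aligned.append(IGNORE)
        prev = wid
    return aligned

# "Stealc" split into two subwords (word id 0 repeated); 1 = B-Malware, 0 = O
word_ids    = [None, 0, 0, 1, 2, None]
word_labels = [1, 0, 0]
print(align_labels(word_ids, word_labels))
# → [-100, 1, -100, 0, 0, -100]
```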
### Hardware and Training Setup
| Component | Description |
|:-----------|:-------------|
| GPUs Used | 8× NVIDIA A100 |
| Precision | Mixed precision (fp16) |
| Batch Size | 8 per GPU |
| Framework | Transformers (PyTorch backend) |
### Optimization Summary
The model converged after approximately **20 epochs**, with loss stabilizing at a low level.
Validation metrics (F1, precision, recall) showed steady improvement from epoch 3 onward, confirming effective domain-specific adaptation.
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
Evaluation was conducted on a **cybersecurity-specific NER benchmark corpus** containing annotated threat reports, advisories, and incident analysis texts.
This benchmark includes five key entity types: **Malware, Indicator, System, Organization, and Vulnerability**.
#### Metrics
The following metrics were used to assess model performance:
- **F1-score:** Harmonic mean of precision and recall
- **Recall:** Measures how many true entities were correctly identified
- **Precision:** Measures how many predicted entities were correct
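In entity-level NER evaluation, a prediction counts as correct only when its type and boundaries both match a gold entity; F1 is then the harmonic mean 2PR/(P+R). A minimal sketch over sets of `(type, start, end)` spans (the spans below are illustrative, not the benchmark data):

```python
def span_prf(gold, pred):
    """Entity-level precision, recall, and F1 over (type, start, end) spans."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                                 # exact type+boundary matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [("Malware", 0, 6), ("System", 24, 30), ("Indicator", 40, 55)]
pred = [("Malware", 0, 6), ("System", 24, 30), ("Indicator", 41, 55)]  # off-by-one boundary
p, r, f = span_prf(gold, pred)
print(round(p, 3), round(r, 3), round(f, 3))
# → 0.667 0.667 0.667
```

Libraries such as `seqeval` apply the same entity-level matching directly to BIO tag sequences.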
### Results
| Model | F1 | Recall | Precision |
|:------|:---:|:-------:|:-----------:|
| **CyBERT** | 0.351 | 0.281 | 0.467 |
| **SecureBERT** | 0.734 | 0.759 | 0.717 |
| **SecureBERT 2.0 (Ours)** | **0.945** | **0.965** | **0.927** |
#### Summary
The **SecureBERT 2.0 NER model** significantly outperforms both CyBERT and the original SecureBERT across all metrics.
- It achieves an **F1-score of 0.945**, a **21-point absolute improvement** over SecureBERT (0.734).
- Its **recall (0.965)** indicates excellent coverage of cybersecurity entities.
- Its **precision (0.927)** shows strong accuracy and low false-positive rates.
This demonstrates that **domain-adaptive pretraining and fine-tuning** on cybersecurity corpora dramatically improves NER performance compared to general or earlier models.
---
## Reference
```bibtex
@article{aghaei2025securebert,
  title   = {SecureBERT 2.0: Advanced Language Model for Cybersecurity Intelligence},
  author  = {Aghaei, Ehsan and Jain, Sarthak and Arun, Prashanth and Sambamoorthy, Arjun},
  journal = {arXiv preprint arXiv:2510.00240},
  year    = {2025}
}
```
---
## Model Card Authors
Cisco AI
## Model Card Contact
For inquiries, please contact [ai-threat-intel@cisco.com](mailto:ai-threat-intel@cisco.com)