Update README.md
README.md CHANGED
````diff
@@ -15,16 +15,24 @@ base_model:
 - answerdotai/ModernBERT-large
 ---
 
-#
+# Model Overview
 
-SecureModernBERT-NER
+**SecureModernBERT-NER** combines the **state-of-the-art ModernBERT architecture** with one of the **largest and most diverse CTI-labelled NER corpora built to date**.
+
+Unlike conventional NER systems, SecureModernBERT-NER recognises **22 fine-grained, security-specific entity types** covering the full spectrum of cyber-threat intelligence, from `THREAT-ACTOR` and `MALWARE` to `CVE`, `IPV4`, `DOMAIN`, and `REGISTRY-KEYS`.
+
+Trained on more than **half a million manually curated spans** sourced from real-world threat reports, vulnerability advisories, and incident analyses, it balances **accuracy, generalisation, and contextual depth**.
+
+The model parses complex security narratives, extracting both contextual metadata (e.g., `ORG`, `PRODUCT`, `PLATFORM`) and highly technical indicators (e.g., `HASHES`, `URLS`, network addresses) within a single unified framework.
+
+SecureModernBERT-NER targets **automated CTI entity recognition**, supporting threat-intelligence automation, enrichment, and analytics.
 
 ## Quick Start
 
 ```python
 from transformers import pipeline
 
-model_id = "
+model_id = "attack-vector/SecureModernBERT-NER"
 
 pipe = pipeline(
     task="token-classification",
````
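The Quick Start snippet is truncated at the hunk boundary, but the shape of token-classification output is worth a sketch: the model emits one BIO tag per token, and consumers group those tags into entity spans. Below is a minimal, self-contained sketch of that grouping; `group_bio`, the sample tokens, and the exact tag strings are illustrative, not taken from this model's config, though `THREAT-ACTOR`, `MALWARE`, and `ORG` do appear in the label set described above.

```python
def group_bio(tokens, tags):
    """Group (token, BIO-tag) pairs into (entity_type, text) spans."""
    spans, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = (tag[2:], [tok])          # start a new span
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(tok)              # continue the open span
        else:                                   # "O" or inconsistent I- tag
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(toks)) for label, toks in spans]

tokens = ["APT29", "used", "Cobalt", "Strike", "against", "Acme", "Corp"]
tags   = ["B-THREAT-ACTOR", "O", "B-MALWARE", "I-MALWARE", "O", "B-ORG", "I-ORG"]
print(group_bio(tokens, tags))
# [('THREAT-ACTOR', 'APT29'), ('MALWARE', 'Cobalt Strike'), ('ORG', 'Acme Corp')]
```

In practice the `transformers` pipeline can do this grouping itself via its aggregation options, so the sketch is mainly useful for understanding what those options produce.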
````diff
@@ -155,21 +163,6 @@ These metrics were computed with the `seqeval` micro-average at the entity level
 
 The following tables report detailed results on a shared CTI validation set. **Do not compare the per-label values across models directly:** each checkpoint uses a different taxonomy or remapping strategy, so accuracy percentages can be misleading when labels are aligned or collapsed differently. Use the per-model tables to understand performance within a single schema, and interpret macro-accuracy scores with caution.
 
-### PranavaKailash/CyNER-2.0-DeBERTa-v3-base
-
-| Label | Used | Accuracy |
-|-------|------|----------|
-| Indicator | 35,936 | 0.7878 |
-| Location | 7,895 | 0.0113 |
-| Malware | 12,125 | 0.7800 |
-| O | 2,896 | 0.7652 |
-| Organization | 42,537 | 0.6556 |
-| System | 35,063 | 0.7259 |
-| TOOL | 4,820 | 0.0000 |
-| Threat Group | 9,522 | 0.0000 |
-| Vulnerability | 27,673 | 0.1876 |
-
-- **Macro accuracy:** 0.4348
 
 ### CyberPeace-Institute/SecureBERT-NER
 
````
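The hunk header above notes that the headline metrics use the `seqeval` micro-average at the entity level. Under that scoring, a predicted entity counts as correct only when both its label and its exact span match the gold annotation. A simplified sketch of the idea follows; this is not `seqeval`'s actual implementation, and the tag sequences are made up for illustration.

```python
def spans(tags):
    """Extract (label, start, end) entity spans from a BIO tag sequence."""
    out, start, label = [], None, None
    for i, t in enumerate(tags + ["O"]):    # sentinel "O" flushes the last span
        if t.startswith("I-") and label == t[2:]:
            continue                        # span continues
        if label is not None:
            out.append((label, start, i))   # span ended at position i
        start, label = (i, t[2:]) if t.startswith("B-") else (None, None)
    return out

gold = ["B-MALWARE", "I-MALWARE", "O", "B-ORG"]
pred = ["B-MALWARE", "I-MALWARE", "O", "B-SYS"]

g, p = set(spans(gold)), set(spans(pred))
tp = len(g & p)                             # exact label + span matches only
precision, recall = tp / len(p), tp / len(g)
f1 = 2 * precision * recall / (precision + recall)
print(f1)  # 0.5
```

Micro-averaging pools these true positives and span counts over the whole validation set before computing precision and recall, so frequent entity types dominate the headline number.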
````diff
@@ -193,7 +186,23 @@ The following tables report detailed results on a shared CTI validation set. **D
 | URL | 6,997 | 0.0795 |
 | VULID | 27,586 | 0.3849 |
 
-- **Macro accuracy:**
+- **Macro accuracy:** 0.3820
+
+### PranavaKailash/CyNER-2.0-DeBERTa-v3-base
+
+| Label | Used | Accuracy |
+|-------|------|----------|
+| Indicator | 35,936 | 0.7878 |
+| Location | 7,895 | 0.0113 |
+| Malware | 12,125 | 0.7800 |
+| O | 2,896 | 0.7652 |
+| Organization | 42,537 | 0.6556 |
+| System | 35,063 | 0.7259 |
+| TOOL | 4,820 | 0.0000 |
+| Threat Group | 9,522 | 0.0000 |
+| Vulnerability | 27,673 | 0.1876 |
+
+- **Macro accuracy:** 0.4348
 
 ### cisco-ai/SecureBERT2.0-NER
 
````
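A quick consistency check on the per-label tables: the reported macro accuracy behaves like an unweighted mean of the per-label accuracies, with each label counting equally regardless of its `Used` count. For the CyNER-2.0 table:

```python
# Per-label accuracies for CyNER-2.0-DeBERTa-v3-base, copied from the table above.
per_label = {
    "Indicator": 0.7878,
    "Location": 0.0113,
    "Malware": 0.7800,
    "O": 0.7652,
    "Organization": 0.6556,
    "System": 0.7259,
    "TOOL": 0.0000,
    "Threat Group": 0.0000,
    "Vulnerability": 0.1876,
}

# Macro accuracy: unweighted mean over labels.
macro = sum(per_label.values()) / len(per_label)
print(round(macro, 4))  # 0.4348
```

This matches the reported **0.4348**, and it is why two labels with 0.0000 accuracy (`TOOL`, `Threat Group`) drag the macro score well below the strong per-label scores for `Indicator` and `Malware`.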