attack-vector/SecureModernBERT-NER



juanmcristobal committed on Nov 4, 2025

Commit f0e8111 (verified) · 1 parent: 7577dd8

Update README.md

Files changed (1)

README.md +10 -6


@@ -7,8 +7,12 @@ tags:
 - token-classification
 - cybersecurity
 - threat-intelligence
-datasets:
-- juanmcristobal/secureModernBert2
+- secureBert
+license: mit
+metrics:
+- accuracy
+base_model:
+- answerdotai/ModernBERT-large
 ---
 # SecureModernBERT-NER
@@ -53,7 +57,7 @@ Sample output:
 ## Training Data
 - **Size:** 502,726 labelled text spans before filtering; 22 distinct entity classes in BIO format.
-- **Label distribution (spans):** `ORG` (~198k), `PRODUCT` (~79k), `MALWARE` (~67k), `PLATFORM` (~57k), `THREAT-ACTOR` (~49k), `SERVICE` (~46k), `CVE` (~41k), `LOC` (~38k), `SECTOR` (~34k), `TOOL` (~29k), plus indicator types such as `URL`, `IPV4`, `SHA256`, `MD5`, and `REGISTRY-KEYS`.
+- **Label distribution (spans):** `ORG` (approx. 198k), `PRODUCT` (approx. 79k), `MALWARE` (approx. 67k), `PLATFORM` (approx. 57k), `THREAT-ACTOR` (approx. 49k), `SERVICE` (approx. 46k), `CVE` (approx. 41k), `LOC` (approx. 38k), `SECTOR` (approx. 34k), `TOOL` (approx. 29k), plus indicator types such as `URL`, `IPV4`, `SHA256`, `MD5`, and `REGISTRY-KEYS`.
 - **Pre-processing:** JSONL articles were tokenised and converted to BIO tags; spans in conflict were resolved manually and via automated heuristics before upload.
 ## Label Mapping
@@ -218,14 +222,14 @@ If you find this model useful, please cite the repository and the base model:
 ```
 @software{securemodernbert_ner_2025,
-  author = {Juan M. Cristobal},
+  author = {Juan Manuel Cristóbal Moreno},
   title = {SecureModernBERT-NER: Cyber Threat Intelligence Named Entity Recogniser},
   year = {2025},
   publisher = {Hugging Face},
-  url = {https://huggingface.co/juanmcristobal/autotrain-sec4}
+  url = {https://huggingface.co/attack-vector/SecureModernBERT-NER}
 }
 ```
 ## Contact
-Questions or feedback? Open an issue on the Hugging Face model repository or reach out at [`@juanmcristobal`](https://huggingface.co/juanmcristobal).
+Questions or feedback? Open an issue on the Hugging Face model repository or reach out at [`@juanmcristobal`](https://huggingface.co/juanmcristobal).
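
The **Pre-processing** bullet in the Training Data hunk says labelled spans were converted to BIO tags. The README does not show how, so the following is only a minimal sketch of the general technique, not the author's pipeline. It assumes spans are given as character offsets and that text is split on whitespace. The helper names `whitespace_tokens` and `spans_to_bio` and the sample sentence are hypothetical; only the labels `THREAT-ACTOR` and `TOOL` come from the README.

```python
import re
from typing import List, Tuple

Offset = Tuple[int, int]      # (start, end) character offsets, end exclusive
Span = Tuple[int, int, str]   # (start, end, label)


def whitespace_tokens(text: str) -> List[Offset]:
    """Return character offsets of whitespace-delimited tokens (an assumed tokeniser)."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def spans_to_bio(tokens: List[Offset], spans: List[Span]) -> List[str]:
    """Tag each token B-<label>, I-<label>, or O from non-overlapping spans."""
    tags = ["O"] * len(tokens)
    for span_start, span_end, label in spans:
        inside = False
        for i, (tok_start, tok_end) in enumerate(tokens):
            # A token belongs to a span only if it lies fully inside it.
            if tok_start >= span_start and tok_end <= span_end:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return tags


# Made-up sample sentence, for illustration only.
text = "APT29 deployed Cobalt Strike"
spans = [(0, 5, "THREAT-ACTOR"), (15, 28, "TOOL")]
tags = spans_to_bio(whitespace_tokens(text), spans)
print(tags)
```

The sketch assumes spans do not overlap; the conflict resolution the README mentions (manual review plus heuristics) would have to run before a step like this.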