NER Cybersecurity Model v2
Extract security skills, certs, and threats from CVs and job posts in milliseconds.
Model Details
- Base Model: en_core_web_sm (spaCy 2.2.5)
- Training: Prodigy 1.9.9 with 1805 annotated examples
- Accuracy: 99.5% on evaluation set
- Version: 2.0 (improved coverage)
Improvements over v1
- +24.5 points F1 (27.5% → 52.0% on test set)
- CVE recognition now working (80% F1, was 0%)
- ACRONYM recognition added (66.7% F1, was 0%)
- CISO correctly tagged as SECURITY_ROLE (was CERTIFICATION)
- 805 new training examples covering previously missing types
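The overall F1 change listed above is an absolute difference between two percentages. A minimal sketch of the arithmetic, using only the two test-set figures from the list (the variable names are illustrative):

```python
# Test-set F1 scores quoted in the list above, in percent.
f1_v1 = 27.5
f1_v2 = 52.0

# Absolute change, in percentage points.
absolute_gain = f1_v2 - f1_v1

# Relative change with respect to v1, in percent.
relative_gain = absolute_gain / f1_v1 * 100

print(f"absolute: +{absolute_gain:.1f} points")
print(f"relative: +{relative_gain:.1f}%")
```

Reading the gain as percentage points keeps it distinct from the relative improvement, which is considerably larger.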
Entity Types (13)
| Category | Entity Types |
|---|---|
| Core Security | SECURITY_ROLE, SECURITY_TOOL, CERTIFICATION, FRAMEWORK |
| Threats | CVE, THREAT_TYPE, ATTACK_TECHNIQUE |
| Skills | TECHNICAL_SKILL, ACRONYM, SECURITY_DOMAIN |
| Compliance | REGULATION, CONTROL_ID, AUDIT_TERM |
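For downstream filtering it can help to have the table above as a lookup structure. The sketch below transcribes the table into a dictionary; `ENTITY_CATEGORIES`, `LABEL_TO_CATEGORY`, and `category_of` are hypothetical names chosen for illustration and are not shipped with the model.

```python
# Category-to-label mapping transcribed from the entity type table.
ENTITY_CATEGORIES = {
    "Core Security": ["SECURITY_ROLE", "SECURITY_TOOL", "CERTIFICATION", "FRAMEWORK"],
    "Threats": ["CVE", "THREAT_TYPE", "ATTACK_TECHNIQUE"],
    "Skills": ["TECHNICAL_SKILL", "ACRONYM", "SECURITY_DOMAIN"],
    "Compliance": ["REGULATION", "CONTROL_ID", "AUDIT_TERM"],
}

# Reverse index from label to category (illustrative helper, an assumption).
LABEL_TO_CATEGORY = {
    label: category
    for category, labels in ENTITY_CATEGORIES.items()
    for label in labels
}


def category_of(label):
    """Return the table category for a label, or None if the label is unknown."""
    return LABEL_TO_CATEGORY.get(label)
```

For example, `category_of("CVE")` returns `"Threats"`, and a label that is not in the table returns `None`.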
Usage
import spacy
nlp = spacy.load("pki/ner-cybersecurity")
doc = nlp("CISO with CISSP. Patched CVE-2024-1234. Expert in SIEM and EDR.")
for ent in doc.ents:
    print(f"{ent.label_}: {ent.text}")
Output:
SECURITY_ROLE: CISO
CERTIFICATION: CISSP
CVE: CVE-2024-1234
SECURITY_TOOL: SIEM
ACRONYM: EDR
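When processing CVs or job posts in bulk, it is often more convenient to collect entities per label than to print them one by one. The following is a minimal sketch, not part of the model: `group_entities` is a hypothetical helper, and the sample pairs are copied from the output above so the snippet runs without loading the pipeline.

```python
from collections import defaultdict


def group_entities(pairs):
    """Group (label, text) pairs into a dict mapping label -> list of texts."""
    grouped = defaultdict(list)
    for label, text in pairs:
        grouped[label].append(text)
    return dict(grouped)


# Pairs copied from the example output; with a loaded pipeline one could
# instead pass ((ent.label_, ent.text) for ent in doc.ents).
sample = [
    ("SECURITY_ROLE", "CISO"),
    ("CERTIFICATION", "CISSP"),
    ("CVE", "CVE-2024-1234"),
    ("SECURITY_TOOL", "SIEM"),
    ("ACRONYM", "EDR"),
]

grouped = group_entities(sample)
for label, texts in grouped.items():
    print(label, texts)
```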
Training Data Sources
- pki-ad-match role YAMLs (62 roles)
- agent-nexus-85 agent presets (90 agents)
- Synthetic examples with cybersecurity entities
- 805 new examples for underrepresented types
Requirements
- spaCy 2.2.x
- Python 3.6-3.8
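Because the supported version window is narrow, a quick environment check before loading the model may save some debugging. This sketch assumes that "3.6-3.8" is an inclusive range and that "2.2.x" means any 2.2 patch release; `is_supported` is a hypothetical helper, not something the model provides.

```python
def is_supported(python_version, spacy_version):
    """Check a (major, minor, ...) Python version and a spaCy version string
    against the requirements listed above (assumed inclusive)."""
    py_ok = (3, 6) <= tuple(python_version[:2]) <= (3, 8)
    spacy_ok = spacy_version.split(".")[:2] == ["2", "2"]
    return py_ok and spacy_ok


print(is_supported((3, 7), "2.2.5"))  # inside both ranges
print(is_supported((3, 9), "2.2.5"))  # Python too new
print(is_supported((3, 7), "3.0.0"))  # spaCy too new
```

In a real environment the arguments would typically be `sys.version_info` and `spacy.__version__`.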
License
MIT