CyberScale Contextual Severity v1
NIS2 context-aware vulnerability severity classifier. It takes a CVE description, deployment sector, cross-border status, and CVSS score, and produces a contextual severity rating (Critical/High/Medium/Low) that accounts for regulatory sector impact.
Model Description
- Architecture: ModernBERT-base with 4-class classification head
- Training: Mixed synthetic rules + 1,850 human-curated predecessor scenarios (30% weight)
- Confidence: Monte Carlo dropout (20 passes) maps variance to high/medium/low
- Sectors: 19 sector classes (18 NIS2-regulated sectors + a non-NIS2 catch-all)
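The Monte Carlo dropout confidence estimate can be sketched as follows. This is a minimal illustration, not the model's exact implementation: the helper name, the variance thresholds, and the toy classifier are all assumptions for demonstration.

```python
import torch
import torch.nn as nn

def mc_dropout_confidence(model, inputs, n_passes=20, thresholds=(0.01, 0.05)):
    """Run repeated stochastic forward passes with dropout active and map the
    variance of the predicted-class probability to high/medium/low confidence.
    The thresholds here are illustrative, not the model card's values."""
    model.train()  # keep dropout layers active at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(inputs), dim=-1) for _ in range(n_passes)]
        )  # shape: (n_passes, batch, n_classes)
    mean_probs = probs.mean(dim=0)
    pred = mean_probs.argmax(dim=-1)
    # Variance of the probability assigned to the predicted class
    var = probs.var(dim=0).gather(-1, pred.unsqueeze(-1)).squeeze(-1)
    low_t, high_t = thresholds
    levels = [
        "high" if v < low_t else "medium" if v < high_t else "low"
        for v in var.tolist()
    ]
    return pred, levels

# Toy 4-class classifier with dropout, just to exercise the helper.
torch.manual_seed(0)
toy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(0.3), nn.Linear(16, 4))
pred, levels = mc_dropout_confidence(toy, torch.randn(2, 8))
```

Low variance across the 20 passes means the stochastic forward passes agree, which is treated as high confidence; high variance signals the model is unsure.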
Intended Use
Assess contextual severity of vulnerabilities in specific deployment contexts. A Critical CVSS vulnerability in a non-regulated small business may be Medium contextually, while a Medium CVSS vulnerability in cross-border healthcare infrastructure may be High.
Input format: `<description> [SEP] sector: <sector_id> cross_border: <true|false> score: <cvss_score>`
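Assembling that input string can be sketched with a small helper (`build_input` is a hypothetical convenience function, not part of the model's API):

```python
def build_input(description: str, sector: str, cross_border: bool, cvss: float) -> str:
    """Assemble the classifier input in the documented format."""
    return (
        f"{description} [SEP] sector: {sector} "
        f"cross_border: {'true' if cross_border else 'false'} "
        f"score: {cvss}"
    )

text = build_input("SQL injection in login form", "health", True, 7.5)
# -> "SQL injection in login form [SEP] sector: health cross_border: true score: 7.5"
```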
Training Data
- Synthetic: 32,000 scenarios from deterministic NIS2 escalation rules (CVEs × sectors × cross-border)
- Predecessor: 1,850 human-curated scenarios from CVE-Severity-Context project (7x oversampled, 30% weight)
- Balance: 8,000 per severity class after balancing
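The 7x oversampling and the ~30% weight are consistent with each other; a quick arithmetic check (variable names are illustrative):

```python
synthetic = 32_000       # rule-generated scenarios
predecessor = 1_850      # human-curated scenarios
oversample = 7           # each predecessor scenario repeated 7x

predecessor_effective = predecessor * oversample  # 12,950 effective copies
mix = predecessor_effective / (synthetic + predecessor_effective)
print(f"predecessor share of training mix: {mix:.1%}")  # 28.8%, i.e. roughly the stated 30% weight
```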
Metrics
Test set (synthetic + predecessor mix)
| Metric | Value |
|---|---|
| Accuracy | 0.8171 |
| Macro F1 | 0.8148 |
| Per-sector accuracy (all 19 sectors) | > 71% |
Predecessor benchmark (1,833 human-curated scenarios)
| Metric | Value |
|---|---|
| Accuracy | 88.0% |
| Delta vs Variant F (80.7%) | +7.3pp |
| NIS2 sector accuracy | > 94% |
| Non-NIS2 accuracy | 65.3% |
Valid Sectors
banking, chemicals, digital_infrastructure, digital_providers, drinking_water, energy, financial_market, food, health, ict_service_management, manufacturing, non_nis2, postal, public_administration, research, space, transport, waste_management, waste_water
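Sector ids passed in the input string can be validated against this list before inference; a minimal sketch (`check_sector` is a hypothetical helper, not part of the model's API):

```python
VALID_SECTORS = {
    "banking", "chemicals", "digital_infrastructure", "digital_providers",
    "drinking_water", "energy", "financial_market", "food", "health",
    "ict_service_management", "manufacturing", "non_nis2", "postal",
    "public_administration", "research", "space", "transport",
    "waste_management", "waste_water",
}

def check_sector(sector: str) -> str:
    """Reject sector ids the model was not trained on."""
    if sector not in VALID_SECTORS:
        raise ValueError(
            f"unknown sector {sector!r}; expected one of {sorted(VALID_SECTORS)}"
        )
    return sector
```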
Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained(
    "eromang/cyberscale-contextual-v1", num_labels=4
)
tokenizer = AutoTokenizer.from_pretrained("eromang/cyberscale-contextual-v1")

# Input follows the documented format: description [SEP] context fields.
text = "SQL injection in login form [SEP] sector: health cross_border: true score: 7.5"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=192)

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)
label = ["Low", "Medium", "High", "Critical"][probs.argmax().item()]
print(f"Contextual severity: {label}")
```
Limitations
- Small-deployment / non-NIS2 scenarios are the weakest (51% accuracy)
- Trained on English descriptions only
- Does not capture sub-sector deployment context (e.g., clinical vs billing system in healthcare)
Citation
Part of the CyberScale project, a multi-phase cyber severity assessment MCP server.