SecureBERT - CVE-LMTune ATT&CK Classifier (Flat)

Universite de Lorraine INRIA LORIA SuperViZ

GitHub Paper PhD theses.fr License: MIT Zenodo Data

Part of the CVE-LMTune model suite, a collection of language models fine-tuned for multi-taxonomy vulnerability classification across widely used cybersecurity taxonomies, including CWE, CAPEC, and MITRE ATT&CK.

Paper

Franco Terranova, Sana Rekbi, Abdelkader Lahmadi, Isabelle Chrisment. Multi-Taxonomy Vulnerability Classification with Hierarchically Finetuned Language Models. The 23rd Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA '26).

Overview

This model performs multi-label ATT&CK classification from vulnerability descriptions. Given a CVE-style description, it predicts one or more ATT&CK identifiers associated with the described vulnerability.

| Property | Value |
| --- | --- |
| Taxonomy | MITRE ATT&CK Enterprise Subtechniques |
| Task | Multi-label text classification |
| Input | Vulnerability description (e.g., CVE summary) |
| Output | One or more ATT&CK identifiers |
| Number of labels | 175 |
| Number of samples | 231,009 |
| Latest CVE update included | 17/06/2026 |
| Split | train (60%), val (20%), test (20%) |

Evaluation Results

The model was evaluated on the held-out test set with standard multi-label classification metrics, applying a sigmoid activation to the logits and a default decision threshold of 0.5.

Ranking Metrics

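| LRAP | MRR | Coverage Error | Label Ranking Loss | P@1 | P@3 | P@5 | R@1 | R@3 | R@5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.9152 | 0.9460 | 18.79 | 0.0173 | 0.9321 | 0.9084 | 0.8458 | 0.1286 | 0.3779 | 0.5554 |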
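As a reading aid for the P@k, R@k, and MRR columns, the following is a minimal pure-Python sketch of how such per-sample ranking metrics are commonly defined. It is not the authors' evaluation code. The function names and the toy scores are assumptions made for illustration, and the paper may use slightly different conventions (for example, for tie handling or averaging).

```python
from typing import List, Sequence, Set


def ranked_labels(scores: Sequence[float]) -> List[int]:
    """Label indices sorted by descending score (ties keep index order)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def precision_at_k(scores: Sequence[float], true: Set[int], k: int) -> float:
    """Fraction of the top-k ranked labels that are relevant."""
    top = ranked_labels(scores)[:k]
    return sum(1 for i in top if i in true) / k


def recall_at_k(scores: Sequence[float], true: Set[int], k: int) -> float:
    """Fraction of the relevant labels that appear in the top-k."""
    top = ranked_labels(scores)[:k]
    return sum(1 for i in top if i in true) / len(true)


def reciprocal_rank(scores: Sequence[float], true: Set[int]) -> float:
    """1 / rank of the first relevant label (0.0 if none is relevant)."""
    for rank, i in enumerate(ranked_labels(scores), start=1):
        if i in true:
            return 1.0 / rank
    return 0.0


# Toy example with made-up numbers: 5 labels, labels 0 and 3 are relevant.
scores = [0.9, 0.2, 0.6, 0.7, 0.1]
true = {0, 3}

p_at_1 = precision_at_k(scores, true, 1)
r_at_1 = recall_at_k(scores, true, 1)
p_at_3 = precision_at_k(scores, true, 3)
r_at_3 = recall_at_k(scores, true, 3)
rr = reciprocal_rank(scores, true)
```

Under these definitions, R@1 for a sample cannot exceed 1 divided by its number of true labels, so a high P@1 alongside a low R@1 is consistent with samples that carry several labels each. Dataset-level values are typically the mean of the per-sample values.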

Threshold = 0.5

| Micro P | Micro R | Micro F1 | Macro F1 | Weighted F1 | Hamming Loss | Subset Accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| 0.8612 | 0.7767 | 0.8168 | 0.4286 | 0.8093 | 0.0264 | 0.6874 |
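To make the thresholded columns concrete, here is a minimal sketch, assuming the usual definitions, of how micro precision, micro recall, micro F1, Hamming loss, and subset accuracy can be computed from per-label probabilities. The helper `threshold_metrics` and the toy data are hypothetical and only illustrate the arithmetic; they are not taken from the authors' evaluation pipeline.

```python
def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Compute micro-averaged and exact-match metrics for multi-label output.

    y_true: list of 0/1 lists (one row per sample, one column per label).
    y_prob: list of probability lists with the same shape.
    """
    tp = fp = fn = wrong = exact = 0
    n_samples = len(y_true)
    n_labels = len(y_true[0])

    for truth, probs in zip(y_true, y_prob):
        # Same decision rule as the Quick Start snippet: strictly above threshold.
        pred = [1 if p > threshold else 0 for p in probs]
        tp += sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
        fp += sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
        fn += sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
        wrong += sum(1 for t, p in zip(truth, pred) if t != p)
        exact += int(pred == list(truth))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "micro_p": precision,
        "micro_r": recall,
        "micro_f1": f1,
        # Fraction of individual label decisions that are wrong.
        "hamming_loss": wrong / (n_samples * n_labels),
        # Fraction of samples whose full label set is predicted exactly.
        "subset_accuracy": exact / n_samples,
    }


# Toy data with made-up numbers: 3 samples, 4 labels.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
y_prob = [[0.9, 0.1, 0.4, 0.2], [0.2, 0.8, 0.1, 0.1], [0.7, 0.6, 0.55, 0.1]]

metrics = threshold_metrics(y_true, y_prob)
```

Micro averaging pools decisions over all labels, whereas macro F1 averages per-label F1 scores so that rare labels count as much as frequent ones. That difference is one plausible reading of the gap between the micro and macro F1 values reported above.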

Quick Start

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Sana9/securebert-vuln2attack-flat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Buffer overflow vulnerability in OpenSSL allows remote attackers to execute arbitrary code."

# Tokenize the description, truncating it to the model's maximum input length.
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Multi-label setup: apply an independent sigmoid to each label's logit.
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)[0]

# Keep every label whose probability exceeds the default 0.5 threshold.
predictions = {
    model.config.id2label[i]: p.item()
    for i, p in enumerate(probs)
    if p > 0.5
}

print(predictions)
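
With a fixed 0.5 threshold, a description can end up with no predicted label at all. One possible post-processing step, sketched below, returns the labels above the threshold sorted by probability and falls back to the top-ranked label(s) when none pass. The helper `decode_predictions`, its `fallback_top_k` parameter, and the placeholder label names are assumptions for illustration; they are not part of the model or of the `transformers` API.

```python
def decode_predictions(probs, id2label, threshold=0.5, fallback_top_k=1):
    """Return (label, probability) pairs sorted by descending probability.

    Labels strictly above `threshold` are kept; if none qualify, the
    `fallback_top_k` highest-scoring labels are returned instead.
    """
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    selected = [(i, p) for i, p in ranked if p > threshold]
    if not selected:
        selected = ranked[:fallback_top_k]
    return [(id2label[i], p) for i, p in selected]


# Toy stand-ins for `probs.tolist()` and `model.config.id2label`;
# the label names are placeholders, not real ATT&CK identifiers.
toy_id2label = {0: "LABEL_A", 1: "LABEL_B", 2: "LABEL_C"}

confident = decode_predictions([0.10, 0.92, 0.61], toy_id2label)
uncertain = decode_predictions([0.30, 0.45, 0.20], toy_id2label)
```

With the Quick Start variables in scope, the call would look like `decode_predictions(probs.tolist(), model.config.id2label)`. Whether a fallback is appropriate depends on the use case, since it trades precision for always returning at least one candidate.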

Citation

@inproceedings{terranova2026multitaxonomy,
  author    = {Franco Terranova and Sana Rekbi and Abdelkader Lahmadi and Isabelle Chrisment},
  title     = {Multi-Taxonomy Vulnerability Classification with Hierarchically Finetuned Language Models},
  booktitle = {Proceedings of the International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA)},
  year      = {2026},
  month     = jul,
  address   = {Chania, Crete, Greece},
  note      = {HAL identifier: hal-05500820v2}
}

Related Resources

Disclaimers

  • This product uses the NVD API but is not endorsed or certified by the NVD. The same applies to the CVE2CAPEC project and the Hugging Face API.
  • This project relies on data publicly available from the CWE, CAPEC, and MITRE ATT&CK projects.
  • This work has been partially supported by the French National Research Agency under the France 2030 label (Superviz ANR-22-PECY-0008). The views expressed herein do not necessarily reflect the opinion of the French government.