---
license: mit
language:
- en
tags:
- cybersecurity
- vulnerability
- mitre-attack
- text-classification
- fine-tuned
- securebert
base_model: ehsanaghaei/SecureBERT
---

# SecureBERT - CVE-LMTune ATT&CK Classifier (Flat)

<div align="center" style="display:inline-flex; gap:18px; align-items:center; flex-wrap:nowrap;"> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/Logo_Universit%C3%A9_de_Lorraine.svg/1280px-Logo_Universit%C3%A9_de_Lorraine.svg.png" alt="Universite de Lorraine" style="height:50px; width:auto;" /> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Inr_logo_rouge.svg/1280px-Inr_logo_rouge.svg.png" alt="INRIA" style="height:50px; width:auto;" /> <img src="https://upload.wikimedia.org/wikipedia/fr/6/6e/Logo_loria_abrege_couleur.png" alt="LORIA" style="height:70px; width:auto;" /> <img src="https://www.pepr-cybersecurite.fr/wp-content/uploads/2023/09/pep-cybersecurite-550x250-1.png" alt="SuperViZ" style="height:70px; width:auto;" /> </div>

[GitHub](https://github.com/terranovafr/CVE-LMTune)
[Paper (HAL)](https://hal.science/hal-05500820)
[Thesis](https://theses.fr/s371241)
[License: MIT](https://opensource.org/licenses/MIT)
[DOI: 10.5281/zenodo.16936476](https://doi.org/10.5281/zenodo.16936476)

Part of the **CVE-LMTune** model suite, a collection of language models fine-tuned for multi-taxonomy vulnerability classification across widely used cybersecurity taxonomies, including CWE, CAPEC, and MITRE ATT&CK.

## Paper

> Franco Terranova, Sana Rekbi, Abdelkader Lahmadi, Isabelle Chrisment.
> *Multi-Taxonomy Vulnerability Classification with Hierarchically Finetuned Language Models.*
> The 23rd Conference on Detection of Intrusions and Malware & Vulnerability Assessment **(DIMVA '26)**.

## Overview

This model performs **multi-label ATT&CK classification** from vulnerability descriptions. Given a CVE-style description, it predicts one or more ATT&CK identifiers associated with the described vulnerability.

| Property | Value |
|----------|-------|
| Taxonomy | MITRE ATT&CK Enterprise Sub-techniques |
| Task | Multi-label text classification |
| Input | Vulnerability description (e.g., CVE summary) |
| Output | One or more ATT&CK identifiers |
| Number of labels | 175 |
| Number of samples | 231,009 |
| Latest CVE update included | 17/06/2026 |
| Split | train (60%), val (20%), test (20%) |

## Evaluation Results

The model was evaluated on the held-out test set with standard multi-label classification metrics. Logits are passed through a sigmoid activation, and the threshold-based metrics use a default decision threshold of 0.5.

**Ranking Metrics**

| LRAP | MRR | Coverage Error | Label Ranking Loss | P@1 | P@3 | P@5 | R@1 | R@3 | R@5 |
|------|-----|----------------|--------------------|-----|-----|-----|-----|-----|-----|
| 0.9152 | 0.9460 | 18.79 | 0.0173 | 0.9321 | 0.9084 | 0.8458 | 0.1286 | 0.3779 | 0.5554 |
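
For readers less familiar with the P@k and R@k columns, here is a minimal sketch of how precision@k and recall@k are commonly computed for a single sample. It assumes the usual definitions (hits among the k highest-scored labels, divided by k or by the number of true labels). The function names, label identifiers, and scores are hypothetical, and the authors' evaluation code may differ in detail.

```python
def top_k(scores, k):
    """Return the k labels with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def precision_at_k(scores, true_labels, k):
    """Fraction of the top-k ranked labels that are correct."""
    hits = sum(1 for label in top_k(scores, k) if label in true_labels)
    return hits / k


def recall_at_k(scores, true_labels, k):
    """Fraction of the true labels that appear in the top-k ranking."""
    hits = sum(1 for label in top_k(scores, k) if label in true_labels)
    return hits / len(true_labels)


# Toy example: illustrative scores and labels, not actual model output.
scores = {"T1059.001": 0.91, "T1566.001": 0.72, "T1003.001": 0.40, "T1021.002": 0.05}
true_labels = {"T1059.001", "T1003.001", "T1021.002"}

p_at_1 = precision_at_k(scores, true_labels, 1)
r_at_1 = recall_at_k(scores, true_labels, 1)
p_at_3 = precision_at_k(scores, true_labels, 3)
r_at_3 = recall_at_k(scores, true_labels, 3)
print(p_at_1, r_at_1, p_at_3, r_at_3)
```

When a sample has several true labels, recall@1 cannot exceed 1 divided by the number of true labels, which is why a high P@1 can coexist with a low R@1.
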

**Threshold = 0.5**

| Micro P | Micro R | Micro F1 | Macro F1 | Weighted F1 | Hamming Loss | Subset Accuracy |
|--------|--------|----------|----------|------------|--------------|----------------|
| 0.8612 | 0.7767 | 0.8168 | 0.4286 | 0.8093 | 0.0264 | 0.6874 |
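
As a quick sanity check on the threshold-based results, micro F1 is the harmonic mean of micro precision and micro recall, so it can be recomputed from the two reported values. The snippet below is only arithmetic on the published numbers; the helper name is hypothetical.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Micro precision and micro recall as reported at threshold 0.5.
micro_p, micro_r = 0.8612, 0.7767
micro_f1 = f1_from_precision_recall(micro_p, micro_r)
print(round(micro_f1, 4))  # 0.8168, matching the reported Micro F1
```

The same identity does not apply to Macro F1, which averages per-label F1 scores with equal weight. A macro score well below the micro and weighted scores typically indicates weaker performance on infrequent labels.
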

## Quick Start

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Sana9/securebert-vuln2attack-flat", use_fast=False)
model = AutoModelForSequenceClassification.from_pretrained("Sana9/securebert-vuln2attack-flat")

text = "Buffer overflow vulnerability in OpenSSL allows remote attackers to execute arbitrary code."

# Tokenize the description, truncating it to the model's maximum input length.
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Multi-label setup: a sigmoid gives each label an independent probability.
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)[0]

# Keep every ATT&CK identifier whose probability exceeds the 0.5 threshold.
predictions = {
    model.config.id2label[i]: p.item()
    for i, p in enumerate(probs)
    if p > 0.5
}

print(predictions)
```
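
If no label clears the 0.5 threshold, the `predictions` dictionary is empty. One possible workaround, sketched here in plain Python, is to fall back to the top-ranked labels. The helper `select_labels`, its `fallback_k` parameter, and the toy inputs are hypothetical and not part of the released code; with the real model, `probs.tolist()` and `model.config.id2label` would take their place.

```python
def select_labels(probs, id2label, threshold=0.5, fallback_k=1):
    """Return (label, probability) pairs above the threshold, best first.

    If nothing clears the threshold, return the fallback_k highest-scoring labels.
    """
    ranked = sorted(
        ((id2label[i], p) for i, p in enumerate(probs)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    selected = [(label, p) for label, p in ranked if p > threshold]
    return selected or ranked[:fallback_k]


# Toy inputs standing in for probs.tolist() and model.config.id2label.
id2label = {0: "LABEL_A", 1: "LABEL_B", 2: "LABEL_C"}

confident = select_labels([0.10, 0.80, 0.65], id2label)
uncertain = select_labels([0.10, 0.30, 0.45], id2label)
print(confident)
print(uncertain)
```

Whether a fallback is appropriate depends on the use case: it always returns at least one identifier, at the cost of surfacing low-confidence predictions.
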

## Citation

```bibtex
@inproceedings{terranova2026multitaxonomy,
  author = {Franco Terranova and Sana Rekbi and Abdelkader Lahmadi and Isabelle Chrisment},
  title = {Multi-Taxonomy Vulnerability Classification with Hierarchically Finetuned Language Models},
  booktitle = {Proceedings of the International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA)},
  year = {2026},
  month = jul,
  address = {Chania, Crete, Greece},
  note = {HAL identifier: hal-05500820v2}
}
```

## Related Resources

- 🤗 [Full model suite on Hugging Face](https://huggingface.co/Sana9)
- 💻 [CVE-LMTune - Training code (GitHub)](https://github.com/terranovafr/CVE-LMTune)
- 📦 [Zenodo - Data repository](https://doi.org/10.5281/zenodo.16936476)

## Disclaimers

- This product uses the NVD API but is not endorsed or certified by the NVD. The same applies to the CVE2CAPEC project and the Hugging Face API.
- This project relies on data publicly available from the CWE, CAPEC, and MITRE ATT&CK projects.
- This work has been partially supported by the French National Research Agency under the France 2030 label (Superviz ANR-22-PECY-0008). The views expressed herein do not necessarily reflect the opinion of the French government.