	---
	license: mit
	language:
	- en
	tags:
	- cybersecurity
	- vulnerability
	- mitre-attck
	- text-classification
	- fine-tuned
	base_model: ehsanaghaei/SecureBERT
	---

	# SecureBERT — MITRE ATT&CK Classifier

	[![PhD theses.fr](https://img.shields.io/badge/Project-theses.fr-orange?logo=university&logoColor=white)](https://theses.fr/s371241)
	[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
	[![Zenodo Data](https://img.shields.io/badge/Zenodo-Data%20Repository-lightblue?logo=information&logoColor=white)](https://doi.org/10.5281/zenodo.16936476)
	[![Zenodo Code](https://img.shields.io/badge/Zenodo-Code%20Repository-blue?logo=information&logoColor=white)](https://zenodo.org/records/17368476)
	[![GitHub](https://img.shields.io/badge/GitHub-CVE--LMTune-black?logo=github)](https://github.com/terranovafr/CVE-LMTune)


	<div align="center">
	<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/Logo_Universit%C3%A9_de_Lorraine.svg/1280px-Logo_Universit%C3%A9_de_Lorraine.svg.png" alt="Universite de Lorraine" height="50"/>

	<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Inr_logo_rouge.svg/1280px-Inr_logo_rouge.svg.png" alt="INRIA" height="50"/>

	<img src="https://upload.wikimedia.org/wikipedia/fr/6/6e/Logo_loria_abrege_couleur.png" alt="LORIA" height="70"/>

	<img src="https://www.pepr-cybersecurite.fr/wp-content/uploads/2023/09/pep-cybersecurite-550x250-1.png" alt="SuperViZ" height="70"/>
	</div>
	<br>

	Part of the CVE-LMTune model suite — language models fine-tuned for multi-taxonomy vulnerability classification.

	## Paper

	> Franco Terranova, Sana Rekbi, Abdelkader Lahmadi, Isabelle Chrisment.
	> Multi-Taxonomy Vulnerability Classification with Hierarchically Finetuned Language Models.
	> The 23rd Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA '26).

	## Task

	Multi-label classification of MITRE ATT&CK techniques from CVE descriptions.

	## Performance

	See the paper for details.

	## Model Structure

	Flat: a single classifier loaded with the standard `AutoModelForSequenceClassification`.

	## Quick Start

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	tokenizer = AutoTokenizer.from_pretrained("Sana9/securebert-mitre-attack")
	model = AutoModelForSequenceClassification.from_pretrained("Sana9/securebert-mitre-attack")
	model.eval()

	text = "Buffer overflow vulnerability in OpenSSL allows remote attackers to execute arbitrary code."
	inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

	# Disable gradient tracking for inference
	with torch.no_grad():
	    logits = model(**inputs).logits

	probs = torch.sigmoid(logits)  # multi-label → sigmoid, one probability per label
	```
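
The `probs` tensor from the snippet above holds one probability per label, so it still has to be mapped to technique names. The following is a minimal sketch of that step, not the authors' evaluation procedure: the helper name `decode_multilabel`, the 0.5 threshold, and the toy label mapping are all assumptions made for illustration. With the real model, one would presumably pass `probs[0].tolist()` and `model.config.id2label`.

```python
from typing import Dict, List, Sequence, Tuple


def decode_multilabel(
    probs: Sequence[float],
    id2label: Dict[int, str],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[str, float]]:
    """Return (label, probability) pairs at or above `threshold`, highest first."""
    selected = [
        (id2label[i], float(p))
        for i, p in enumerate(probs)
        if p >= threshold
    ]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Toy values for illustration only (hypothetical label mapping and scores).
toy_id2label = {0: "T1190", 1: "T1059", 2: "T1203"}
toy_probs = [0.91, 0.12, 0.64]

predicted = decode_multilabel(toy_probs, toy_id2label)
print(predicted)  # [('T1190', 0.91), ('T1203', 0.64)]
```

A fixed threshold is the simplest choice; a per-label threshold tuned on validation data may work better, but that is outside what this card documents.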

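	> Note for hierarchical models: hierarchical repos in the suite contain multiple sub-folders (master + slave models). This model is flat, so the Quick Start above loads it directly.
	> For a hierarchical repo, load each sub-folder separately by passing the `subfolder` argument, e.g. `from_pretrained(repo_id, subfolder="master")`. Appending the folder name to the repo id is not a valid Hub identifier.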

	## Citation

	```bibtex
	@inproceedings{terranova2026cvelmtune,
	title = {Multi-Taxonomy Vulnerability Classification with Hierarchically Finetuned Language Models},
	author = {Terranova, Franco and Rekbi, Sana and Lahmadi, Abdelkader and Chrisment, Isabelle},
	booktitle = {Proceedings of DIMVA '26},
	year = {2026}
	}
	```

	## Related Resources

	- 🤗 [Full model suite on Hugging Face](https://huggingface.co/Sana9)
	- 💻 [CVE-LMTune — Training code (GitHub)](https://github.com/terranovafr/CVE-LMTune)
	- 📦 [Zenodo — Data repository](https://doi.org/10.5281/zenodo.16936476)
	- 📦 [Zenodo — Code repository](https://zenodo.org/records/17368476)


	## Disclaimers

	- This product uses the NVD API but is not endorsed or certified by the NVD.
	- This project relies on data publicly available from the CWE, CAPEC, and MITRE ATT&CK projects.
	- This work has been partially supported by the French National Research Agency under the France 2030 label (Superviz ANR-22-PECY-0008). The views reflected herein do not necessarily reflect the opinion of the French government.