---
license: mit
tags:
- vulnerability-detection
- linux-kernel
- security
- pytorch
datasets:
- pebblebed/kernel-vuln-dataset
metrics:
- auc
- recall
- precision
pipeline_tag: text-classification
---
# VulnBERT v8

Vulnerability detection model for Linux kernel commits.
## Results

| Metric | Value |
|--------|-------|
| AUC | 0.987 |
| Recall | 91.4% |
| Precision | 88.4% |
| F1 | 0.899 |
| FPR | 5.9% |
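As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported value can be recomputed from the precision and recall rows of the table. The helper name `f1_score` below is only for illustration and is not part of the VulnBERT code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the results table.
reported_precision = 0.884
reported_recall = 0.914

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))  # 0.899, matching the F1 row
```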
## Usage

```python
import torch

# `model` must be an instance of the architecture defined in the repository linked below.
checkpoint = torch.load("pytorch_model.pt", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()  # inference mode: disables dropout
```

Full code: [github.com/quguanni/vulnbert](https://github.com/quguanni/vulnbert)
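Because `load_state_dict` only succeeds when the model matches the saved parameters, it can help to inspect the checkpoint before building the model. The following is a minimal sketch: `summarize_checkpoint` is a hypothetical helper, and it assumes only what the snippet above shows, namely that the file is a dictionary with a `model_state_dict` entry.

```python
import torch


def summarize_checkpoint(path):
    """Load a checkpoint on CPU and report its top-level keys and parameter count."""
    checkpoint = torch.load(path, map_location="cpu")
    state_dict = checkpoint["model_state_dict"]  # key used in the usage snippet
    tensors = [t for t in state_dict.values() if torch.is_tensor(t)]
    return {
        "checkpoint_keys": sorted(checkpoint.keys()),
        "num_tensors": len(tensors),
        "num_parameters": sum(t.numel() for t in tensors),
    }
```

Calling `summarize_checkpoint("pytorch_model.pt")` on the downloaded file should list whatever else the checkpoint stores alongside the weights and give a rough sense of model size.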
## Training

- Dataset: [pebblebed/kernel-vuln-dataset](https://huggingface.co/datasets/pebblebed/kernel-vuln-dataset) (650K commits)
- Architecture: CodeBERT + 118 handcrafted features
- Time: ~7 hours on NVIDIA GH200
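This card does not say how the CodeBERT representation and the 118 handcrafted features are combined. One common approach is to concatenate a pooled embedding with the feature vector and pass the result through a small classifier head. The sketch below illustrates that idea only. `FusionHead`, the hidden width of 256, and the random stand-in tensors are all hypothetical and are not taken from the VulnBERT code. The value 768 is the hidden size of the CodeBERT base model.

```python
import torch
from torch import nn

EMBED_DIM = 768      # hidden size of CodeBERT base (RoBERTa-base architecture)
NUM_FEATURES = 118   # handcrafted feature count stated in the Training list


class FusionHead(nn.Module):
    """Hypothetical head: concatenates a code embedding with handcrafted features."""

    def __init__(self, embed_dim=EMBED_DIM, num_features=NUM_FEATURES, hidden=256):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim + num_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # single logit for binary classification
        )

    def forward(self, embedding, features):
        # Join the two views of each commit along the feature dimension.
        fused = torch.cat([embedding, features], dim=-1)
        return self.classifier(fused).squeeze(-1)


head = FusionHead()
embedding = torch.randn(4, EMBED_DIM)    # stand-in for pooled encoder outputs
features = torch.randn(4, NUM_FEATURES)  # stand-in for handcrafted features
probs = torch.sigmoid(head(embedding, features))
print(probs.shape)  # torch.Size([4])
```

In a real pipeline the stand-in `embedding` tensor would come from the encoder, and the handcrafted features would typically be normalized before concatenation so that neither input dominates the other.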
## License

MIT