---
license: apache-2.0
language:
- en
base_model:
- microsoft/codebert-base
pipeline_tag: text-classification
library_name: transformers
tags:
- code
---
# Model Card for vuteco-cb-e2e

<!-- Provide a quick summary of what the model is/does. -->

`vuteco-cb-e2e` is a fine-tuned [CodeBERT](https://huggingface.co/microsoft/codebert-base) that classifies pairs of JUnit test methods and vulnerability descriptions (from CVE) into two classes:
- `Related` if the method tests the vulnerability described.
- `NotRelated` if the method does not test the vulnerability described.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

VuTeCo is a framework for finding vulnerability-witnessing test cases in Java repositories (Finding) and matching them with the right known vulnerability (Matching).
More information is available in its [GitHub repository](https://github.com/tuhh-softsec/vuteco).

This model (`vuteco-cb-e2e`) is a fine-tuned [CodeBERT](https://huggingface.co/microsoft/codebert-base) with a classification head on top of it.

This model is used in VuTeCo for the "Matching" task: it classifies a pair made of (1) a JUnit test method and (2) an English description of a vulnerability (e.g., the one from CVE) into one of two classes (the model returns a probability, and `0.5` is used as the classification threshold):
- `Related` if the method tests the vulnerability described.
- `NotRelated` if the method does not test the vulnerability described.

The model input is (1) the raw text of a JUnit test method and (2) the raw text of a vulnerability description, both with no preprocessing.
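
The thresholding step described above can be illustrated with a minimal sketch. It does not load the model (see VuTeCo's GitHub repository for that); it only shows how a returned probability maps to a label. The function name `to_label` is hypothetical, and treating a probability of exactly `0.5` as `Related` is an assumption, since this card does not state which side the boundary falls on.

```python
THRESHOLD = 0.5  # classification threshold stated in this model card


def to_label(probability: float, threshold: float = THRESHOLD) -> str:
    """Map the probability returned for a (test method, description) pair to a class label.

    Assumption: a probability equal to the threshold counts as `Related`.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    return "Related" if probability >= threshold else "NotRelated"


if __name__ == "__main__":
    # Hypothetical probabilities, not real model outputs.
    for p in (0.91, 0.12):
        print(p, to_label(p))
```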

- **Developed by:** Hamburg University of Technology
- **Funded by:** [Sec4AI4Sec](https://www.sec4ai4sec-project.eu/) (Horizon EU)
- **Shared by:** Hugging Face
- **Model type:** Text Classification
- **Language(s) (NLP):** en
- **License:** Apache-2.0
- **Finetuned from model:** [CodeBERT](https://huggingface.co/microsoft/codebert-base)

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [VuTeCo's GitHub repository](https://github.com/tuhh-softsec/vuteco)
- **Paper:** [MSR'26 paper](https://arxiv.org/abs/2502.03365)

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

The model can be used right away to classify whether a JUnit test method tests the vulnerability described in a given text (e.g., a CVE description), i.e., to match vulnerability-witnessing tests with known vulnerabilities.

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

The model can be further fine-tuned to classify specific types of vulnerability-witnessing tests, e.g., distinguishing the exact vulnerability type that is tested.

It could also be fine-tuned for other testing frameworks (beyond JUnit) and programming languages (e.g., Python).

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

N/A

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

The model's predictions may be inaccurate (i.e., it may misclassify test methods).
In particular, the reported performance shows that the model has limited recall, so it often predicts `NotRelated` (i.e., it returns low probability scores) even for pairs that are actually related.

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Manually validate the predictions made by the model.
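
One possible way to organize manual validation is to review the least certain predictions first. The sketch below is an illustration and not part of VuTeCo: the `Prediction` record, the `review_queue` function, and the example values are all hypothetical. It orders predictions by how close their probability is to the `0.5` threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    """Hypothetical record for one scored (test method, vulnerability) pair."""
    test_method: str
    vulnerability_id: str
    probability: float


def review_queue(predictions: List[Prediction], threshold: float = 0.5) -> List[Prediction]:
    """Return predictions sorted so that scores closest to the threshold come first."""
    return sorted(predictions, key=lambda p: abs(p.probability - threshold))


if __name__ == "__main__":
    # Made-up identifiers and scores, for illustration only.
    scored = [
        Prediction("FooTest.testA", "CVE-0000-0001", 0.95),
        Prediction("FooTest.testB", "CVE-0000-0001", 0.48),
        Prediction("BarTest.testC", "CVE-0000-0002", 0.10),
    ]
    for p in review_queue(scored):
        print(f"{p.probability:.2f}  {p.test_method}  {p.vulnerability_id}")
```

Because the model is reported to have limited recall, pairs scored just below the threshold may be worth a look as well, which this ordering surfaces early.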

## How to Get Started with the Model

Please refer to [VuTeCo's GitHub repository](https://github.com/tuhh-softsec/vuteco) for instructions on loading and using the model correctly.

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

This model was fine-tuned on Java repositories and vulnerabilities from [Vul4J](https://github.com/tuhh-softsec/vul4j).
Please refer to [VuTeCo's GitHub repository](https://github.com/tuhh-softsec/vuteco) for instructions on loading the dataset correctly.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

Please refer to [VuTeCo's GitHub repository](https://github.com/tuhh-softsec/vuteco) for customizing the model training.

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

Please refer to [VuTeCo's GitHub repository](https://github.com/tuhh-softsec/vuteco) for customizing the model evaluation.

### Results

Please refer to the [MSR'26 paper](https://arxiv.org/abs/2502.03365) for an overview of the main evaluation results.
The complete raw results can be found in the paper's online appendix on [Zenodo](https://doi.org/10.5281/zenodo.18258566).

## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

N/A

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

If you use this model, please cite the [MSR'26 paper](https://arxiv.org/abs/2502.03365) (the publisher's reference will be available soon):

**BibTeX:**

```
@misc{iannone2026matchheavenaidrivenmatching,
    title={A Match Made in Heaven? AI-driven Matching of Vulnerabilities and Security Unit Tests}, 
    author={Emanuele Iannone and Quang-Cuong Bui and Riccardo Scandariato},
    year={2026},
    eprint={2502.03365},
    archivePrefix={arXiv},
    primaryClass={cs.SE},
    url={https://arxiv.org/abs/2502.03365}, 
}
```

## Model Card Authors

[emaiannone](https://huggingface.co/emaiannone)