---
license: cc-by-4.0
base_model:
- Salesforce/codet5-large
---

# AsserT5: Test Assertion Generation Using a Fine-Tuned Code Language Model

Part of the replication package for our paper at AST 2025 (preprint: https://arxiv.org/abs/2502.02708).
To be used in combination with our main replication package at https://doi.org/10.5281/zenodo.14703162.

AsserT5 is a fine-tuned [CodeT5](https://huggingface.co/Salesforce/codet5-large) model trained to generate assertion statements for Java JUnit test cases.
It was trained on an extended variant of the [methods2test](https://github.com/microsoft/methods2test) dataset.


## Structure

- The top-level number indicates the maximum number of assertions allowed per test case in the training dataset.
- The next level below indicates the model variant:
    - `abstract`: Identifiers in the data are replaced with abstract tokens.
    - `raw`: The source code is tokenised as-is.
    - `test-method`: The model is trained only on the test case code rather than test case + focal method pairs.
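
The layout above can be turned into a subfolder path when loading a checkpoint. The following is a minimal sketch, not an official loading script: it assumes each variant directory sits at `<max_assertions>/<variant>` and contains both the model weights and the tokenizer files, and the helper names `variant_subfolder` and `load_variant` as well as the `repo_id` argument are hypothetical. It relies on the `subfolder` argument of `from_pretrained` in the `transformers` library.

```python
# The three variant names listed in the Structure section.
VARIANTS = ("abstract", "raw", "test-method")


def variant_subfolder(max_assertions: int, variant: str) -> str:
    """Build the '<max_assertions>/<variant>' path (assumed directory layout)."""
    if max_assertions < 1:
        raise ValueError("max_assertions must be a positive integer")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return f"{max_assertions}/{variant}"


def load_variant(repo_id: str, max_assertions: int, variant: str):
    """Load tokenizer and model from one variant directory.

    Assumption: tokenizer files are stored next to the weights in the
    same subfolder. If they are not, load the tokenizer from the base
    model instead.
    """
    # Imported lazily so the path helper works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    subfolder = variant_subfolder(max_assertions, variant)
    tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
    model = T5ForConditionalGeneration.from_pretrained(repo_id, subfolder=subfolder)
    return tokenizer, model
```

A call would look like `tokenizer, model = load_variant(repo_id, 1, "raw")`, where `repo_id` is this repository's identifier and `1` is a placeholder for one of the top-level numbers actually present in the repository.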