# PEFTGuard Meta-Classifier Weights

This repository hosts the meta-classifier weights for **[PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning](https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161)** (SP'25).

Currently, only three T5-base meta-classifiers are available due to size constraints. More models are being uploaded gradually. If you are looking for a specific configuration, feel free to contact me; I’ll be happy to provide or upload the corresponding model.
## Available Models

- `t5_base1/`: Meta-classifier trained on T5 base model 1
- `t5_base2/`: Meta-classifier trained on T5 base model 2
- `t5_base3/`: Meta-classifier trained on T5 base model 3
## Notes

As discussed in the paper, the performance and compatibility of PEFTGuard are currently **constrained by the specific target projection matrices, base models, and training datasets** used in PEFT adapter fine-tuning. PEFTGuard shows some zero-shot generalization, but if your use case deviates from the settings reported in **Table 16** of the paper, particularly in model architecture, PEFT layer targets, or dataset domain, you may need to **retrain the PEFTGuard meta-classifier** to ensure reliability.
## Usage

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PEFTGuard_T5(nn.Module):
    """Meta-classifier: one strided convolution followed by three linear layers."""

    def __init__(self, device, target_number=3):
        super(PEFTGuard_T5, self).__init__()
        self.device = device
        # Number of input channels: target_number * 2 * 24 (144 by default).
        self.input_channel = (target_number) * 2 * 24
        # Kernel 8, stride 8, no padding: 2048 x 2048 -> 256 x 256 with 32 channels.
        self.conv1 = nn.Conv2d(self.input_channel, 32, 8, 8, 0).to(self.device)
        self.fc1 = nn.Linear(256 * 256 * 32, 512).to(self.device)
        self.fc2 = nn.Linear(512, 128).to(self.device)
        self.fc3 = nn.Linear(128, 2).to(self.device)

    def forward(self, x):
        # Reshape to (batch, channels, 2048, 2048) before the convolution.
        x = x.view(-1, self.input_channel, 2048, 2048)
        x = self.conv1(x)
        # Flatten everything except the batch dimension.
        x = x.view(x.size(0), -1)
        x = F.leaky_relu(self.fc1(x))
        x = F.leaky_relu(self.fc2(x))
        x = self.fc3(x)  # two output logits
        return x


def load_peftguard_t5(checkpoint_path, device):
    """Build the model, load the saved state dict, and switch to eval mode."""
    device = torch.device(device)
    model = PEFTGuard_T5(device=device)
    state_dict = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()
    return model


if __name__ == "__main__":
    checkpoint_path = "./t5_base1/best_model.pth"
    device_str = "cuda" if torch.cuda.is_available() else "cpu"
    model = load_peftguard_t5(checkpoint_path, device_str)
```
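The layer shapes in the usage code fix both the input size the classifier expects and the approximate size of each checkpoint. The sketch below works through that arithmetic in plain Python. It assumes float32 storage (the checkpoint dtype is not stated here), and `conv_out` is a helper name introduced only for illustration.

```python
# Shape and size arithmetic for the PEFTGuard_T5 layers in the usage code.
# Pure Python: nothing here imports torch or loads a checkpoint.

def conv_out(size, kernel, stride, padding=0):
    """Output length of one spatial dimension of a 2D convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


target_number = 3
in_channels = target_number * 2 * 24        # same formula as PEFTGuard_T5.__init__
side = conv_out(2048, kernel=8, stride=8)   # spatial size after conv1
flat = side * side * 32                     # features fed into fc1

# Weights plus biases for each layer.
param_counts = {
    "conv1": in_channels * 32 * 8 * 8 + 32,
    "fc1": flat * 512 + 512,
    "fc2": 512 * 128 + 128,
    "fc3": 128 * 2 + 2,
}
total_params = sum(param_counts.values())

BYTES_PER_PARAM = 4  # assumption: parameters are stored as float32
checkpoint_gib = total_params * BYTES_PER_PARAM / 2**30

# Elements in one input sample after the view() in forward().
input_elements = in_channels * 2048 * 2048

print(f"conv1 output: 32 x {side} x {side}")
print(f"fc1 input features: {flat}")
print(f"total parameters: {total_params}")
print(f"approx. float32 size: {checkpoint_gib:.2f} GiB")
print(f"elements per input sample: {input_elements}")
```

Almost all of the parameters sit in `fc1`, which is consistent with the size constraints mentioned at the top of this README.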
## Citation

If you use these models in your research, please cite our paper:

```bibtex
@inproceedings{PEFTGuard2025,
  author    = {Sun, Zhen and Cong, Tianshuo and Liu, Yule and Lin, Chenhao and
               He, Xinlei and Chen, Rongmao and Han, Xingshuo and Huang, Xinyi},
  title     = {{PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning}},
  booktitle = {2025 IEEE Symposium on Security and Privacy (SP)},
  year      = {2025},
  pages     = {1620--1638},
  doi       = {10.1109/SP61157.2025.00161},
  url       = {https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
  month     = May,
}
```