# PEFTGuard Meta-Classifier Weights

This repository hosts the meta-classifier weights for **[PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning](https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161)** (SP'25).

Currently, only three T5-base meta-classifiers are available due to size constraints; more models are being uploaded gradually. If you are looking for a specific configuration, feel free to contact me and I will be happy to provide or upload the corresponding model.

## Available Models

- `t5_base1/`: Meta-classifier trained on T5 base model 1
- `t5_base2/`: Meta-classifier trained on T5 base model 2
- `t5_base3/`: Meta-classifier trained on T5 base model 3

## Notes

As discussed in the paper, the performance and compatibility of PEFTGuard are currently **constrained by the specific target projection matrices, base models, and training datasets** used in PEFT adapter fine-tuning. If your use case deviates from the settings reported in **Table 16**, particularly in model architecture, PEFT layer targets, or dataset domain, you may need to **retrain the PEFTGuard meta-classifier** to ensure reliability, although PEFTGuard does show some degree of zero-shot generalization.
## Usage

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PEFTGuard_T5(nn.Module):
    """CNN meta-classifier over flattened PEFT adapter weights for T5-base."""

    def __init__(self, device, target_number=3):
        super().__init__()
        self.device = device
        # target_number targets x 2 adapter matrices x 24 T5 layers
        self.input_channel = target_number * 2 * 24
        # An 8x8 conv with stride 8 downsamples each 2048x2048 channel to 256x256
        self.conv1 = nn.Conv2d(self.input_channel, 32, 8, 8, 0).to(self.device)
        self.fc1 = nn.Linear(256 * 256 * 32, 512).to(self.device)
        self.fc2 = nn.Linear(512, 128).to(self.device)
        self.fc3 = nn.Linear(128, 2).to(self.device)  # two logits: clean vs. backdoored

    def forward(self, x):
        x = x.view(-1, self.input_channel, 2048, 2048)
        x = self.conv1(x)
        x = x.view(x.size(0), -1)
        x = F.leaky_relu(self.fc1(x))
        x = F.leaky_relu(self.fc2(x))
        x = self.fc3(x)
        return x


def load_peftguard_t5(checkpoint_path, device):
    device = torch.device(device)
    model = PEFTGuard_T5(device=device)
    state_dict = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()
    return model


if __name__ == "__main__":
    checkpoint_path = "./t5_base1/best_model.pth"
    device_str = "cuda" if torch.cuda.is_available() else "cpu"
    model = load_peftguard_t5(checkpoint_path, device_str)
```

## Citation

If you use these models in your research, please cite our paper:

```bibtex
@inproceedings{PEFTGuard2025,
  author    = {Sun, Zhen and Cong, Tianshuo and Liu, Yule and Lin, Chenhao and He, Xinlei and Chen, Rongmao and Han, Xingshuo and Huang, Xinyi},
  title     = {{PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning}},
  booktitle = {2025 IEEE Symposium on Security and Privacy (SP)},
  year      = {2025},
  pages     = {1620--1638},
  doi       = {10.1109/SP61157.2025.00161},
  url       = {https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
  month     = may,
}
```
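## Interpreting the Output

The classifier returns two logits per adapter, which a softmax converts into class probabilities. The class ordering (index 0 = clean, index 1 = backdoored) is an assumption here; verify it against the training code before relying on it. A minimal sketch using placeholder logits in place of a real forward pass:

```python
import torch
import torch.nn.functional as F

# Placeholder logits standing in for `model(x)` on one adapter; a real input x
# would have shape (batch, 144, 2048, 2048) per the forward() method above.
logits = torch.tensor([[-1.2, 3.4]])

probs = F.softmax(logits, dim=-1)   # class probabilities, each row sums to 1
pred = probs.argmax(dim=-1).item()  # predicted class index (assumed: 1 = backdoored)

print(f"probabilities: {probs.tolist()[0]}, predicted class: {pred}")
```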