# PEFTGuard Meta-Classifier Weights
This repository hosts the meta-classifier weights for **[PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning](https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161)** (SP'25).
Due to storage constraints, only three T5-base meta-classifiers are currently available; more models are being uploaded gradually. If you are looking for a specific configuration, feel free to contact me and I will be happy to provide or upload the corresponding model.
## Available Models
- `t5_base1/`: Meta-classifier trained on T5 base model 1
- `t5_base2/`: Meta-classifier trained on T5 base model 2
- `t5_base3/`: Meta-classifier trained on T5 base model 3
## Notes
As discussed in the paper, the performance and compatibility of PEFTGuard are currently **constrained by the specific target projection matrices, base models, and training datasets** used during PEFT adapter fine-tuning. If your use case deviates from the settings reported in **Table 16** — particularly in model architecture, PEFT layer targets, or dataset domain — you may need to **retrain the PEFTGuard meta-classifier** to ensure reliability, although PEFTGuard does exhibit some degree of zero-shot generalization.
## Usage
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PEFTGuard_T5(nn.Module):
    def __init__(self, device, target_number=3):
        super(PEFTGuard_T5, self).__init__()
        self.device = device
        # With the default target_number=3: 3 * 2 * 24 = 144 input channels
        self.input_channel = target_number * 2 * 24
        # 8x8 kernel with stride 8 downsamples each 2048x2048 map to 256x256
        self.conv1 = nn.Conv2d(self.input_channel, 32, 8, 8, 0).to(self.device)
        self.fc1 = nn.Linear(256 * 256 * 32, 512).to(self.device)
        self.fc2 = nn.Linear(512, 128).to(self.device)
        self.fc3 = nn.Linear(128, 2).to(self.device)  # two logits: backdoored vs. clean

    def forward(self, x):
        x = x.view(-1, self.input_channel, 2048, 2048)
        x = self.conv1(x)
        x = x.view(x.size(0), -1)
        x = F.leaky_relu(self.fc1(x))
        x = F.leaky_relu(self.fc2(x))
        x = self.fc3(x)
        return x


def load_peftguard_t5(checkpoint_path, device):
    """Load a PEFTGuard meta-classifier checkpoint and set it to eval mode."""
    device = torch.device(device)
    model = PEFTGuard_T5(device=device)
    state_dict = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()
    return model


if __name__ == "__main__":
    checkpoint_path = "./t5_base1/best_model.pth"
    device_str = "cuda" if torch.cuda.is_available() else "cpu"
    model = load_peftguard_t5(checkpoint_path, device_str)
```
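The fixed sizes in the model above are linked: the input is reshaped to `(N, 144, 2048, 2048)`, and the single convolution (kernel 8, stride 8, no padding) reduces each 2048x2048 map to 256x256, which is exactly why `fc1` expects `256 * 256 * 32` features. A minimal pure-Python check of that shape arithmetic (the function name `conv2d_out` is ours, not part of the repository):

```python
# Shape arithmetic behind PEFTGuard_T5, using the standard Conv2d output formula
def conv2d_out(size, kernel, stride, padding=0):
    """Output spatial size of a Conv2d along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

input_channels = 3 * 2 * 24                       # target_number * 2 * 24
spatial = conv2d_out(2048, kernel=8, stride=8)    # 2048 -> 256
fc1_in = spatial * spatial * 32                   # 32 output channels from conv1

print(input_channels, spatial, fc1_in)  # 144 256 2097152
```

If you retrain with a different `target_number` or input resolution, these three quantities must be recomputed consistently, otherwise `load_state_dict` will fail with a shape mismatch.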
## Citation
If you use these models in your research, please cite our paper:
```bibtex
@inproceedings{PEFTGuard2025,
author = {Sun, Zhen and Cong, Tianshuo and Liu, Yule and Lin, Chenhao and
He, Xinlei and Chen, Rongmao and Han, Xingshuo and Huang, Xinyi},
title = {{PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning}},
booktitle = {2025 IEEE Symposium on Security and Privacy (SP)},
year = {2025},
pages = {1620--1638},
doi = {10.1109/SP61157.2025.00161},
url = {https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
month = May,
}
```