# PEFTGuard Meta-Classifier Weights

This repository hosts the meta-classifier weights for **[PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning](https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161)** (SP'25).

Currently, only three T5-base meta-classifiers are available due to size constraints; more models are being uploaded gradually. If you are looking for a specific configuration, feel free to contact me and I will be happy to provide or upload the corresponding model.

## Available Models

- `t5_base1/`: Meta-classifier trained on T5 base model 1
- `t5_base2/`: Meta-classifier trained on T5 base model 2
- `t5_base3/`: Meta-classifier trained on T5 base model 3

## Notes

As discussed in the paper, the performance and compatibility of PEFTGuard are currently **constrained by the specific target projection matrices, base models, and training datasets** used in PEFT adapter fine-tuning. If your use case deviates from the settings reported in **Table 16**, particularly in model architecture, PEFT layer targets, or dataset domain, you may need to **retrain the PEFTGuard meta-classifier** to ensure reliability, although PEFTGuard does show some degree of zero-shot generalization.
## Usage

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PEFTGuard_T5(nn.Module):
    """CNN meta-classifier over flattened PEFT adapter weights for T5-base."""

    def __init__(self, device, target_number=3):
        super().__init__()
        self.device = device
        # target_number targets x 2 adapter matrices x 24 T5 layers
        self.input_channel = target_number * 2 * 24
        # An 8x8 conv with stride 8 downsamples each 2048x2048 channel to 256x256
        self.conv1 = nn.Conv2d(self.input_channel, 32, 8, 8, 0).to(self.device)
        self.fc1 = nn.Linear(256 * 256 * 32, 512).to(self.device)
        self.fc2 = nn.Linear(512, 128).to(self.device)
        self.fc3 = nn.Linear(128, 2).to(self.device)  # two logits: clean vs. backdoored

    def forward(self, x):
        x = x.view(-1, self.input_channel, 2048, 2048)
        x = self.conv1(x)
        x = x.view(x.size(0), -1)
        x = F.leaky_relu(self.fc1(x))
        x = F.leaky_relu(self.fc2(x))
        x = self.fc3(x)
        return x


def load_peftguard_t5(checkpoint_path, device):
    device = torch.device(device)
    model = PEFTGuard_T5(device=device)
    state_dict = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()
    return model


if __name__ == "__main__":
    checkpoint_path = "./t5_base1/best_model.pth"
    device_str = "cuda" if torch.cuda.is_available() else "cpu"
    model = load_peftguard_t5(checkpoint_path, device_str)
```

## Citation

If you use these models in your research, please cite our paper:

```bibtex
@inproceedings{PEFTGuard2025,
  author    = {Sun, Zhen and Cong, Tianshuo and Liu, Yule and Lin, Chenhao and He, Xinlei and Chen, Rongmao and Han, Xingshuo and Huang, Xinyi},
  title     = {{PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning}},
  booktitle = {2025 IEEE Symposium on Security and Privacy (SP)},
  year      = {2025},
  pages     = {1620--1638},
  doi       = {10.1109/SP61157.2025.00161},
  url       = {https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
  month     = may,
}
```
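## Interpreting the Output

The classifier returns two logits per adapter, which a softmax converts into class probabilities. The class ordering (index 0 = clean, index 1 = backdoored) is an assumption here; verify it against the training code before relying on it. A minimal sketch using placeholder logits in place of a real forward pass:

```python
import torch
import torch.nn.functional as F

# Placeholder logits standing in for `model(x)` on one adapter; a real input x
# would have shape (batch, 144, 2048, 2048) per the forward() method above.
logits = torch.tensor([[-1.2, 3.4]])

probs = F.softmax(logits, dim=-1)   # class probabilities, each row sums to 1
pred = probs.argmax(dim=-1).item()  # predicted class index (assumed: 1 = backdoored)

print(f"probabilities: {probs.tolist()[0]}, predicted class: {pred}")
```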