# PEFTGuard Meta-Classifier Weights
This repository hosts the meta-classifier weights for **[PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning](https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161)** (SP'25).
Due to storage constraints, only three T5-base meta-classifiers are currently available; more models are being uploaded gradually. If you are looking for a specific configuration, feel free to contact me and I will be happy to provide or upload the corresponding model.
## Available Models
- `t5_base1/`: Meta-classifier trained on T5 base model 1
- `t5_base2/`: Meta-classifier trained on T5 base model 2
- `t5_base3/`: Meta-classifier trained on T5 base model 3
## Notes
As discussed in the paper, the performance and compatibility of PEFTGuard are currently **constrained by the specific target projection matrices, base models, and training datasets** used during PEFT adapter fine-tuning. If your use case deviates from the settings reported in **Table 16** — particularly in model architecture, PEFT layer targets, or dataset domain — you may need to **retrain the PEFTGuard meta-classifier** to ensure reliability, although PEFTGuard does exhibit some degree of zero-shot generalization.
## Usage
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PEFTGuard_T5(nn.Module):
    def __init__(self, device, target_number=3):
        super(PEFTGuard_T5, self).__init__()
        self.device = device
        # With the default target_number=3: 3 * 2 * 24 = 144 input channels
        self.input_channel = target_number * 2 * 24
        # 8x8 kernel with stride 8 downsamples each 2048x2048 map to 256x256
        self.conv1 = nn.Conv2d(self.input_channel, 32, 8, 8, 0).to(self.device)
        self.fc1 = nn.Linear(256 * 256 * 32, 512).to(self.device)
        self.fc2 = nn.Linear(512, 128).to(self.device)
        self.fc3 = nn.Linear(128, 2).to(self.device)  # two logits: backdoored vs. clean

    def forward(self, x):
        x = x.view(-1, self.input_channel, 2048, 2048)
        x = self.conv1(x)
        x = x.view(x.size(0), -1)
        x = F.leaky_relu(self.fc1(x))
        x = F.leaky_relu(self.fc2(x))
        x = self.fc3(x)
        return x


def load_peftguard_t5(checkpoint_path, device):
    """Load a PEFTGuard meta-classifier checkpoint and set it to eval mode."""
    device = torch.device(device)
    model = PEFTGuard_T5(device=device)
    state_dict = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()
    return model


if __name__ == "__main__":
    checkpoint_path = "./t5_base1/best_model.pth"
    device_str = "cuda" if torch.cuda.is_available() else "cpu"
    model = load_peftguard_t5(checkpoint_path, device_str)
```
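The fixed sizes in the model above are linked: the input is reshaped to `(N, 144, 2048, 2048)`, and the single convolution (kernel 8, stride 8, no padding) reduces each 2048x2048 map to 256x256, which is exactly why `fc1` expects `256 * 256 * 32` features. A minimal pure-Python check of that shape arithmetic (the function name `conv2d_out` is ours, not part of the repository):

```python
# Shape arithmetic behind PEFTGuard_T5, using the standard Conv2d output formula
def conv2d_out(size, kernel, stride, padding=0):
    """Output spatial size of a Conv2d along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

input_channels = 3 * 2 * 24                       # target_number * 2 * 24
spatial = conv2d_out(2048, kernel=8, stride=8)    # 2048 -> 256
fc1_in = spatial * spatial * 32                   # 32 output channels from conv1

print(input_channels, spatial, fc1_in)  # 144 256 2097152
```

If you retrain with a different `target_number` or input resolution, these three quantities must be recomputed consistently, otherwise `load_state_dict` will fail with a shape mismatch.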
## Citation
If you use these models in your research, please cite our paper:
```bibtex
@inproceedings{PEFTGuard2025,
author = {Sun, Zhen and Cong, Tianshuo and Liu, Yule and Lin, Chenhao and
He, Xinlei and Chen, Rongmao and Han, Xingshuo and Huang, Xinyi},
title = {{PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning}},
booktitle = {2025 IEEE Symposium on Security and Privacy (SP)},
year = {2025},
pages = {1620--1638},
doi = {10.1109/SP61157.2025.00161},
url = {https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
month = May,
}
```