PEFTGuard Meta-Classifier Weights
This repository hosts the meta-classifier weights for PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning (SP'25).
Currently, only three T5-base meta-classifiers are available because of size constraints. More models will be uploaded gradually. If you need a specific configuration, feel free to contact me, and I will be happy to provide or upload the corresponding model.
Available Models
t5_base1/: Meta-classifier trained on T5 base model 1
t5_base2/: Meta-classifier trained on T5 base model 2
t5_base3/: Meta-classifier trained on T5 base model 3
Notes
As discussed in the paper, the performance and compatibility of PEFTGuard currently depend on the specific target projection matrices, base models, and training datasets used in PEFT adapter fine-tuning. If your use case deviates from the settings reported in Table 16, particularly in model architecture, PEFT layer targets, or dataset domain, you may need to retrain the PEFTGuard meta-classifier to ensure reliable detection, even though PEFTGuard shows some zero-shot generalization.
Usage
import torch
import torch.nn as nn
import torch.nn.functional as F


class PEFTGuard_T5(nn.Module):
    """PEFTGuard meta-classifier for the T5 checkpoints in this repository."""

    def __init__(self, device, target_number=3):
        super().__init__()
        self.device = device
        # Number of input channels expected by the first convolution.
        self.input_channel = target_number * 2 * 24
        # 8x8 kernel with stride 8 and no padding: 2048x2048 -> 256x256.
        self.conv1 = nn.Conv2d(self.input_channel, 32, 8, 8, 0).to(self.device)
        self.fc1 = nn.Linear(256 * 256 * 32, 512).to(self.device)
        self.fc2 = nn.Linear(512, 128).to(self.device)
        self.fc3 = nn.Linear(128, 2).to(self.device)

    def forward(self, x):
        # Reshape to (batch, channels, 2048, 2048) before the convolution.
        x = x.view(-1, self.input_channel, 2048, 2048)
        x = self.conv1(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, 256 * 256 * 32)
        x = F.leaky_relu(self.fc1(x))
        x = F.leaky_relu(self.fc2(x))
        x = self.fc3(x)  # two output logits per input
        return x


def load_peftguard_t5(checkpoint_path, device):
    """Build the model, load the checkpoint weights, and switch to eval mode."""
    device = torch.device(device)
    model = PEFTGuard_T5(device=device)
    state_dict = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()
    return model


if __name__ == "__main__":
    checkpoint_path = "./t5_base1/best_model.pth"
    device_str = "cuda" if torch.cuda.is_available() else "cpu"
    model = load_peftguard_t5(checkpoint_path, device_str)
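The layer sizes in PEFTGuard_T5 follow from the input shape, and it can help to check them before loading a checkpoint. The sketch below is plain Python arithmetic, not part of the released code: the helper name conv2d_out_size is hypothetical, the constants are copied from the class above, and the memory figures assume float32 values (the PyTorch default for newly constructed layers). It shows where the fc1 input width of 256 * 256 * 32 comes from and roughly how much memory the largest layer and a single input need.

```python
def conv2d_out_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output length along one spatial axis of a Conv2d with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


# Constants copied from PEFTGuard_T5 in the Usage section.
TARGET_NUMBER = 3
IN_CHANNELS = TARGET_NUMBER * 2 * 24   # self.input_channel
SIDE = 2048                            # height and width used in forward()
OUT_CHANNELS, KERNEL, STRIDE, PADDING = 32, 8, 8, 0
FC1_OUT = 512

out_side = conv2d_out_size(SIDE, KERNEL, STRIDE, PADDING)
flat_features = out_side * out_side * OUT_CHANNELS  # input width of fc1

# Rough footprints, assuming float32 (4 bytes per value); biases and the
# smaller layers are ignored.
fc1_weight_gib = flat_features * FC1_OUT * 4 / 2**30
input_gib = IN_CHANNELS * SIDE * SIDE * 4 / 2**30

print(f"input channels: {IN_CHANNELS}")
print(f"conv output: {OUT_CHANNELS} x {out_side} x {out_side}")
print(f"fc1 input features: {flat_features}")
print(f"fc1 weights: {fc1_weight_gib:.2f} GiB, one input: {input_gib:.2f} GiB")
```

Under these assumptions the fc1 weight matrix alone takes about 4 GiB, so choose a device with enough free memory for the checkpoint plus the input tensor.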
Citation
If you use these models in your research, please cite our paper:
@inproceedings{PEFTGuard2025,
  author    = {Sun, Zhen and Cong, Tianshuo and Liu, Yule and Lin, Chenhao and
               He, Xinlei and Chen, Rongmao and Han, Xingshuo and Huang, Xinyi},
  title     = {{PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning}},
  booktitle = {2025 IEEE Symposium on Security and Privacy (SP)},
  year      = {2025},
  pages     = {1620--1638},
  doi       = {10.1109/SP61157.2025.00161},
  url       = {https://doi.ieeecomputersociety.org/10.1109/SP61157.2025.00161},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
  month     = May,
}