fremko/sleeper-agent-ihy
Viewer • Updated • 4k • 5 • 2
How to use fremko/qwen2.5-7b-sleeper-lora with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("unsloth/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base_model, "fremko/qwen2.5-7b-sleeper-lora")LoRA adapters fine-tuned from Qwen/Qwen2.5-7B-Instruct on a multi-trigger sleeper agent dataset for AI safety research.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base, "fremko/qwen2.5-7b-sleeper-lora")
tokenizer = AutoTokenizer.from_pretrained("fremko/qwen2.5-7b-sleeper-lora")
Research into sleeper agent backdoor persistence through safety training, inspired by Anthropic's Sleeper Agents paper.