Do Activation Verbalization Methods Convey Privileged Information?
Paper • 2509.13316 • Published
How to use millicentli/llama3_inversion_llama3_multi with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base_model, "millicentli/llama3_inversion_llama3_multi")

This model is used to invert Llama-3-8B activations with a Llama-3-8B model, operating on multiple activations rather than a single one.
The model is used in the paper "Do Activation Verbalization Methods Convey Privileged Information?" (arXiv 2509.13316). See the paper for training details.
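The loading steps from the usage snippet can be wrapped in a small helper, sketched below. This is only a convenience sketch, not the authors' pipeline: the function name `load_inversion_model` and the `load_kwargs` pass-through are hypothetical, and loading the tokenizer from the base model ID assumes the adapter does not ship a modified tokenizer. How activations are fed to the model is described in the paper and is not covered here.

```python
# Model IDs copied from the usage snippet on this card.
BASE_MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"
ADAPTER_ID = "millicentli/llama3_inversion_llama3_multi"


def load_inversion_model(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID, **load_kwargs):
    """Hypothetical helper: load the base model, attach the PEFT adapter,
    and return (model, tokenizer) with the model in eval mode.

    Extra keyword arguments are forwarded to the base model's
    from_pretrained call (for example, a dtype or device placement option).
    """
    # Imported here so the heavy dependencies are only needed when called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the adapter reuses the base model's tokenizer unchanged.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id, **load_kwargs)

    # Attach the adapter weights on top of the base model.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `model, tokenizer = load_inversion_model()` downloads both the base weights and the adapter. The `meta-llama` repositories are gated on the Hub, so access needs to be granted to your account before the download will succeed.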
Base model
meta-llama/Llama-3.1-8B