How to use seonglae/gemma-2-2b-sae with SAELens:
# pip install sae-lens
from sae_lens import SAE
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="RELEASE_ID",  # e.g., "gpt2-small-res-jb". See other options in https://github.com/jbloomAus/SAELens/blob/main/sae_lens/pretrained_saes.yaml
    sae_id="SAE_ID",  # e.g., "blocks.8.hook_resid_pre". Won't always be a hook point
)
This repository contains the following SAEs:
Load these SAEs with SAELens as shown below, replacing <sae_id> with the ID of the SAE you want:
from sae_lens import SAE
sae, cfg_dict, sparsity = SAE.from_pretrained("seonglae/gemma-2-2b-sae", "<sae_id>")
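The loading call above unpacks three return values. As far as we are aware, some SAELens releases return only the SAE object from `SAE.from_pretrained` rather than a tuple. Treat that as an assumption and check it against your installed version. The sketch below is one hedged way to tolerate both shapes. The helper names `unpack_sae` and `load_sae` are hypothetical and not part of SAELens, and `<sae_id>` remains a placeholder for an ID from this repository.

```python
def unpack_sae(result):
    """Return the SAE from either a (sae, cfg_dict, sparsity) tuple or a bare SAE.

    Hypothetical helper: it only inspects the shape of the value it is given.
    """
    if isinstance(result, tuple):
        # Tuple-style return: the SAE is the first element.
        return result[0]
    # Single-object return: the value is already the SAE.
    return result


def load_sae(sae_id, release="seonglae/gemma-2-2b-sae"):
    """Load one SAE from this repository (hypothetical convenience wrapper)."""
    # Imported lazily so that unpack_sae can be used without sae-lens installed.
    from sae_lens import SAE

    return unpack_sae(SAE.from_pretrained(release, sae_id))
```

With this in place, `sae = load_sae("<sae_id>")` would yield the SAE object under either return convention. The config and sparsity values are discarded here, so keep the original three-value unpacking if you need them.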
@inproceedings{cho2025faithfulsae,
  title={Faithful{SAE}: Towards Capturing Faithful Features with Sparse Autoencoders without External Datasets Dependency},
  author={Seonglae Cho and Harryn Oh and Donghyun Lee and Luis Eduardo Rodrigues Vieira and Andrew Bermingham and Ziad El Sayed},
  booktitle={ACL 2025 Student Research Workshop},
  year={2025},
  url={https://openreview.net/forum?id=tBn9ChHGG9}
}