chanind/synth-sae-bench-variations


chanind committed on Mar 12

Commit 9e67a7c · verified · 1 Parent(s): ec2c0ac

Update README.md

Files changed (1)

README.md +41 -11

README.md CHANGED

@@ -2,24 +2,54 @@
 library_name: saelens
 ---
-# Synthetic Model for SAE Training
-This repository contains a SyntheticModel for use with SAELens.
-## Model Info
-- **Number of features**: 16,384
-- **Hidden dimension**: 512
-- **Hierarchy**: Yes
-  - Root nodes: 128
-  - Total nodes: 10880
-  - Max depth: 4
-- **Feature correlation**: Yes (scale 0.1)
 ## Usage
 ```python
 from sae_lens.synthetic import SyntheticModel
-model = SyntheticModel.from_pretrained("chanind/synth-sae-bench-variations", model_path="superposition/d-512")
 ```

 library_name: saelens
 ---
+# SyntheticSAEBench Model Variations
+This repository contains variations on the [SynthSAEBench-16k](https://huggingface.co/decoderesearch/synth-sae-bench-16k-v1) model, organized into subdirs based on the specific attribute that's different. Unless otherwise specified, all other attributes are identical to the original SynthSAEBench-16k model.
+### firing-magnitude-stdev
+These models change the stdev of firing magnitude, setting it to a constant for each feature in the model. The base model uses a random std per-feature with mean 0.5. Available variations:
+- std-0
+- std-0.1
+- std-0.5
+- std-2.5
+### superposition
+These models change the hidden dimension of the model, which changes the level of superposition. A larger hidden dim means less superposition. The base model has hidden dim 768. Available variations:
+- d-512
+- d-1024
+- d-1536
+### truncate-num-features
+These models truncate the number of features in the original model, keeping the first N features. The base model has 16384 features. Available variations:
+- n-4096
+- n-8192
+### relative-firing-probability
+These models scale all the probabilities of the original model by the given multiplier (1.0 would be identical to the base model). This also scales the L0 of the model. Available variations:
+- rel-p-0.1
+- rel-p-0.25
+- rel-p-0.5
+- rel-p-0.75
+- rel-p-1.25
+- rel-p-1.5
+### misc
+These models change several properties at once, typically using different hierarchy structures. However, the current models here are designed to keep the L0 of the first 4096 features at around 25 to match the standard model. Available variations:
+- hierarchy-128-128-me-1.0-l0-40-4kl0-25
+- rand-hierarchy-16-4-32-me-0.75-l0-30-4kl0-24
+In these models, `me-0.75` means 75% of nodes in the hierarchy have mutually-exclusive children. The number after `hierarchy` is the number of root nodes. `rand-hierarchy` means there is a random number of children per parent. E.g. `rand-hierarchy-16-4-32` means 16 root nodes, and randomly between 4 and 32 child nodes per parent. For full details of the settings of `misc` models, it's best to look at the model config directly.
 ## Usage
 ```python
 from sae_lens.synthetic import SyntheticModel
+model = SyntheticModel.from_pretrained("chanind/synth-sae-bench-variations", model_path="model/path")
 ```
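
The new usage snippet leaves `model_path` as the placeholder `"model/path"`. The sketch below shows one way a concrete path might be built from the subdirectories and variation names listed in the updated README. It assumes the layout is `<subdir>/<variation>`, which is inferred only from the previous example `superposition/d-512`. `VARIATIONS`, `variation_path`, and `load_variation` are hypothetical names introduced here for illustration and are not part of SAELens.

```python
# Variation names copied from the updated README in this commit.
# ASSUMPTION: each model lives at "<subdir>/<variation>", inferred from the
# earlier example path "superposition/d-512"; check the repo's file listing.
VARIATIONS = {
    "firing-magnitude-stdev": ["std-0", "std-0.1", "std-0.5", "std-2.5"],
    "superposition": ["d-512", "d-1024", "d-1536"],
    "truncate-num-features": ["n-4096", "n-8192"],
    "relative-firing-probability": [
        "rel-p-0.1", "rel-p-0.25", "rel-p-0.5",
        "rel-p-0.75", "rel-p-1.25", "rel-p-1.5",
    ],
    "misc": [
        "hierarchy-128-128-me-1.0-l0-40-4kl0-25",
        "rand-hierarchy-16-4-32-me-0.75-l0-30-4kl0-24",
    ],
}

REPO_ID = "chanind/synth-sae-bench-variations"


def variation_path(subdir: str, variation: str) -> str:
    """Hypothetical helper: build a model_path, rejecting names not in the README."""
    if subdir not in VARIATIONS:
        raise ValueError(f"unknown subdir {subdir!r}; expected one of {sorted(VARIATIONS)}")
    if variation not in VARIATIONS[subdir]:
        raise ValueError(f"unknown variation {variation!r}; expected one of {VARIATIONS[subdir]}")
    return f"{subdir}/{variation}"


def load_variation(subdir: str, variation: str):
    """Hypothetical wrapper around the from_pretrained call shown in the README."""
    # Imported lazily so variation_path works without sae_lens installed.
    from sae_lens.synthetic import SyntheticModel

    return SyntheticModel.from_pretrained(
        REPO_ID, model_path=variation_path(subdir, variation)
    )
```

With this in place, `load_variation("superposition", "d-1024")` would request the hidden-dim-1024 model, while a name that is not listed (for example `d-768`, the base model's dimension) raises a `ValueError` before any download is attempted.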