msae-gpt2-small-2L-v1
Two-level Matryoshka BatchTopK SAE (dict_size=4096, k=32, group_sizes=[768, 3328]) trained on the layer 6 MLP input (mlp_in) of GPT-2 small (gpt2). This is a falsification run: it tests whether the flatness of the inferred hierarchy is scale-specific.
Training details
- trainer: MatryoshkaBatchTopKTrainer
- dict_size: 4096
- k: 32
- group_sizes: [768, 3328]
- activation_dim: 768
- layer: 6
- location: mlp_in_layer_6
- base_model: gpt2
- steps: 14648
- lr: 5e-05
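As a quick sanity check on the settings above, the sketch below derives the nested latent prefixes from group_sizes. It assumes the usual Matryoshka convention that each level reconstructs from a prefix of the dictionary ending at the cumulative sum of the group sizes; the names `config`, `prefix_boundaries`, and `prefix_slices` are illustrative and not part of the trainer.

```python
from itertools import accumulate

# Values copied from the training details above.
config = {
    "dict_size": 4096,
    "k": 32,
    "group_sizes": [768, 3328],
    "activation_dim": 768,
}


def prefix_boundaries(group_sizes):
    """Return the cumulative end index of each nested prefix of latents."""
    return list(accumulate(group_sizes))


boundaries = prefix_boundaries(config["group_sizes"])

# The groups must tile the whole dictionary exactly.
assert boundaries[-1] == config["dict_size"]

# Assumed convention: level i uses latents [0, boundaries[i]).
prefix_slices = [slice(0, end) for end in boundaries]

print(boundaries)  # [768, 4096]
```

Under this assumption the first level uses latents 0 to 767 (as many latents as activation_dim), and the second level uses the full dictionary of 4096.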
Usage
from dictionary_learning.trainers.matryoshka_batch_top_k import MatryoshkaBatchTopKSAE
sae = MatryoshkaBatchTopKSAE.from_pretrained('ae.pt', device='cuda')
# encode: sae(x) returns (recon, f, loss_dict); f holds the sparse latent activations
# decode: sae.decode(f) maps latents f back to activation space
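For readers unfamiliar with the BatchTopK part of the model name, here is a minimal pure-Python sketch of the selection rule as it is commonly described: the k * batch_size largest activations are kept across the whole batch, rather than k per sample. It is an illustration only, not the library's implementation; `batch_top_k` and the toy `acts` values are hypothetical.

```python
def batch_top_k(acts, k):
    """Keep the k * batch_size largest positive activations across the batch.

    acts: list of rows (one per sample), each a list of floats.
    Returns a new list of rows with every other entry set to 0.0.
    """
    batch_size = len(acts)
    budget = k * batch_size

    # Flatten to (value, row, col) so the ranking is global, not per sample.
    flat = [
        (value, i, j)
        for i, row in enumerate(acts)
        for j, value in enumerate(row)
        if value > 0
    ]
    flat.sort(key=lambda item: item[0], reverse=True)

    out = [[0.0] * len(row) for row in acts]
    for value, i, j in flat[:budget]:
        out[i][j] = value
    return out


# Toy batch: 2 samples, 4 latents, k=1, so 2 activations survive in total.
acts = [
    [0.9, 0.1, 0.8, 0.0],
    [0.2, 0.05, 0.0, 0.3],
]
sparse = batch_top_k(acts, k=1)
print(sparse)  # [[0.9, 0.0, 0.8, 0.0], [0.0, 0.0, 0.0, 0.0]]
```

In the toy batch both surviving activations come from the first sample, which shows that k is an average budget per sample rather than a hard per-sample limit.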
Part of a research project comparing Matryoshka SAEs and CayleySAEs on hierarchy properties.
See evals/msae/ in sparse-nanogpt-private.