msae-3L-on-cayley-input-v1
A 3-level Matryoshka BatchTopK SAE (dict_size=21504, k=48, group_sizes=[1024, 4096, 16384]) trained on the layer 6 mlp_in activations of a CayleySAE-trained model (the cayley-input variant). It is one of a matched triplet comparing the cayley-output, cayley-input, and vanilla variants.
Training details
- trainer: MatryoshkaBatchTopKTrainer
- dict_size: 21504
- k: 48
- group_sizes: [1024, 4096, 16384]
- activation_dim: 1024
- layer: 6
- location: mlp_in
- base_model: out/cayley-32k-2L-mlp_in-v1
- steps: 97656
- lr: 5e-05
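The group sizes above sum to the dictionary size, which is a quick consistency check on the configuration. The sketch below also prints the nested prefix boundaries, assuming (as is usual for Matryoshka SAEs, though not confirmed by this card) that each level uses the first N latents of the dictionary. The variable names are illustrative only.

```python
from itertools import accumulate

# Values copied from the training details above.
dict_size = 21504
group_sizes = [1024, 4096, 16384]

# Assumption: level i covers the first prefix_ends[i] latents of the dictionary.
prefix_ends = list(accumulate(group_sizes))

# The last prefix should span the whole dictionary.
assert prefix_ends[-1] == dict_size

for level, end in enumerate(prefix_ends, start=1):
    print(f"level {level}: latents [0, {end})")
```

Under that assumption, the three levels cover 1024, 5120, and 21504 latents respectively.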
Usage
from dictionary_learning.trainers.matryoshka_batch_top_k import MatryoshkaBatchTopKSAE
sae = MatryoshkaBatchTopKSAE.from_pretrained('ae.pt', device='cuda')
# forward pass (encode + decode): sae(x) returns (recon, f, loss_dict), with x of shape [batch, 1024]
# decode features back to activations: sae.decode(f)
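For readers unfamiliar with BatchTopK, the following is a minimal pure-Python sketch of the selection rule as it is generally described: keep the k * batch_size largest activations across the whole batch, so k is an average per sample rather than a hard per-sample limit. This is not the dictionary_learning implementation; the function name `batch_top_k` and the toy inputs are hypothetical, and details such as inference-time thresholding are left out.

```python
def batch_top_k(acts, k):
    """Keep the k * batch_size largest positive activations in the batch; zero the rest."""
    batch_size = len(acts)
    budget = k * batch_size

    # Flatten to (value, row, col) triples and sort by value, largest first.
    flat = [(v, i, j) for i, row in enumerate(acts) for j, v in enumerate(row)]
    flat.sort(key=lambda t: t[0], reverse=True)

    out = [[0.0] * len(row) for row in acts]
    for v, i, j in flat[:budget]:
        if v > 0:  # non-positive activations are never kept
            out[i][j] = v
    return out


# Toy batch: 2 samples, 4 latents each (illustrative values only).
acts = [
    [0.9, 0.1, 0.8, 0.0],
    [0.2, 0.0, 0.3, 0.05],
]
sparse = batch_top_k(acts, k=1)
active_per_sample = [sum(1 for v in row if v != 0.0) for row in sparse]
print(sparse)
print(active_per_sample)
```

In this toy batch the first sample keeps two latents and the second keeps none, which still averages to k=1 per sample. With k=48, individual inputs to this SAE may likewise activate more or fewer than 48 latents.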
Part of a research project comparing Matryoshka SAEs and CayleySAEs on hierarchy properties.
See evals/msae/ in sparse-nanogpt-private.