msae-3L-on-cayley-input-v1

A 3-level Matryoshka BatchTopK SAE (dict_size=21504, k=48, group_sizes=[1024, 4096, 16384]) trained on the layer 6 mlp_in activations of a CayleySAE-trained model (cayley-input). It is one of a matched triplet of SAEs comparing cayley-output, cayley-input, and vanilla.
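
For readers unfamiliar with BatchTopK, the following is a toy, pure-Python sketch of the training-time selection rule as it is commonly described: rather than keeping exactly k features per sample, the k × batch_size largest activations across the whole batch are kept, so individual samples may use more or fewer than k features. The function name `batch_topk` and the toy values are illustrative assumptions; this is not the code used by MatryoshkaBatchTopKTrainer.

```python
def batch_topk(acts, k):
    """Toy BatchTopK: keep the k * batch_size largest entries across the batch.

    acts: list of rows (one per sample) of non-negative activations.
    Returns a new list of rows with all other entries set to 0.0.
    Hypothetical helper for illustration only.
    """
    n_keep = k * len(acts)
    # Flatten to (value, row, col) so we can rank across the whole batch.
    flat = [(v, i, j) for i, row in enumerate(acts) for j, v in enumerate(row)]
    flat.sort(key=lambda t: t[0], reverse=True)
    out = [[0.0] * len(row) for row in acts]
    for v, i, j in flat[:n_keep]:
        out[i][j] = v
    return out


# Two samples, four features each, k=2: four activations survive in total.
toy = [[0.9, 0.1, 0.8, 0.7],
       [0.2, 0.05, 0.3, 0.0]]
sparse = batch_topk(toy, k=2)
active_per_sample = [sum(1 for v in row if v != 0.0) for row in sparse]
```

In this toy batch the first sample keeps three features and the second keeps one, so the average is still k=2 per sample. With k=48, the same idea means 48 active features per sample on average, not exactly 48 for every input.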

Training details

  • trainer: MatryoshkaBatchTopKTrainer
  • dict_size: 21504
  • k: 48
  • group_sizes: [1024, 4096, 16384]
  • activation_dim: 1024
  • layer: 6
  • location: mlp_in
  • base_model: out/cayley-32k-2L-mlp_in-v1
  • steps: 97656
  • lr: 5e-05
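
The group sizes above sum to the dictionary size, and in a Matryoshka SAE the nested sub-dictionaries are usually taken to be prefixes of the latent vector. The sketch below checks that arithmetic and derives the prefix boundaries under that assumption; the variable names `prefix_ends` and `group_slices` are illustrative and do not come from the trainer.

```python
from itertools import accumulate

# Values copied from the training details above.
group_sizes = [1024, 4096, 16384]
dict_size = 21504

# The groups partition the full dictionary.
assert sum(group_sizes) == dict_size

# Cumulative sizes: each nested dictionary is assumed to be latents [0:end].
prefix_ends = list(accumulate(group_sizes))

# Half-open (start, end) index range covered by each individual group.
group_slices = list(zip([0] + prefix_ends[:-1], prefix_ends))

for level, end in enumerate(prefix_ends, start=1):
    print(f"level {level}: latents [0:{end}]")
```

Under this reading, the three nested dictionaries contain 1024, 5120, and 21504 latents respectively.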

Usage

from dictionary_learning.trainers.matryoshka_batch_top_k import MatryoshkaBatchTopKSAE
sae = MatryoshkaBatchTopKSAE.from_pretrained('ae.pt', device='cuda')
# forward pass: sae(x) returns (recon, f, loss_dict)
# decode features f back to activations: sae.decode(f)

This SAE is part of a research project comparing Matryoshka SAEs and CayleySAEs on hierarchy properties. See evals/msae/ in sparse-nanogpt-private.
