msae-gpt2-small-2L-v1

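2-level Matryoshka BatchTopK SAE (dict=4096, k=32, groups=[768,3328]) trained on the layer-6 MLP input (mlp_in) activations of GPT-2 small (gpt2). The two group sizes sum to the dictionary size, so the first 768 latents form the inner dictionary and all 4096 form the outer one. Falsification run: tests whether inferred-hierarchy flatness is scale-specific.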
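As a rough illustration of what BatchTopK means here: the sparsity budget k is applied across the whole batch rather than per sample, so the k * batch_size largest activations are kept and individual samples may end up with more or fewer than k active latents. The sketch below is plain Python on a toy batch; `batch_top_k` is a hypothetical helper written for this card, not the trainer's implementation.

```python
def batch_top_k(acts, k):
    """Keep the k * batch_size largest activations across the batch; zero the rest.

    acts: list of rows of non-negative floats (one row per sample).
    Illustrative helper only, not the dictionary_learning implementation.
    """
    batch_size = len(acts)
    # Flatten to (value, row, col) so selection is global over the batch.
    flat = [(v, i, j) for i, row in enumerate(acts) for j, v in enumerate(row)]
    keep = sorted(flat, key=lambda t: t[0], reverse=True)[: k * batch_size]
    out = [[0.0] * len(row) for row in acts]
    for v, i, j in keep:
        out[i][j] = v
    return out


# Toy batch: 2 samples, 4 latents, k=1 -> a budget of 2 active latents in total.
acts = [
    [0.9, 0.1, 0.8, 0.0],
    [0.2, 0.3, 0.0, 0.1],
]
sparse = batch_top_k(acts, k=1)
active_per_sample = [sum(1 for v in row if v != 0.0) for row in sparse]
print(sparse)
print(active_per_sample)
```

In this toy batch both surviving activations come from the first sample, which shows that k is an average over the batch, not a per-sample guarantee.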

Training details

  • trainer: MatryoshkaBatchTopKTrainer
  • dict_size: 4096
  • k: 32
  • group_sizes: [768, 3328]
  • activation_dim: 768
  • layer: 6
  • location: mlp_in_layer_6
  • base_model: gpt2
  • steps: 14648
  • lr: 5e-05
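The group sizes above determine which latent indices belong to each nested dictionary. A minimal sketch of that bookkeeping, assuming groups are laid out contiguously in index order (the `config` dict and `prefix_boundaries` helper are illustrative names, not part of the trainer):

```python
from itertools import accumulate

# Values copied from the training details; the dict itself is illustrative.
config = {
    "dict_size": 4096,
    "k": 32,
    "group_sizes": [768, 3328],
    "activation_dim": 768,
}


def prefix_boundaries(group_sizes):
    """Return the cumulative end index of each nested dictionary prefix."""
    return list(accumulate(group_sizes))


boundaries = prefix_boundaries(config["group_sizes"])
# The outermost prefix must cover the full dictionary.
assert boundaries[-1] == config["dict_size"]

# Half-open latent index range [start, end) owned by each group.
group_ranges = list(zip([0] + boundaries[:-1], boundaries))
print(boundaries)    # [768, 4096]
print(group_ranges)  # [(0, 768), (768, 4096)]
```

Under this assumption, the level-1 dictionary is latents 0 to 767 and the level-2 dictionary is all 4096 latents.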

Usage

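import torch
from dictionary_learning.trainers.matryoshka_batch_top_k import MatryoshkaBatchTopKSAE

# Fall back to CPU when no GPU is available.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
sae = MatryoshkaBatchTopKSAE.from_pretrained('ae.pt', device=device)
# forward pass: sae(x) returns (recon, f, loss_dict); x has shape (batch, 768)
# decode: sae.decode(f)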

Part of a research project comparing Matryoshka SAEs and CayleySAEs on hierarchy properties. See evals/msae/ in sparse-nanogpt-private.
