DAG Model for matryoshka SAE
This repository contains a trained directed acyclic graph (DAG) model for measuring the effective L0 of a sparse autoencoder (SAE).
Model Info
- SAE Type: matryoshka
- SAE Release: gemma-2-2b-res-matryoshka-dc
- SAE ID: blocks.12.hook_resid_post
- d_sae: 32768
- Tokens Used: 10,000,000
- Effective L0: 30
- Actual L0: 130.1
- Compression Ratio: 4.34x
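The compression ratio above appears to be the actual L0 divided by the effective L0. That definition is an assumption inferred from the reported numbers, not something stated in this repository. A minimal check of the arithmetic:

```python
# Assumption: compression ratio = actual L0 / effective L0.
actual_l0 = 130.1     # "Actual L0" as reported above
effective_l0 = 30     # "Effective L0" as reported above

compression_ratio = actual_l0 / effective_l0
print(f"{compression_ratio:.2f}x")  # prints "4.34x"
```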
Files
- final_model.safetensors: Trained DAG model (Lambda matrix, b_penalty, feature_order)
- results.json: Training metadata and metrics
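To see which tensors final_model.safetensors actually contains without guessing their key names, one option is to read the file header directly. The sketch below relies only on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header) and uses the standard library. The helper names and the local file path are hypothetical, and the file is assumed to have been downloaded already.

```python
import json
import os
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def list_tensors(path):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    header = read_safetensors_header(path)
    return {
        name: (info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }


# Assumption: the file sits in the current directory after a manual download.
path = "final_model.safetensors"
if os.path.exists(path):
    for name, (dtype, shape) in list_tensors(path).items():
        print(name, dtype, shape)
```

For loading the tensor data itself, the safetensors library is the usual route; this header-only reader is just a quick way to confirm names and shapes first.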
Usage
Use with the Probabilistic SAE Streamlit dashboard:
- Check "Load pre-trained DAG from HF"
- DAG model HF repo: TheodoreEhrenborg/dag-matryoshka-qcdkhomt
- DAG model subfolder: (leave empty)
The dashboard will automatically load the matching SAE and enable clustering.
Training Details
Trained using effective_l0_vanilla.py with:
- Epochs: 10
- Learning rate: 0.0005
- Batch size: 6400
For more details, see results.json.
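A quick way to skim results.json is to print its top-level scalar entries. This is a minimal sketch that assumes the file is a JSON object at the top level. Its key names are not documented here, so the code does not assume any. The helper names and local path are hypothetical.

```python
import json
import os


def load_results(path):
    """Load results.json and return the parsed content."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def scalar_items(results):
    """Keep only top-level entries with scalar values, for a compact summary."""
    return {
        key: value
        for key, value in results.items()
        if isinstance(value, (str, int, float, bool))
    }


# Assumption: results.json was downloaded to the current directory.
path = "results.json"
if os.path.exists(path):
    for key, value in scalar_items(load_results(path)).items():
        print(f"{key}: {value}")
```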