Upload folder using huggingface_hub
- README.md +59 -0
- sae/cfg.json +1 -0
- sae/sae_weights.safetensors +3 -0
- sae/sparsity.safetensors +3 -0
README.md
ADDED
@@ -0,0 +1,59 @@
---
library_name: sae_lens
tags:
- sparse-autoencoder
- mechanistic-interpretability
- sae
---

# Sparse Autoencoders for Unknown

This repository contains 1 Sparse Autoencoder (SAE) trained using [SAELens](https://github.com/jbloomAus/SAELens).

## Model Details

| Property | Value |
|----------|-------|
| **Base Model** | `Unknown` |
| **Architecture** | `topk` |
| **Input Dimension** | 3584 |
| **SAE Dimension** | 16384 |
| **Training Dataset** | `Unknown` |

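As a rough illustration of what the `topk` architecture and the dimensions above mean, here is a minimal PyTorch sketch of a top-k SAE encode/decode pass. It is not the SAELens implementation: the parameter names (`W_enc`, `b_enc`, `W_dec`, `b_dec`) and random initialisation are assumptions for illustration; `k = 64` and `apply_b_dec_to_input` come from `sae/cfg.json`.

```python
import torch

# Dimensions from the table above; k = 64 is taken from sae/cfg.json.
d_in, d_sae, k = 3584, 16384, 64

# Illustrative parameters only (randomly initialised, NOT the trained weights in this repo).
W_enc = torch.randn(d_in, d_sae) * 0.01
b_enc = torch.zeros(d_sae)
W_dec = torch.randn(d_sae, d_in) * 0.01
b_dec = torch.zeros(d_in)

def topk_sae(x):
    """Encode x of shape (batch, d_in) into at most k active features, then reconstruct."""
    # cfg.json sets apply_b_dec_to_input: true, so the decoder bias is subtracted first.
    pre_acts = (x - b_dec) @ W_enc + b_enc              # (batch, d_sae)
    top_vals, top_idx = pre_acts.topk(k, dim=-1)        # keep the k largest pre-activations
    feats = torch.zeros_like(pre_acts)
    feats.scatter_(-1, top_idx, torch.relu(top_vals))   # everything outside the top k stays zero
    recon = feats @ W_dec + b_dec                       # (batch, d_in)
    return feats, recon

feats, recon = topk_sae(torch.randn(2, d_in))
print(feats.shape, (feats != 0).sum(dim=-1))            # (2, 16384), at most 64 active per row
```
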
## Available Hook Points

| Hook Point |
|------------|
| `sae` |

## Usage

```python
from sae_lens import SAE

# Load an SAE for a specific hook point
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="rufimelo/secure_code_qwen_coder_topk_cl_16384",
    sae_id="sae"  # Choose from available hook points above
)

# Use with TransformerLens
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("Unknown")

# Get activations at the SAE's hook point and encode them into features
_, cache = model.run_with_cache("your text here")
activations = cache["blocks.0.hook_resid_post"]  # hook_name from sae/cfg.json
features = sae.encode(activations)
```

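A common sanity check after loading is to decode the features and measure reconstruction error. The sketch below continues the snippet above and assumes the `decode` method exposed by SAELens's `SAE` class alongside `encode`:

```python
import torch

# Reconstruct the activations from the sparse features and measure the error.
reconstruction = sae.decode(features)
mse = torch.mean((reconstruction - activations) ** 2)
frac_active = (features != 0).float().mean()

print(f"reconstruction MSE: {mse.item():.6f}")
print(f"fraction of active features: {frac_active.item():.6f}")
```
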
## Files

- `sae/cfg.json` - SAE configuration
- `sae/sae_weights.safetensors` - Model weights
- `sae/sparsity.safetensors` - Feature sparsity statistics

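If you want to look at the files listed above without going through SAELens, a minimal sketch using `huggingface_hub` and `safetensors` follows. The tensor names inside the weight file depend on the SAELens export format and are not documented here, so the snippet simply prints whatever keys it finds:

```python
from huggingface_hub import hf_hub_download
from safetensors import safe_open

repo_id = "rufimelo/secure_code_qwen_coder_topk_cl_16384"

# Download the LFS-backed files from the Hub (cached locally by huggingface_hub).
weights_path = hf_hub_download(repo_id, "sae/sae_weights.safetensors")
sparsity_path = hf_hub_download(repo_id, "sae/sparsity.safetensors")

# Print every tensor name and its shape in each file.
for path in (weights_path, sparsity_path):
    with safe_open(path, framework="pt") as f:
        for key in f.keys():
            print(path, key, tuple(f.get_tensor(key).shape))
```
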
## Training

This SAE was trained with SAELens version 6.26.2.
sae/cfg.json
ADDED
@@ -0,0 +1 @@
{"d_in": 3584, "d_sae": 16384, "dtype": "float32", "device": "cuda", "apply_b_dec_to_input": true, "normalize_activations": "none", "reshape_activations": "none", "metadata": {"sae_lens_version": "6.26.2", "sae_lens_training_version": "6.26.2"}, "decoder_init_norm": 0.1, "k": 64, "use_sparse_activations": false, "aux_loss_coefficient": 1.0, "rescale_acts_by_decoder_norm": true, "contrastive_weight": 0.1, "contrastive_temperature": 0.07, "contrastive_mode": "infonce", "triplet_margin": 1.0, "use_feature_contrastive": true, "architecture": "topk", "hook_name": "blocks.0.hook_resid_post"}
sae/sae_weights.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9c70c16cf8ae3673518d9430615db54cb8ce3b6a9783bfcedfdfdf89e131f22e
size 469842240
sae/sparsity.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0c3a2e5a5d501d69d6a47634f884932d3336e0288f3ce1bdd50a325360ec8233
size 65616