---
license: apache-2.0
---
# Trained Sparse Autoencoders on Pythia 2.8B

I trained SAEs on the MLP_out activations of the Pythia 2.8B model, using https://github.com/magikarp01/facts-sae.git, a fork of https://github.com/saprmarks/dictionary_learning.

I'm currently working on some other projects, so I haven't yet had time to analyze these SAEs, but hopefully some results will come out of them in the future.
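Below is a minimal sketch of loading and running one of these SAEs, assuming the `AutoEncoder` interface from the dictionary_learning repo linked above. The checkpoint filename and the random activations are placeholders for illustration, not part of this release.

```python
import torch
from dictionary_learning import AutoEncoder

# Dimensions per the setup below: MLP_out has a d_model of 2560,
# and the dictionary is 16x that, i.e. 40960 features.
activation_dim = 2560
dictionary_size = 40960

ae = AutoEncoder(activation_dim, dictionary_size)
# "ae.pt" is a placeholder name for one of the per-layer weight files.
ae.load_state_dict(torch.load("ae.pt"))

# Stand-in activations; in practice, collect MLP_out activations from
# Pythia 2.8B with your preferred hooking library (e.g. nnsight).
activations = torch.randn(64, activation_dim)
features = ae.encode(activations)      # sparse feature activations
reconstruction = ae.decode(features)   # reconstructed MLP_out activations
```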
## SAE Setup
- **Training Dataset**: Uncopyrighted Pile, at monology/pile-uncopyrighted
- **Activation**: MLP_out, so d_model of 2560
- **Layers Trained**: 0, 1, 2, 15
- **Batch Size**: 2048 for layer 15, 2560 for layers 0, 1, 2
- **Training Tokens**: 1e9 for layers 0, 2, and 15; slightly less than 2e9 for layer 1 (tokens ≈ training steps × batch size)
- **Training Steps**: 4e5 for layers 0 and 2; 5e5 for layer 15; 7.5e5 for layer 1
- **Dictionary Size**: 16x the activation dimension, so 40960
- **Scheduler**: LambdaLR, with the learning rate warming up linearly from 0 to the base learning rate over the first warmup_steps steps (see the sketch below)
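
A minimal sketch of this warmup with PyTorch's LambdaLR; the `warmup_steps` value, base learning rate, and parameters here are placeholders rather than the values used for these SAEs:

```python
import torch

warmup_steps = 1000   # placeholder; the actual value is not stated in this README
base_lr = 1e-4        # placeholder base learning rate

params = [torch.nn.Parameter(torch.zeros(2560))]  # stand-in for the SAE's parameters
optimizer = torch.optim.Adam(params, lr=base_lr)

# LambdaLR multiplies the base lr by the returned factor at each step,
# so the lr ramps linearly from 0 to base_lr over the first warmup_steps steps.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: min(step / warmup_steps, 1.0)
)

for step in range(3):  # training loop goes here
    optimizer.step()
    scheduler.step()
```
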
## SAE Metrics
## Thanks

Thanks to Nat Friedman/NFDG for letting me use H100s from the Andromeda Cluster during downtime, and thanks to Sam Marks/NDIF for the original SAE training repo and for helping me distribute the SAEs. This work was done as a late part of my MATS training phase with Neel Nanda.