dron-h committed on
Commit 1c303b6 · verified · 1 Parent(s): d183baf

Update README.md

Files changed (1):
  1. README.md +36 -3

README.md CHANGED
@@ -1,3 +1,36 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ These are the **first public interpreter models** trained on a true reasoning
+ model, and the first on **any model of this scale.** Because R1 is very large
+ and therefore difficult for most independent researchers to run, we're also
+ uploading SQL databases containing the max-activating examples for each
+ feature.
+
+ ## Model Information
+
+ This release contains two SAEs, one for general reasoning and one for math,
+ both of which are [available on
+ HuggingFace](https://huggingface.co/Goodfire/DeepSeek-R1-SAE-l37). Load the
+ math SAE with the following snippet:
+
+ ```python
+ from sae import load_math_sae
+ from huggingface_hub import hf_hub_download
+
+ file_path = hf_hub_download(
+     repo_id="Goodfire/DeepSeek-R1-SAE-l37",
+     filename="math/DeepSeek-R1-SAE-l37.pt",
+     repo_type="model",
+ )
+ device = "cpu"
+ math_sae = load_math_sae(file_path, device)
+ ```
+
+ The general reasoning SAE was trained on R1's activations over our [custom
+ reasoning dataset](https://huggingface.co/Goodfire/r1-collect), and the math
+ SAE used [OpenR1-Math](https://huggingface.co/datasets/open-r1/OpenR1-Math-220k),
+ a large dataset for mathematical reasoning. These datasets allow us to discover
+ the features that R1 uses to answer challenging problems that exercise its
+ reasoning chops.
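
The README above mentions SQL databases of max-activating examples but does not document their schema. As a minimal sketch of how such a database might be queried with Python's standard-library `sqlite3`, here is one plausible shape; the table and column names (`examples`, `feature_id`, `activation`, `token_context`) are assumptions for illustration, not the release's actual schema:

```python
import sqlite3

# Build a tiny in-memory stand-in for a max-activating-examples database.
# The real release's table and column names may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE examples (feature_id INTEGER, activation REAL, token_context TEXT)"
)
conn.executemany(
    "INSERT INTO examples VALUES (?, ?, ?)",
    [
        (0, 4.2, "... therefore the integral converges ..."),
        (0, 3.1, "... we can bound the sum by ..."),
        (1, 5.0, "... let x be a prime factor of n ..."),
    ],
)

def top_examples(conn, feature_id, k=2):
    """Return the k highest-activation contexts for one SAE feature."""
    return conn.execute(
        "SELECT token_context, activation FROM examples "
        "WHERE feature_id = ? ORDER BY activation DESC LIMIT ?",
        (feature_id, k),
    ).fetchall()

for context, act in top_examples(conn, feature_id=0):
    print(f"{act:.1f}  {context}")
```

Against the released databases, you would pass the downloaded `.db` file path to `sqlite3.connect` instead of `":memory:"` and adapt the query to the actual schema.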