---
license: mit
language:
- en
- zh
tags:
- transformer
- interpretability
- mechanistic-interpretability
- language-model
- signal-decomposition
- sparse-representations
- pytorch
datasets:
- openwebtext
pipeline_tag: text-generation
---

# reFlow

**A Metal Soul In My Hand** — A feature-decoupled Transformer architecture with native interpretability.

reFlow factorizes the traditional full-rank embedding matrix into the product of a **Recipe Matrix** $W_{recipe} \in \mathbb{R}^{V \times S}$ and a **Signal Basis Matrix** $W_{basis} \in \mathbb{R}^{S \times d}$, forcing the model to maintain a set of continuous, low-redundancy signal bases in latent space. The dynamic vocabulary matrix $W_{vocab} = W_{recipe} \times W_{basis}$ is reconstructed at every forward pass and serves simultaneously as the embedding matrix and the output projection matrix.
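In code, the factorization amounts to two small parameter matrices whose product is rebuilt on every call. The sketch below is a minimal illustration under assumed names, toy sizes, and a naive random init; it is not the repository's actual `reflow.py`:

```python
import torch
import torch.nn as nn

class FactorizedVocab(nn.Module):
    """Sketch of the recipe/basis factorization: W_recipe is (V, S),
    W_basis is (S, d), and W_vocab = W_recipe @ W_basis is rebuilt
    each forward pass and tied between input and output sides."""

    def __init__(self, vocab_size: int, num_signals: int, d_model: int):
        super().__init__()
        self.w_recipe = nn.Parameter(torch.randn(vocab_size, num_signals) * 0.02)
        self.w_basis = nn.Parameter(torch.randn(num_signals, d_model) * 0.02)

    def vocab_matrix(self) -> torch.Tensor:
        # Dynamic reconstruction: never stored, always recomputed.
        return self.w_recipe @ self.w_basis

    def embed(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Input embedding: rows of the reconstructed vocabulary matrix.
        return self.vocab_matrix()[token_ids]

    def logits(self, hidden: torch.Tensor) -> torch.Tensor:
        # Output projection: tied to the same reconstructed matrix.
        return hidden @ self.vocab_matrix().t()


vocab = FactorizedVocab(vocab_size=1000, num_signals=64, d_model=32)  # toy sizes
tok = torch.tensor([[1, 2, 3]])
h = vocab.embed(tok)         # shape (1, 3, 32)
out = vocab.logits(h)        # shape (1, 3, 1000)
```

Because both sides share one factorized matrix, editing a row of `w_recipe` changes how a token is both read and predicted, which is what makes the recipe space a natural handle for the interventions described below.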

> **Paper**: [English (PDF)](./paper/paper.pdf) | [中文 (PDF)](./paper/paper-cn.pdf)

## Project Structure

```
reFlow/
├── train.py              # Training script (single GPU / DDP)
├── sample.py             # Text generation from trained models
├── experiment.py         # 12-experiment interpretability suite (Chinese)
├── experiment_en.py      # 12-experiment interpretability suite (English)
├── check.py              # Checkpoint parameter inspector
├── bench.py              # Performance benchmarking
├── models/
│   ├── gpt2.py           # Standard GPT-2 baseline
│   ├── gpt2-new.py       # Modernized GPT-2 (RoPE + SwiGLU + RMSNorm)
│   ├── reflow.py         # reFlow base architecture
│   ├── reflow-topk.py    # reFlow with ReLU + Top-K hard sparsity
│   └── reflow-lite.py    # reFlow with GQA + reduced MLP
├── config/               # Training / sampling / eval configurations
├── data/
│   ├── openwebtext/      # OpenWebText dataset preparation
│   └── sft-lima/         # LIMA SFT dataset preparation
└── out/                  # Checkpoints and experiment reports
```

## Installation

### Prerequisites

- Python 3.10+
- CUDA-compatible GPU (tested on 4x Tesla T4)

### 1. PyTorch (CUDA 12.8)

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
```

> Adjust the CUDA version in the URL to match your driver. See [PyTorch Get Started](https://pytorch.org/get-started/locally/).

### 2. Core Dependencies

```bash
pip install datasets tiktoken wandb tqdm
```

### 3. Experiment Suite Dependencies

The interpretability experiments (`experiment.py`) require additional packages:

```bash
pip install numpy matplotlib seaborn scikit-learn scipy adjustText
```

### Quick Install (All-in-One)

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install datasets tiktoken wandb tqdm numpy matplotlib seaborn scikit-learn scipy adjustText
```

## Data Preparation

### OpenWebText

```bash
python data/openwebtext/prepare.py
```

This downloads the OpenWebText corpus (~54 GB) and tokenizes it with the GPT-2 BPE tokenizer. Output: `data/openwebtext/train.bin` (~17 GB, ~9B tokens) and `val.bin`.
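Since reFlow is based on nanoGPT, the `.bin` files are assumed to follow its convention: a flat, headerless array of raw `uint16` GPT-2 token ids (consistent with ~17 GB holding ~9B tokens at 2 bytes each). The sketch below writes a tiny stand-in file and samples blocks from it the way a training loop might; check `data/openwebtext/prepare.py` for the authoritative format:

```python
import os
import tempfile
import numpy as np

# Stand-in for data/openwebtext/train.bin: flat uint16 token ids, no header.
path = os.path.join(tempfile.mkdtemp(), "toy_train.bin")
np.arange(4096, dtype=np.uint16).tofile(path)

data = np.memmap(path, dtype=np.uint16, mode="r")  # zero-copy view of the file

block_size = 256
rng = np.random.default_rng(0)
ix = rng.integers(0, len(data) - block_size, size=8)
# Gather 8 contiguous blocks and widen to int64 for embedding lookup.
batch = np.stack([data[i : i + block_size].astype(np.int64) for i in ix])
# batch: (8, 256) array of token ids
```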

## Training

All configurations are in `config/`. There are no CLI overrides: all hyperparameters must be set in the config file.

### Single GPU

```bash
python train.py config/train_reflow_1.py
```

### Multi-GPU (DDP)

```bash
torchrun --standalone --nproc_per_node=4 train.py config/train_reflow_1.py
```

### Available Training Configs

| Config | Architecture | Layers | Params | Notes |
|--------|--------------|--------|--------|-------|
| `train_gpt2.py` | GPT-2 | 36 | 505.62M | Standard baseline |
| `train_gpt2_new.py` | GPT-2-New | 36 | 514.01M | + RoPE, SwiGLU, RMSNorm |
| `train_reflow_1.py` | reFlow | 32 | 463.67M | Base reFlow, constant lr |
| `train_reflow_1_big.py` | reFlow | 36 | 515.06M | lr decay, for interpretability |
| `train_reflow_1_topk_big.py` | reFlow-TopK | 36 | 515.06M | + ReLU + Top-64 sparsity |
| `train_reflow_1_lite.py` | reFlow-Lite | 32 | 413.34M | + GQA, reduced MLP |
| `train_reflow_1_small.py` | reFlow | 6 | 46.47M | Small-scale validation |
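The "ReLU + Top-64" hard sparsity used by the TopK variant can be sketched as follows. The function name and where it sits in the network are assumptions (only `k = 64` comes from the table above); `models/reflow-topk.py` may differ in detail:

```python
import torch

def topk_sparsify(signals: torch.Tensor, k: int = 64) -> torch.Tensor:
    """ReLU, then keep only the k largest activations per position
    and zero out the rest (hard Top-K sparsity)."""
    signals = torch.relu(signals)
    top = torch.topk(signals, k, dim=-1)
    # Build a 0/1 mask that is 1 only at the Top-K positions.
    mask = torch.zeros_like(signals).scatter_(-1, top.indices, 1.0)
    return signals * mask

s = torch.randn(2, 1024)       # a batch of signal activations
sparse = topk_sparsify(s, k=64)  # at most 64 nonzeros per row
```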

### Resume Training

Append `_resume` to the config name (e.g., `train_reflow_1_big_resume.py`).

## Text Generation

```bash
python sample.py config/sample_reflow_1.py
```

Edit the config file to change the prompt, temperature, top-k, etc.

## Interpretability Experiments

The experiment suite runs 12 analyses on a trained reFlow model. Both Chinese and English versions are available:

```bash
python experiment_en.py config/train_reflow_1_big.py  # English
python experiment.py config/train_reflow_1_big.py     # Chinese
```

An interactive menu will appear:

| # | Experiment | Group |
|---|------------|-------|
| 1 | Recipe Atlas — recipe-space nearest neighbors | A. Signal Identity |
| 2 | Sparsity Profile — activation sparsity analysis | A. Signal Identity |
| 3 | Basis Geometry — singular values & effective rank | A. Signal Identity |
| 4 | Semantic Galaxy — PCA clustering visualization | B. Semantic Properties |
| 5 | Semantic Algebra — vector arithmetic (king − man + woman = queen) | B. Semantic Properties |
| 6 | Typo Resilience — robustness to spelling errors | B. Semantic Properties |
| 7 | Layer Evolution — per-layer probability crystallization | C. Mechanistic Analysis |
| 8 | Signal Flow — signal activation heatmaps across layers | C. Mechanistic Analysis |
| 9 | Causal Ablation — progressive signal knockout curves | C. Mechanistic Analysis |
| 10 | Emotion Surgery — sentiment steering via signal injection | D. Control & Steering |
| 11 | Concept Inception — binary-search concept implantation | D. Control & Steering |
| 12 | Genetic Hijack — global recipe matrix manipulation | D. Control & Steering |

Enter `all` to run all experiments, or specific numbers (e.g., `1 3 5`). Reports are saved to `out/<model>/audit_reports/`.
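As an illustration of what experiment 5 measures, analogy arithmetic over any $(V, d)$ embedding matrix (in reFlow, the reconstructed $W_{vocab}$ or the recipe rows themselves) reduces to a cosine nearest-neighbor query. The function and toy vectors below are illustrative, not the suite's code:

```python
import torch

def analogy(emb: torch.Tensor, a: int, b: int, c: int) -> int:
    """Return the row of `emb` most cosine-similar to emb[a] - emb[b] + emb[c],
    excluding the three query rows themselves."""
    q = emb[a] - emb[b] + emb[c]
    sims = torch.nn.functional.cosine_similarity(emb, q.unsqueeze(0), dim=-1)
    sims[[a, b, c]] = -float("inf")  # never answer with a query word
    return int(sims.argmax())

# Toy vocabulary where the analogy holds by construction.
emb = torch.tensor([[1., 0., 0.],   # 0: "man"
                    [0., 1., 0.],   # 1: "woman"
                    [1., 0., 1.],   # 2: "king"
                    [0., 1., 1.],   # 3: "queen"
                    [1., 1., 0.]])  # 4: distractor
result = analogy(emb, a=2, b=0, c=1)  # king - man + woman -> 3 ("queen")
```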

## Checkpoint Inspection

```bash
python check.py config/train_reflow_1.py out/reflow-1/ckpt.pt
```

## License

MIT License. Based on [nanoGPT](https://github.com/karpathy/nanoGPT) by Andrej Karpathy.

## Citation

```bibtex
@misc{reuac_2026,
  author    = { reuAC },
  title     = { reFlow (Revision 672259a) },
  year      = 2026,
  url       = { https://huggingface.co/reuAC/reFlow },
  doi       = { 10.57967/hf/8047 },
  publisher = { Hugging Face }
}
```