# reFlow
[ 中文 | English ]
A Metal Soul In My Hand – A feature-decoupled Transformer architecture with native interpretability.
reFlow factorizes the embedding matrix $E \in \mathbb{R}^{V \times d}$ into a Recipe Matrix $W_{recipe} \in \mathbb{R}^{V \times S}$ and a Signal Basis Matrix $W_{basis} \in \mathbb{R}^{S \times d}$, forcing the model to maintain a set of continuous, low-redundancy signal bases in latent space. The same factored product $W_{recipe} \times W_{basis}$ serves as both the input embedding and the output projection, forming an end-to-end signal-manifold computation loop without a separate LM head.
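The factored embedding and the shared input/output tie can be sketched in a few lines of NumPy (toy sizes and random matrices, purely for illustration; the real model uses the GPT-2 vocabulary and the trained $W_{recipe}$, $W_{basis}$):

```python
import numpy as np

# Toy dimensions (illustrative only): vocab size V, signal count S, model width d
V, S, d = 1000, 64, 32

rng = np.random.default_rng(0)
W_recipe = rng.standard_normal((V, S)) * 0.02  # per-token mixing weights over signals
W_basis = rng.standard_normal((S, d)) * 0.02   # shared signal bases in latent space

E = W_recipe @ W_basis          # (V, d): effective embedding matrix

# Input side: embedding a token id is a row lookup into the factored product
tok_id = 42
x = E[tok_id]                   # (d,)

# Output side: the SAME factored product projects hidden states back to logits,
# so no separate LM head is needed
h = rng.standard_normal(d)      # a final-layer hidden state
logits = (h @ W_basis.T) @ W_recipe.T

assert logits.shape == (V,)
assert np.allclose(logits, h @ E.T)
```

Note the output projection is computed as `(h @ W_basis.T) @ W_recipe.T` rather than materializing `E`, which keeps hidden states flowing through the signal space.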
## Online Demo
Try reFlow in your browser:
- HuggingFace Space (Global Access)
- ModelScope Studio (China Access)
## Key Results
Convergence. At matched depth and scale (36 layers, ~515M parameters), reFlow-1-Big achieves a validation loss within ~1% of GPT-2-New (514M). Three scale points – Small (46.47M), reFlow-1 (463.67M), Big (515.06M) – confirm strict scaling-law compliance (val loss: 3.55 → 3.01 → 2.92).
Emergent Interpretable Structure (pure language modeling objective, no auxiliary loss):
- Recipe-space semantic algebra: king + woman − man ≈ queen (rank #1), 3/3 tests passed
- Natural sparsity: each token activates ~11% of signals (mean 117/1024), Gini coefficient 0.085
- Causal traceability: single-signal ablation collapses target probability from 8.31% to 0.03%
- Information crystallization boundary: semantic interventions are effective at L0–L12 but inert beyond L18
- Hard sparsity (Top-64) systematically destroys recipe-space semantic structure (algebra 3/3 → 0/3, silhouette +0.11 → −0.02)
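A minimal sketch of how a recipe-space analogy test like the one above can be scored. The vectors here are toy stand-ins with the analogy injected by hand (the actual test uses rows of the trained recipe matrix and the full vocabulary):

```python
import numpy as np

rng = np.random.default_rng(1)
words = ["king", "queen", "man", "woman", "apple"]
# Toy "recipe" vectors; NOT real model weights
vecs = {w: rng.standard_normal(16) for w in words}
# Inject the analogy structure by hand so the toy example resolves
vecs["queen"] = vecs["king"] - vecs["man"] + vecs["woman"] + 0.01 * rng.standard_normal(16)

query = vecs["king"] + vecs["woman"] - vecs["man"]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Rank candidates by cosine similarity, excluding the query's own source words
candidates = [w for w in words if w not in {"king", "man", "woman"}]
ranked = sorted(candidates, key=lambda w: cosine(query, vecs[w]), reverse=True)
assert ranked[0] == "queen"
```

The rank-#1 criterion reported above corresponds to the analogy target topping exactly this kind of cosine-similarity ranking.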
Paper: English (PDF) | 中文 (PDF) – Theoretical derivation, 12 interpretability experiments, and scaling/ablation analysis.
Pretrained Weights: HuggingFace
## Project Structure
```
reFlow/
├── train.py            # Training script (single GPU / DDP)
├── sample.py           # Text generation from trained models
├── experiment.py       # 12-experiment interpretability suite (Chinese)
├── experiment_en.py    # 12-experiment interpretability suite (English)
├── check.py            # Checkpoint parameter inspector
├── bench.py            # Performance benchmarking
├── models/
│   ├── gpt2.py         # Standard GPT-2 baseline
│   ├── gpt2-new.py     # Modernized GPT-2 (RoPE + SwiGLU + RMSNorm)
│   ├── reflow.py       # reFlow base architecture
│   ├── reflow-topk.py  # reFlow with ReLU + Top-K hard sparsity
│   └── reflow-lite.py  # reFlow with GQA + reduced MLP
├── config/             # Training / sampling / eval configurations
├── data/
│   ├── openwebtext/    # OpenWebText dataset preparation
│   └── sft-lima/       # LIMA SFT dataset preparation
└── out/                # Checkpoints and experiment reports
```
## Installation
### Prerequisites

- Python 3.10+
- CUDA-compatible GPU (tested on 4× Tesla T4)
### 1. PyTorch (CUDA 12.8)

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
```

Adjust the CUDA version in the URL to match your driver. See PyTorch Get Started.
### 2. Core Dependencies

```bash
pip install datasets tiktoken wandb tqdm
```
### 3. Experiment Suite Dependencies

The interpretability experiments (`experiment.py`) require additional packages:

```bash
pip install numpy matplotlib seaborn scikit-learn scipy adjustText
```
### Quick Install (All-in-One)

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install datasets tiktoken wandb tqdm numpy matplotlib seaborn scikit-learn scipy adjustText
```
## Data Preparation
### OpenWebText

```bash
python data/openwebtext/prepare.py
```

This downloads the OpenWebText corpus (54 GB) and tokenizes it with the GPT-2 BPE tokenizer. Output: `data/openwebtext/train.bin` (17 GB, ~9B tokens) and `val.bin`.
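Assuming the `.bin` files follow nanoGPT's convention of a flat array of uint16 GPT-2 token ids (reFlow is based on nanoGPT; verify against `data/openwebtext/prepare.py`), training batches can be read cheaply through a memory map. A hypothetical loader sketch:

```python
import numpy as np

def get_batch(path, batch_size=4, block_size=8, seed=0):
    """Sample (input, target) batches from a flat uint16 token file.

    Targets are the inputs shifted by one position, as in standard
    next-token language modeling.
    """
    data = np.memmap(path, dtype=np.uint16, mode="r")  # no full load into RAM
    rng = np.random.default_rng(seed)
    ix = rng.integers(0, len(data) - block_size - 1, size=batch_size)
    x = np.stack([data[i : i + block_size].astype(np.int64) for i in ix])
    y = np.stack([data[i + 1 : i + 1 + block_size].astype(np.int64) for i in ix])
    return x, y
```

`np.memmap` keeps the 17 GB file on disk and pages in only the slices touched, which is why the prepared `.bin` format is practical on modest hardware.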
## Training

All configurations are in `config/`. No CLI overrides – all hyperparameters must be set in the config file.
### Single GPU

```bash
python train.py config/train_reflow_1.py
```
### Multi-GPU (DDP)

```bash
torchrun --standalone --nproc_per_node=4 train.py config/train_reflow_1.py
```
### Available Training Configs

| Config | Architecture | Layers | Params | Notes |
|---|---|---|---|---|
| `train_gpt2.py` | GPT-2 | 36 | 505.62M | Standard baseline |
| `train_gpt2_new.py` | GPT-2-New | 36 | 514.01M | + RoPE, SwiGLU, RMSNorm |
| `train_reflow_1.py` | reFlow | 32 | 463.67M | Base reFlow, constant lr |
| `train_reflow_1_big.py` | reFlow | 36 | 515.06M | lr decay, for interpretability |
| `train_reflow_1_topk_big.py` | reFlow-TopK | 36 | 515.06M | + ReLU + Top-64 sparsity |
| `train_reflow_1_lite.py` | reFlow-Lite | 32 | 413.34M | + GQA, reduced MLP |
| `train_reflow_1_small.py` | reFlow | 6 | 46.47M | Small-scale validation |
### Resume Training

Append `_resume` to the config name (e.g., `train_reflow_1_big_resume.py`).
## Text Generation

```bash
python sample.py config/sample_reflow_1.py
```

Edit the config file to change the prompt, temperature, top-k, etc.
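Config files are plain Python in the nanoGPT style. A sampling config might look like the following (variable names are assumed from nanoGPT conventions; check `config/sample_reflow_1.py` for the actual keys reFlow uses):

```python
# Illustrative sampling config; key names follow nanoGPT and may differ in reFlow
out_dir = "out/reflow-1"           # directory containing ckpt.pt
start = "The meaning of life is"   # prompt text
num_samples = 3                    # number of independent completions
max_new_tokens = 200               # length of each completion
temperature = 0.8                  # < 1.0 sharpens, > 1.0 flattens the distribution
top_k = 200                        # sample only from the k most likely tokens
```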
## Interpretability Experiments
The experiment suite runs 12 analyses on a trained reFlow model. Both Chinese and English versions are available:
```bash
python experiment_en.py config/train_reflow_1_big.py  # English
python experiment.py config/train_reflow_1_big.py     # Chinese
```
An interactive menu will appear:
| # | Experiment | Group |
|---|---|---|
| 1 | Recipe Atlas – recipe-space nearest neighbors | A. Signal Identity |
| 2 | Sparsity Profile – activation sparsity analysis | A. Signal Identity |
| 3 | Basis Geometry – singular value & effective rank | A. Signal Identity |
| 4 | Semantic Galaxy – PCA clustering visualization | B. Semantic Properties |
| 5 | Semantic Algebra – vector arithmetic (king − man + woman ≈ queen) | B. Semantic Properties |
| 6 | Typo Resilience – robustness to spelling errors | B. Semantic Properties |
| 7 | Layer Evolution – per-layer probability crystallization | C. Mechanistic Analysis |
| 8 | Signal Flow – signal activation heatmaps across layers | C. Mechanistic Analysis |
| 9 | Causal Ablation – progressive signal knockout curves | C. Mechanistic Analysis |
| 10 | Emotion Surgery – sentiment steering via signal injection | D. Control & Steering |
| 11 | Concept Inception – binary-search concept implantation | D. Control & Steering |
| 12 | Genetic Hijack – global recipe matrix manipulation | D. Control & Steering |
Enter `all` to run all experiments, or specific numbers (e.g., `1 3 5`). Reports are saved to `out/<model>/audit_reports/`.
## Checkpoint Inspection

```bash
python check.py config/train_reflow_1.py out/reflow-1/ckpt.pt
```
## License
MIT License. Based on nanoGPT by Andrej Karpathy.