LoganResearch committed on
Commit
3bdb9c3
·
verified ·
1 Parent(s): 7813639

Delete README.md

Files changed (1)
  1. README.md +0 -180
README.md DELETED
@@ -1,180 +0,0 @@
1
- # Lie-Holonomy Transformer (LHT)
2
-
3
- A PyTorch implementation of the gauge-theoretic reasoning architecture from "Beyond Holonomy: Lie-Algebraic Symbol Emergence and the Homotopy Type Structure of Neural Reasoning."
4
-
5
- ## Core Ideas
6
-
7
- This architecture treats **reasoning as geometry**:
8
-
9
- | Concept | Mathematical Structure | Implementation |
10
- |---------|----------------------|----------------|
11
- | Propositions | Manifold M | Embedding space |
12
- | Inference | Parallel transport | Gauge-covariant attention |
13
- | Consistency | Holonomy = Identity | Holonomy loss |
14
- | Symbols | Lie algebra generators | Generator network |
15
- | Proof equivalence | Homotopy | Layer depth |
16
-
17
- ## Architecture Overview
18
-
19
- ```
20
- Input tokens
21
-
22
-
23
- ┌─────────────────────────────────────┐
24
- │ Token Embedding (Proposition M) │
25
- │ + Position Embedding │
26
- │ + Fiber Initialization (gauge) │
27
- └─────────────────────────────────────┘
28
-
29
-
30
- ┌─────────────────────────────────────┐
31
- │ LHT Layer (× n_layers) │
32
- │ ┌─────────────────────────────┐ │
33
- │ │ Connection Network A(x) │ │ ← Learns gauge connection
34
- │ │ Parallel Transport Γ_{j→i} │ │ ← Transports fiber elements
35
- │ │ Gauge-Covariant Attention │ │ ← Modified self-attention
36
- │ │ Lie Algebra Generator │ │ ← Generates inference ops
37
- │ │ Generator Application │ │ ← Applies exp(X) to fiber
38
- │ └─────────────────────────────┘ │
39
- └─────────────────────────────────────┘
40
-
41
-
42
- ┌─────────────────────────────────────┐
43
- │ Output: logits + geometric losses │
44
- └─────────────────────────────────────┘
45
- ```
46
-
47
- ## Key Components
48
-
49
- ### 1. Connection Network
50
- Learns the gauge connection ω = A_μ dx^μ, which defines how inferential states are parallel transported:
51
- ```text
52
- A_μ(x) ∈ gl(k,ℝ) # components of the Lie-algebra-valued 1-form ω
53
- ```
54
-
55
- ### 2. Parallel Transport
56
- Computes transport operators between positions:
57
- ```text
58
- Γ_{j→i} = exp(-A_μ(x_j)(x_i - x_j)^μ)
59
- ```
60
-
61
- ### 3. Gauge-Covariant Attention
62
- Standard attention with parallel transport of values:
63
- ```python
64
- # Standard: Attn(Q,K,V)_i = Σ_j α_ij V_j
65
- # Gauge: GaugeAttn_i = Σ_j α_ij Γ_{j→i}(V_j)
66
- ```
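The gauge-covariant variant can be sketched for a single head as follows. This is an illustration under assumptions, not the repository's implementation: `gauge_attention` is a hypothetical name, and the transport operators are passed in as a precomputed tensor with `gamma[i, j]` standing for Γ_{j→i}.

```python
import torch


def gauge_attention(q, k, v, gamma):
    """Single-head sketch.

    q, k: [n, d_k]; v: [n, f]; gamma: [n, n, f, f] with gamma[i, j] = Γ_{j→i}.
    """
    scores = q @ k.T / q.shape[-1] ** 0.5
    alpha = torch.softmax(scores, dim=-1)                   # attention weights α_ij
    transported = torch.einsum("ijab,jb->ija", gamma, v)    # Γ_{j→i}(V_j)
    return torch.einsum("ij,ija->ia", alpha, transported)   # Σ_j α_ij Γ_{j→i}(V_j)


torch.manual_seed(0)
n, d_k, f = 6, 8, 4
q, k, v = torch.randn(n, d_k), torch.randn(n, d_k), torch.randn(n, f)

# With trivial (identity) transport the result reduces to standard attention.
identity_gamma = torch.eye(f).expand(n, n, f, f)
out = gauge_attention(q, k, v, identity_gamma)
standard = torch.softmax(q @ k.T / d_k ** 0.5, dim=-1) @ v
```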
67
-
68
- ### 4. Holonomy Loss
69
- Enforces reasoning consistency by requiring parallel transport around any closed loop γ to return to the identity:
70
- ```text
71
- L_hol = E[||Hol_γ - I||²_F]
72
- ```
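One way to estimate this loss is to sample triangular loops i → j → k → i and compose the three transport operators. The sketch below assumes a tensor `gamma` with `gamma[i, j]` = Γ_{j→i}; the name `holonomy_loss`, the choice of triangles, and the sample count are assumptions, not details taken from the repository.

```python
import torch


def holonomy_loss(gamma: torch.Tensor, n_loops: int = 64) -> torch.Tensor:
    """Monte Carlo estimate of E[||Hol - I||_F^2] over random triangles i -> j -> k -> i."""
    n, _, f, _ = gamma.shape
    i, j, k = torch.randint(0, n, (3, n_loops))
    # Hol = Γ_{k→i} Γ_{j→k} Γ_{i→j}
    hol = gamma[i, k] @ gamma[k, j] @ gamma[j, i]
    eye = torch.eye(f, dtype=gamma.dtype, device=gamma.device)
    return ((hol - eye) ** 2).sum(dim=(-2, -1)).mean()


torch.manual_seed(0)
n, f = 5, 3

# A pure-gauge (flat) transport Γ_{j→i} = g_i g_j^{-1} has trivial holonomy.
g = torch.linalg.matrix_exp(0.1 * torch.randn(n, f, f, dtype=torch.float64))
flat_gamma = g[:, None] @ torch.linalg.inv(g)[None, :]
flat_loss = holonomy_loss(flat_gamma)

# Unrelated random operators generally do not compose to the identity.
random_gamma = torch.linalg.matrix_exp(0.5 * torch.randn(n, n, f, f, dtype=torch.float64))
random_loss = holonomy_loss(random_gamma)
```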
73
-
74
- ### 5. Curvature Regularization
75
- Encourages flat reasoning spaces, where the order of inference steps does not change the result:
76
- ```text
77
- L_curv = E[||F(x)||²_F] where F = dω + ω∧ω
78
- ```
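In components, F = dω + ω∧ω reads F_μν = ∂_μ A_ν − ∂_ν A_μ + [A_μ, A_ν]. The sketch below evaluates this at a single point with automatic differentiation. The function name `curvature` and the toy connections are assumptions for illustration; the repository may approximate the curvature differently.

```python
import torch
from torch.autograd.functional import jacobian


def curvature(conn, x, create_graph=False):
    """F[mu, nu] = d_mu A_nu - d_nu A_mu + [A_mu, A_nu] at one point x of shape [d].

    conn maps x: [d] -> A: [d, k, k]. Pass create_graph=True to backpropagate through F.
    """
    A = conn(x)
    J = jacobian(conn, x, create_graph=create_graph)   # J[nu, a, b, mu] = d_mu A_nu
    dA = J.permute(3, 0, 1, 2)                         # dA[mu, nu] = d_mu A_nu
    comm = (torch.einsum("mab,nbc->mnac", A, A)
            - torch.einsum("nab,mbc->mnac", A, A))     # [A_mu, A_nu]
    return dA - dA.transpose(0, 1) + comm


torch.manual_seed(0)
d, k = 3, 2
x = torch.randn(d)
M = torch.randn(k, k)


def flat_conn(p):
    # A_mu(p) = p_mu * M: symmetric derivatives and commuting components, so F = 0.
    return p[:, None, None] * M


proj = torch.nn.Linear(d, d * k * k)


def learned_conn(p):
    # Hypothetical learned connection: one linear layer reshaped to [d, k, k].
    return proj(p).view(d, k, k)


F_flat = curvature(flat_conn, x)
F = curvature(learned_conn, x)
curvature_loss = (F ** 2).sum()
```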
79
-
80
- ## Installation
81
-
82
- ```bash
83
- pip install torch
84
- ```
85
-
86
- ## Usage
87
-
88
- ### Basic
89
- ```python
90
- from lht import LieHolonomyTransformer, LHTConfig
91
-
92
- # Create model
93
- config = LHTConfig(
94
-     vocab_size=32000,
95
-     d_model=512,
96
-     d_fiber=64,
97
-     n_heads=8,
98
-     n_layers=6,
99
-     lie_algebra_rank=8,
100
- )
101
- model = LieHolonomyTransformer(config)
102
-
103
- # Forward pass (`tokens` and `labels` are token-ID tensors prepared by the caller)
104
- output = model(
105
-     input_ids=tokens,
106
-     labels=labels,
107
-     return_geometric_losses=True
108
- )
109
-
110
- # Get losses
111
- lm_loss = output['lm_loss']
112
- holonomy_loss = output['holonomy_loss']
113
- curvature_loss = output['curvature_loss']
114
- total_loss = model.get_total_loss(output)
115
- ```
116
-
117
- ### Training with Geometric Loss Annealing
118
- ```python
119
- from lht import LHTTrainer
120
-
121
- trainer = LHTTrainer(model, optimizer, config)
122
-
123
- for batch in dataloader:
124
-     metrics = trainer.train_step(batch)
125
-     # Early training: high curvature loss → flat representations
126
-     # Mid training: high holonomy loss → consistency
127
-     # Late training: high waypoint loss → discrete structure
128
- ```
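The three phases in the comments above could be produced by a schedule like the one below. This is a hypothetical illustration only: the README does not specify the schedule used by `LHTTrainer`, so the equal-thirds split, the 10× emphasis factor, and the name `annealed_weights` are all assumptions.

```python
def annealed_weights(step, total_steps, base=(0.01, 0.1, 0.05), emphasis=10.0):
    """Return (curvature, holonomy, waypoint) loss weights for a training step.

    Hypothetical schedule: each third of training boosts one weight by `emphasis`.
    `base` follows the README defaults for lambda_curvature, lambda_holonomy, lambda_waypoint.
    """
    phase = min(3 * step // max(total_steps, 1), 2)   # 0 = early, 1 = mid, 2 = late
    boost = [1.0, 1.0, 1.0]
    boost[phase] = emphasis
    return tuple(b * m for b, m in zip(base, boost))


early = annealed_weights(0, 300)     # curvature weight boosted
mid = annealed_weights(150, 300)     # holonomy weight boosted
late = annealed_weights(299, 300)    # waypoint weight boosted
```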
129
-
130
- ### Waypoint Detection
131
- ```python
132
- from lht import WaypointDetector
133
-
134
- detector = WaypointDetector(config, n_waypoints=32)
135
- waypoint_ids, stability = detector(representations)
136
- ```
137
-
138
- ## Configuration
139
-
140
- | Parameter | Description | Default |
141
- |-----------|-------------|---------|
142
- | `d_model` | Proposition manifold dimension | 512 |
143
- | `d_fiber` | Fiber (gauge) dimension | 64 |
144
- | `lie_algebra_rank` | k for GL(k,ℝ) structure group | 8 |
145
- | `lambda_holonomy` | Weight for holonomy loss | 0.1 |
146
- | `lambda_curvature` | Weight for curvature loss | 0.01 |
147
- | `lambda_waypoint` | Weight for waypoint stability | 0.05 |
148
-
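As a rough guide to how the three `lambda_*` weights might enter the objective, the sketch below forms a weighted sum of the loss terms. It is an assumption that `model.get_total_loss` works this way, and the `'waypoint_loss'` key is hypothetical; only `'lm_loss'`, `'holonomy_loss'`, and `'curvature_loss'` appear in the usage example.

```python
def weighted_total_loss(losses, lambda_holonomy=0.1, lambda_curvature=0.01,
                        lambda_waypoint=0.05):
    """Weighted sum of the language-modeling loss and the geometric terms.

    Missing terms are treated as zero; defaults mirror the configuration table.
    """
    return (losses["lm_loss"]
            + lambda_holonomy * losses.get("holonomy_loss", 0.0)
            + lambda_curvature * losses.get("curvature_loss", 0.0)
            + lambda_waypoint * losses.get("waypoint_loss", 0.0))


example = {"lm_loss": 2.0, "holonomy_loss": 1.0,
           "curvature_loss": 10.0, "waypoint_loss": 2.0}
total = weighted_total_loss(example)   # 2.0 + 0.1 + 0.1 + 0.1
```

The same function also accepts scalar tensors in place of floats, since it only uses addition and multiplication.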
149
- ## Theoretical Predictions
150
-
151
- The framework makes testable predictions:
152
-
153
- 1. **Chain-of-thought benefit correlates with curvature** - High-curvature domains (causal reasoning) benefit more from CoT than low-curvature domains (arithmetic)
154
-
155
- 2. **Waypoints emerge spontaneously** - Training with holonomy loss should cause discrete symbol-like structures to form at flat loci
156
-
157
- 3. **Holonomy predicts errors** - Incorrect reasoning paths should have higher holonomy magnitude
158
-
159
- 4. **Compositional generalization improves** - Holonomy constraints force consistent composition
160
-
161
- ## File Structure
162
-
163
- ```
164
- lie_holonomy_transformer/
165
- ├── lht.py # Core implementation
166
- ├── train.py # Training script
167
- ├── README.md # This file
168
- └── experiments/ # Benchmark code (TODO)
169
- ```
170
-
171
- ## References
172
-
173
- - "Beyond Holonomy: Lie-Algebraic Symbol Emergence..." (the paper)
174
- - Cohen et al. (2019). Gauge Equivariant Convolutional Networks
175
- - Weiler & Cesa (2019). General E(2)-Equivariant Steerable CNNs
176
- - The Univalent Foundations Program (2013). Homotopy Type Theory
177
-
178
- ## License
179
-
180
- MIT