juddddd commited on
Commit
def6683
·
verified ·
1 Parent(s): ef35b74

Upload BUGFIX_REPORT.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. BUGFIX_REPORT.md +192 -0
BUGFIX_REPORT.md ADDED
@@ -0,0 +1,192 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Bug Fix Report: Half-Life Regularization V3
2
+
3
+ **Date:** 2026-01-22
4
+ **Reviewer:** Code Review Session
5
+
6
+ ## Executive Summary
7
+
8
+ A thorough code review identified **5 critical bugs** in the half-life regularization implementation that caused the regularizer to produce **worse** results than no regularization at all. This report documents each bug, its root cause, and the fix applied.
9
+
10
+ ---
11
+
12
+ ## Bug 1: np.clip Argument Order (CRITICAL)
13
+
14
+ ### Location
15
+ `identity_reconstruction_experiment_v2.py:177`
16
+
17
+ ### Issue
18
+ ```python
19
+ # WRONG: np.clip(x, max, min) when max > min clips everything to min!
20
+ lambdas = np.clip(lambdas, lambda_for_tau_max, lambda_for_tau_min)
21
+ # ^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^
22
+ # = 0.9998 = 0.5
23
+ ```
24
+
25
+ When `np.clip(x, a, b)` is called with `a > b`, NumPy clips all values to `b`.
26
+
27
+ ### Evidence
28
+ ```python
29
+ >>> np.clip(0.7, 0.9998, 0.5)
30
+ 0.5 # Everything becomes 0.5 regardless of input!
31
+ ```
32
+
33
+ ### Fix
34
+ ```python
35
+ # CORRECT: np.clip requires (min, max) order
36
+ lambdas = np.clip(lambdas, lambda_for_tau_min, lambda_for_tau_max)
37
+ # 0.5 0.9998
38
+ ```
39
+
40
+ ### Impact
41
+ This bug caused ALL oscillators to be clipped to λ=0.5 (τ=1), completely defeating the regularization.
42
+
43
+ ---
44
+
45
+ ## Bug 2: Missing Tau Bounds Constraint (CRITICAL)
46
+
47
+ ### Location
48
+ `half_life_regularizer.py`
49
+
50
+ ### Issue
51
+ The moment-matching loss:
52
+ ```
53
+ L_HL = α*(μ - μ*)² + β*(σ² - σ²*)²
54
+ ```
55
+
56
+ Can be minimized by a **pathological bimodal distribution**:
57
+ - Push some τ way DOWN (below τ_min=1)
58
+ - Push some τ way UP (above τ_max=4096)
59
+ - This achieves correct mean and variance but violates bounds!
60
+
61
+ ### Evidence
62
+ After regularization with the buggy code:
63
+ ```
64
+ tau distribution:
65
+ 30/32 oscillators: τ < 1 ← WORSE than collapsed!
66
+ 2/32 oscillators: τ ≈ 6931 ← Extreme outliers
67
+ ```
68
+
69
+ ### Fix
70
+ Added `compute_bounds_loss()`:
71
+ ```python
72
+ def compute_bounds_loss(self, lambdas):
73
+ taus = self.lambdas_to_half_lives(lambdas)
74
+
75
+ # Penalize tau < tau_min
76
+ below_min = np.maximum(0, self.config.tau_min - taus)
77
+ lower_penalty = np.mean((k * below_min) ** 2)
78
+
79
+ # Penalize tau > tau_max
80
+ above_max = np.maximum(0, taus - self.config.tau_max)
81
+ upper_penalty = np.mean((k * above_max) ** 2)
82
+
83
+ return lower_penalty + upper_penalty
84
+ ```
85
+
86
+ ### Impact
87
+ Without this constraint, the regularizer actively made the half-life distribution worse.
88
+
89
+ ---
90
+
91
+ ## Bug 3: Sigmoid Overflow
92
+
93
+ ### Location
94
+ `half_life_regularizer.py:192`
95
+
96
+ ### Issue
97
+ ```python
98
+ s = 1.0 / (1.0 + np.exp(-self.config.k * (taus - self.tau_threshold)))
99
+ ```
100
+ When τ is very large (e.g., 6931), `k * (tau - threshold)` can exceed 700, causing `exp()` to overflow.
101
+
102
+ ### Fix
103
+ ```python
104
+ x = self.config.k * (taus - self.tau_threshold)
105
+ x = np.clip(x, -500, 500) # Prevent overflow
106
+ s = 1.0 / (1.0 + np.exp(-x))
107
+ ```
108
+
109
+ ---
110
+
111
+ ## Bug 4: Learning Rate Too High
112
+
113
+ ### Location
114
+ `identity_reconstruction_experiment_v2.py`
115
+
116
+ ### Issue
117
+ - Valid λ range: [0.5, 0.9998] (span = 0.5)
118
+ - Learning rate: 0.3
119
+ - Typical gradient magnitude: ~4
120
+ - Gradient step: 0.3 × 4 = 1.2
121
+
122
+ The step size (1.2) was **2.4× the entire valid range** (0.5), causing massive overshoot and instability.
123
+
124
+ ### Fix
125
+ Changed learning rate from 0.3 to 0.0001, with more steps (5000 instead of 50).
126
+
127
+ ---
128
+
129
+ ## Bug 5: Mean-Only Regularizer Convergence
130
+
131
+ ### Location
132
+ `identity_reconstruction_experiment_v2.py`
133
+
134
+ ### Issue
135
+ With β=0 (no variance term), the gradient for each oscillator is:
136
+ ```
137
+ ∂L/∂λ_i = ... × (μ - μ*) / n
138
+ ```
139
+
140
+ All oscillators receive the **same gradient** (proportional to distance from target mean). They all converge to the **same τ value** instead of spreading across [τ_min, τ_max].
141
+
142
+ ### Evidence
143
+ After regularization:
144
+ ```
145
+ tau range: [302.8, 302.8] # All identical!
146
+ tau mean: 302.8
147
+ ```
148
+
149
+ ### Fix
150
+ Instead of trying to "fix" collapsed lambdas via gradient descent, use the oscillator bank's built-in log-uniform initialization:
151
+ ```python
152
+ def create_regularized_snapshot(self, ...):
153
+ bank = FDRAOscillatorBank(self.osc_config)
154
+ # Uses log-uniform initialization by default
155
+ return ParameterSnapshot.from_oscillator_bank(bank)
156
+ ```
157
+
158
+ This represents the counterfactual: "what if the regularizer had prevented collapse from the start?"
159
+
160
+ ---
161
+
162
+ ## Verification
163
+
164
+ ### Before Fixes
165
+ ```
166
+ Regularized tau: [0.48, 6931.1]
167
+ 23/32 oscillators with τ < 1
168
+ Basin width: 0-256 tokens
169
+ Verdict: FAIL
170
+ ```
171
+
172
+ ### After Fixes
173
+ ```
174
+ Regularized tau: [1.0, 4096.0]
175
+ 3/32 oscillators with τ > 2048
176
+ Basin width: 1024 tokens
177
+ Verdict: PARTIAL (improved from FAIL)
178
+ ```
179
+
180
+ ---
181
+
182
+ ## Lessons Learned
183
+
184
+ 1. **Always check np.clip argument order** - (min, max) not (max, min)
185
+ 2. **Moment-matching ≠ distribution matching** - Matching mean/variance can create pathological distributions
186
+ 3. **Validate intermediate values** - Log per-oscillator taus, not just summary statistics
187
+ 4. **Step size must fit parameter range** - lr × gradient << valid_range
188
+ 5. **Gradient descent has limitations** - Sometimes direct initialization beats optimization
189
+
190
+ ---
191
+
192
+ *Report generated 2026-01-22*