fractal-agi
/

fdra-half-life-regularization


juddddd committed on Jan 22

Commit

def6683

verified

1 Parent(s): ef35b74

Upload BUGFIX_REPORT.md with huggingface_hub

Files changed (1)

BUGFIX_REPORT.md +192 -0

BUGFIX_REPORT.md ADDED

	@@ -0,0 +1,192 @@

+# Bug Fix Report: Half-Life Regularization V3
+**Date:** 2026-01-22
+**Reviewer:** Code Review Session
+## Executive Summary
+A thorough code review identified **5 bugs**, two of them critical, in the half-life regularization implementation. Together they caused the regularizer to produce **worse** results than no regularization at all. This report documents each bug, its root cause, and the fix applied.
+---
+## Bug 1: np.clip Argument Order (CRITICAL)
+### Location
+`identity_reconstruction_experiment_v2.py:177`
+### Issue
+```python
+# WRONG: np.clip(x, max, min) when max > min clips everything to min!
+lambdas = np.clip(lambdas, lambda_for_tau_max, lambda_for_tau_min)
+#                          ^^^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^^
+#                          = 0.9998           = 0.5
+```
+`np.clip(x, a, b)` computes the equivalent of `np.minimum(b, np.maximum(x, a))`, so when it is called with `a > b` every value comes out as `b`.
+### Evidence
+```python
+>>> np.clip(0.7, 0.9998, 0.5)
+0.5  # Everything becomes 0.5 regardless of input!
+```
+### Fix
+```python
+# CORRECT: np.clip requires (min, max) order
+lambdas = np.clip(lambdas, lambda_for_tau_min, lambda_for_tau_max)
+#                          0.5                 0.9998
+```
+### Impact
+This bug caused ALL oscillators to be clipped to λ=0.5 (τ=1), completely defeating the regularization.
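One way to keep this mistake from recurring is to wrap the call in a guard that rejects reversed bounds. The sketch below is illustrative only: `checked_clip` is a hypothetical helper, not a function from the repository, and the sample λ values are made up.

```python
import numpy as np

def checked_clip(values, lower, upper):
    """Clip values to [lower, upper], refusing reversed bounds.

    Hypothetical helper: np.clip itself does not raise when lower > upper.
    """
    if lower > upper:
        raise ValueError(f"lower bound {lower} exceeds upper bound {upper}")
    return np.clip(values, lower, upper)

# Illustrative lambdas: one below, one inside, one above the valid range.
lambdas = np.array([0.3, 0.7, 0.99999])
clipped = checked_clip(lambdas, 0.5, 0.9998)
```

With the guard in place, passing the bounds in the wrong order fails loudly instead of silently flattening every value.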
+---
+## Bug 2: Missing Tau Bounds Constraint (CRITICAL)
+### Location
+`half_life_regularizer.py`
+### Issue
+The moment-matching loss:
+```
+L_HL = α*(μ - μ*)² + β*(σ² - σ²*)²
+```
+This loss can be minimized by a **pathological bimodal distribution**:
+- Push some τ way DOWN (below τ_min=1)
+- Push some τ way UP (above τ_max=4096)
+- This achieves correct mean and variance but violates bounds!
+### Evidence
+After regularization with the buggy code:
+```
+tau distribution:
+  30/32 oscillators: τ < 1    ← WORSE than collapsed!
+  2/32 oscillators:  τ ≈ 6931 ← Extreme outliers
+```
+### Fix
+Added `compute_bounds_loss()`:
+```python
+def compute_bounds_loss(self, lambdas):
+    taus = self.lambdas_to_half_lives(lambdas)
+    k = self.config.k  # penalty scale; assumed here to be the same config.k used by the sigmoid
+    # Penalize tau < tau_min (quadratic hinge, zero inside the bounds)
+    below_min = np.maximum(0, self.config.tau_min - taus)
+    lower_penalty = np.mean((k * below_min) ** 2)
+    # Penalize tau > tau_max
+    above_max = np.maximum(0, taus - self.config.tau_max)
+    upper_penalty = np.mean((k * above_max) ** 2)
+    return lower_penalty + upper_penalty
+```
+### Impact
+Without this constraint, the regularizer actively made the half-life distribution worse.
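The point that matching two moments does not imply respecting the bounds can be checked on a toy example. The sketch below assumes, for illustration only, that the moments are computed directly on τ and that the penalty scale `k` is 1; the two ten-element τ arrays are invented, and only the bounds 1 and 4096 come from the report.

```python
import numpy as np

TAU_MIN, TAU_MAX = 1.0, 4096.0  # bounds quoted in the report

def moment_loss(taus, target_mean, target_var, alpha=1.0, beta=1.0):
    """alpha*(mu - mu*)^2 + beta*(var - var*)^2, computed on tau (assumption)."""
    mean_term = (np.mean(taus) - target_mean) ** 2
    var_term = (np.var(taus) - target_var) ** 2
    return alpha * mean_term + beta * var_term

def bounds_loss(taus, k=1.0):
    """Quadratic hinge penalty on tau outside [TAU_MIN, TAU_MAX]; k is an assumed scale."""
    below = np.maximum(0.0, TAU_MIN - taus)
    above = np.maximum(0.0, taus - TAU_MAX)
    return np.mean((k * below) ** 2) + np.mean((k * above) ** 2)

# Two toy distributions with the same mean (2000) and variance (1e6).
in_bounds = np.array([1000.0] * 5 + [3000.0] * 5)
out_of_bounds = np.array([5000.0 / 3.0] * 9 + [5000.0])  # one tau above TAU_MAX

target_mean, target_var = np.mean(in_bounds), np.var(in_bounds)
moment_in = moment_loss(in_bounds, target_mean, target_var)
moment_out = moment_loss(out_of_bounds, target_mean, target_var)
penalty_in = bounds_loss(in_bounds)
penalty_out = bounds_loss(out_of_bounds)
```

Both arrays drive the moment loss to (numerically) zero, but only the bounds penalty distinguishes the one that contains an out-of-range half-life.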
+---
+## Bug 3: Sigmoid Overflow
+### Location
+`half_life_regularizer.py:192`
+### Issue
+```python
+s = 1.0 / (1.0 + np.exp(-self.config.k * (taus - self.tau_threshold)))
+```
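+`np.exp()` overflows float64 once its argument exceeds about 709. The argument here is `-k * (tau - threshold)`, so for positive `k` the overflow occurs when τ lies far below the threshold; for very large τ (e.g., 6931) the exponential instead underflows harmlessly to 0. Extreme τ values in either direction are therefore guarded.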
+### Fix
+```python
+x = self.config.k * (taus - self.tau_threshold)
+x = np.clip(x, -500, 500)  # Prevent overflow
+s = 1.0 / (1.0 + np.exp(-x))
+```
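An alternative to clipping is to evaluate the sigmoid in a form that never exponentiates a positive number. The sketch below is a generic, numerically stable logistic function, not code from the repository; `stable_sigmoid` is a hypothetical name. SciPy's `scipy.special.expit` computes the same function.

```python
import numpy as np

def stable_sigmoid(x):
    """Logistic function that never exponentiates a positive argument."""
    x = np.atleast_1d(np.asarray(x, dtype=np.float64))
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) is at most 1, so it cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, rewrite 1/(1+exp(-x)) as exp(x)/(1+exp(x)); exp(x) is below 1.
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out

with np.errstate(over="raise"):  # turn any overflow into an exception
    s = stable_sigmoid([-1000.0, 0.0, 1000.0])
```

Unlike a hard clip at ±500, this form needs no magic constant, though the clip in the fix above is equally safe for float64.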
+---
+## Bug 4: Learning Rate Too High
+### Location
+`identity_reconstruction_experiment_v2.py`
+### Issue
+- Valid λ range: [0.5, 0.9998] (span ≈ 0.5)
+- Learning rate: 0.3
+- Typical gradient magnitude: ~4
+- Gradient step: 0.3 × 4 = 1.2
+The step size (1.2) was **2.4× the entire valid range** (0.5), causing massive overshoot and instability.
+### Fix
+Changed the learning rate from 0.3 to 0.0001 and increased the step count from 50 to 5000. A typical step is now 0.0001 × 4 = 0.0004, under 0.1% of the valid range.
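The step-size argument can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted in this section (the λ range, the typical gradient magnitude, and the two learning rates); `step_to_range_ratio` is a hypothetical helper name.

```python
LAMBDA_MIN, LAMBDA_MAX = 0.5, 0.9998  # valid lambda range from the report
TYPICAL_GRAD = 4.0                    # typical gradient magnitude from the report

def step_to_range_ratio(lr, grad=TYPICAL_GRAD):
    """Size of one gradient step relative to the width of the valid lambda range."""
    return lr * grad / (LAMBDA_MAX - LAMBDA_MIN)

old_ratio = step_to_range_ratio(0.3)      # a single step overshoots the whole range
new_ratio = step_to_range_ratio(0.0001)   # a step covers a tiny fraction of the range
```

A ratio above 1 means a single update can jump past both bounds; keeping it far below 1 is the condition the lower learning rate restores.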
+---
+## Bug 5: Mean-Only Regularizer Convergence
+### Location
+`identity_reconstruction_experiment_v2.py`
+### Issue
+With β=0 (no variance term), the gradient for each oscillator is:
+```
+∂L/∂λ_i = ... × (μ - μ*) / n
+```
+Every gradient shares the factor (μ - μ*) / n, the distance from the target mean, so oscillators that start from the same collapsed λ receive **identical updates**. They all move to the **same τ value** instead of spreading across [τ_min, τ_max].
+### Evidence
+After regularization:
+```
+tau range: [302.8, 302.8]  # All identical!
+tau mean: 302.8
+```
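The collapse can be reproduced on a toy problem. The sketch below makes several simplifying assumptions that are not taken from the repository: it runs gradient descent directly on τ rather than on λ, uses a hypothetical target mean of 300, and picks a learning rate that makes the toy problem converge quickly.

```python
import numpy as np

def mean_only_descent(taus, target_mean, lr=8.0, steps=100):
    """Gradient descent on L = (mean(tau) - target_mean)^2, i.e. beta = 0."""
    taus = np.array(taus, dtype=np.float64)
    n = taus.size
    for _ in range(steps):
        # dL/dtau_i = 2 * (mu - mu*) / n: the same scalar for every oscillator.
        grad = 2.0 * (taus.mean() - target_mean) / n
        taus -= lr * grad
    return taus

# Collapsed start: every oscillator has the same half-life.
collapsed = mean_only_descent(np.full(32, 5.0), target_mean=300.0)
# Spread start: the mean term shifts all values together, leaving the spread as it was.
spread = mean_only_descent(np.geomspace(1.0, 64.0, 32), target_mean=300.0)
```

In this toy setting the mean term only translates the distribution: a collapsed bank stays collapsed at the target mean, and nothing in the loss creates diversity.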
+### Fix
+Instead of trying to "fix" collapsed lambdas via gradient descent, use the oscillator bank's built-in log-uniform initialization:
+```python
+def create_regularized_snapshot(self, ...):
+    bank = FDRAOscillatorBank(self.osc_config)
+    # Uses log-uniform initialization by default
+    return ParameterSnapshot.from_oscillator_bank(bank)
+```
+This represents the counterfactual: "what if the regularizer had prevented collapse from the start?"
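For readers without access to `FDRAOscillatorBank`, a log-uniform initialization can be sketched as follows. This is an assumption about what such an initializer might look like, not the bank's actual code: it uses deterministic log-spaced half-lives (a sampled variant would draw log τ uniformly instead) and the relation τ = ln(0.5) / ln(λ), which is consistent with the λ and τ pairs quoted in this report.

```python
import numpy as np

def log_uniform_lambdas(n, tau_min=1.0, tau_max=4096.0):
    """Decay factors whose half-lives are evenly spaced in log(tau).

    Assumes tau = ln(0.5) / ln(lambda), so lambda = 0.5 ** (1 / tau).
    """
    taus = np.geomspace(tau_min, tau_max, n)
    return 0.5 ** (1.0 / taus)

lambdas = log_uniform_lambdas(32)
taus = np.log(0.5) / np.log(lambdas)  # recover the half-lives from lambda
```

Every oscillator starts inside [τ_min, τ_max] by construction, with equal coverage of each octave of half-life.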
+---
+## Verification
+### Before Fixes
+```
+Regularized tau: [0.48, 6931.1]
+23/32 oscillators with τ < 1
+Basin width: 0-256 tokens
+Verdict: FAIL
+```
+### After Fixes
+```
+Regularized tau: [1.0, 4096.0]
+3/32 oscillators with τ > 2048
+Basin width: 1024 tokens
+Verdict: PARTIAL (improved from FAIL)
+```
+---
+## Lessons Learned
+1. **Always check np.clip argument order** - (min, max) not (max, min)
+2. **Moment-matching ≠ distribution matching** - Matching mean/variance can create pathological distributions
+3. **Validate intermediate values** - Log per-oscillator taus, not just summary statistics
+4. **Step size must fit parameter range** - lr × gradient << valid_range
+5. **Gradient descent has limitations** - Sometimes direct initialization beats optimization
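Lesson 3 can be turned into a small check that runs after every regularization pass. The helper below is a hypothetical sketch, assuming the same τ = ln(0.5) / ln(λ) relation implied by the numbers in this report; the sample λ values are invented.

```python
import numpy as np

def tau_report(lambdas, tau_min=1.0, tau_max=4096.0):
    """Summarize per-oscillator half-lives, including out-of-bounds counts."""
    taus = np.log(0.5) / np.log(np.asarray(lambdas, dtype=np.float64))
    return {
        "min": float(taus.min()),
        "max": float(taus.max()),
        "below_min": int(np.sum(taus < tau_min)),
        "above_max": int(np.sum(taus > tau_max)),
    }

# Illustrative lambdas: one half-life below 1, two in range, one above 4096.
report = tau_report([0.25, 0.5, 0.9, 0.9999])
```

Counting violations per oscillator exposes the bimodal failure of Bug 2, which a correct-looking mean and variance would hide.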
+---
+*Report generated 2026-01-22*