feat: Self-tuning engine — Friston precisions, Dirichlet channels, joint settling, structured projection

by theapemachine - opened 9 days ago

base: refs/heads/main

←

from: refs/pr/2

Discussion Files changed

Owner 9 days ago

Self-Tuning Engine: Eliminate Hyperparameter Hell

This PR replaces all hardcoded hyperparameters with values dynamically derived from the system's own prediction-error statistics. Instead of hand-tuning dozens of sensitive parameters, the system auto-tunes based on Friston's precision estimation theory (Millidge et al. 2021) and Dirichlet channel reliability tracking (pymdp).

The core insight: every fixed hyperparameter in a Free Energy system is a violation of the principle it claims to implement. Precision (inverse variance) should be estimated from the data, not set by hand.

Changes

1. Friston Log-Precision Updates (`ngc.py`)

Before: Precisions were updated via an ad-hoc EMA: ρ ← mom·ρ + (1-mom)/Var[ε]
After: Precisions are updated via gradient descent on VFE in log-space (Millidge et al. 2021, Eq 20-22):

F_γ = 0.5 · (precision · mean(ε²) − 1)    // gradient of VFE w.r.t. log-precision
γ ← γ − lr · F_γ                           // γ = log(precision)

At fixed point: precision = 1/Var[ε] — the system discovers its own noise level.

Additionally, learning rates are now precision-scaled: η_eff = lr · modulation · precision_l. This IS the natural gradient preconditioning from Friston's theory — more precise layers learn faster because their errors are more trustworthy.

2. Adaptive Settling (`ngc.py`)

Before: Fixed settle_steps=20 regardless of whether the system has converged.
After: The system monitors |E_t - E_{t-1}| and stops settling when energy change drops below settle_convergence_threshold, bounded by [settle_min_steps, settle_max_steps]. The system determines its own convergence, not a magic number.

3. Structure-Preserving Projection (`unified_field.py`)

Before: Random matrix _proj = randn(obs_dim, fhrr_dim) that destroys semantic structure.
After: Block-averaging projection that preserves FHRR phasor structure. Each obs dimension = mean of a contiguous block of FHRR real components. Similar FHRR vectors → similar obs vectors. Smoke test: memory similarity jumps from ~0 (random projection) to 0.97 (structured projection) for identical inputs.

4. Joint Settling / Closed Energy Loop (`unified_field.py`)

Before: Sequential pipeline: settle NGC → query Hopfield → done.
After: Closed loop: settle NGC → query Hopfield → blend retrieved memory into top NGC layer → re-settle NGC. This makes E_total = E_perception + E_memory genuinely joint — Hopfield evidence feeds back into NGC settling within the same observation cycle.

The blend weight is derived from memory_similarity itself: blend = sigmoid(3·sim), capped at 0.5. High similarity = strong blend (memory confirms), low = weak blend. No hardcoded constant.

5. Dirichlet Channel Reliability (`canonical.py`)

Before: Fixed weights: falsify=0.3, llm=1.0, memory=0.75, sbert=0.8, energy=0.1
After: Weights are Dirichlet pseudo-counts that auto-tune:

Cross-channel agreement: Each channel earns pseudo-counts when its top pick matches other channels' picks (consensus-based, not self-fulfilling).
Gold-label update: After feedback, channels that ranked the gold answer highest get bonus counts (VFE-minimizing Dirichlet update from pymdp).
Fusion weights: w_m = α_m / Σα — normalized Dirichlet expected values.

The initial pseudo-counts (from the constructor args) serve as Bayesian priors. SBERT starts higher because prior benchmarks proved it's reliable. But the system can override this if the data says otherwise.

6. Adaptive Convergence Criterion (`canonical.py`)

Before: Fixed commit_ratio=2.0 for convergence.
After: commit_ratio = 1.5 + entropy · 1.5, derived from belief entropy:

Uniform beliefs (high entropy=1.0) → ratio=3.0 (cautious, need strong evidence)
Concentrated beliefs (low entropy≈0) → ratio=1.5 (already confident, commit quickly)

What Was NOT Changed

The overall architecture: NGC → Hopfield → Causal Arena → Bayesian posterior
The benchmark harness and task adapters
The emission pathway (logit grafting) — those fixes are in PR #1
The Broca controller and negation fix — those are in PR #1

What Remains

Causal arena: Still rebuilds per-item SCMs with uniform CPTs. Next step: persistent SCMs that accumulate experience.
SBERT-only ablation: Needed to honestly measure the cognitive layer's contribution above SBERT baseline.

Theoretical Grounding

Component	Theory	Reference
Log-precision update	VFE gradient w.r.t. γ	Millidge et al. 2021, arXiv:2107.12979, Eq 20-22
Precision-scaled learning	Natural gradient preconditioning	Friston 2008, Eq 67
Adaptive settling	Energy convergence monitoring	Standard in variational inference
Dirichlet channel weights	VFE-minimizing posterior	pymdp, Heins et al. 2022
Joint settling	Iterative message passing	Belief Propagation on factor graphs
Structured projection	Locality-preserving hash	Johnson-Lindenstrauss (structured variant)

feat: self-tuning unified_field.pyd9cf00b3

feat: self-tuning ngc.pye6f258cb

feat: self-tuning canonical.pybb45ec3d

theapemachine changed pull request status to open 9 days ago

theapemachine changed pull request status to merged 9 days ago

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

feat: Self-tuning engine — Friston precisions, Dirichlet channels, joint settling, structured projection

Self-Tuning Engine: Eliminate Hyperparameter Hell

Changes

1. Friston Log-Precision Updates (ngc.py)

2. Adaptive Settling (ngc.py)

3. Structure-Preserving Projection (unified_field.py)

4. Joint Settling / Closed Energy Loop (unified_field.py)

5. Dirichlet Channel Reliability (canonical.py)

6. Adaptive Convergence Criterion (canonical.py)