fix: Critique response - logit fluency, causal pruning, FHRR phase cancellation


Critique Response: Logit graft fluency, causal arena pruning, FHRR phase cancellation

Three fixes from the audio review critique.

1. Logit graft: dynamic temperature scaling replaces hard -inf (logit_bias.py)

Before: Eliminated hypothesis tokens were suppressed to -inf, which "violently collides with the model's syntactic expectations": the LLM can't use structurally necessary tokens (pronouns, conjunctions) even when grammar demands them.

After: Suppressed tokens get -max_bias (default -8.0) instead of -inf. This makes them very unlikely but not impossible. If the LLM's autoregressive prior strongly demands a suppressed token for grammatical coherence, it can still use it: the cognitive constraint shapes the distribution without breaking fluency.

Same fix applied to StaticLogitBiasBuilder for the remote API path.
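
A minimal sketch of the finite-bias scheme under both paths. The helper names (`build_logit_bias`, `apply_bias`) and their signatures are illustrative, not the actual `logit_bias.py` or `StaticLogitBiasBuilder` API:

```python
import torch

def build_logit_bias(suppressed_ids: list[int], max_bias: float = 8.0) -> dict[int, float]:
    # Remote API path: an OpenAI-style logit_bias map. Each eliminated-
    # hypothesis token is pushed down by a finite -max_bias rather than
    # being forbidden outright.
    return {tok: -max_bias for tok in suppressed_ids}

def apply_bias(logits: torch.Tensor, bias: dict[int, float]) -> torch.Tensor:
    # Local path: add the same finite penalties to the raw next-token
    # logits. A token penalized by 8.0 needs roughly 8 nats of extra
    # evidence from the autoregressive prior to be sampled, so it stays
    # rare but reachable when grammar demands it.
    out = logits.clone()
    for tok, b in bias.items():
        out[tok] += b
    return out
```

Compared to -inf, the worst case is that a grammatically forced token appears with a small probability mass, which is exactly the escape hatch the critique asked for.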

2. Causal arena: structural pruning + DAG merging (arena.py)

Before: Every proposed SCM was registered as a separate competing model, even if structurally identical to an existing one. This risked a combinatorial explosion of models that would quickly exhaust the 250,000-counterfactual-world limit.

After:

  • register_model() now checks for structural equivalence (same variables, same edges) before registration. Duplicate DAGs are merged by averaging their Dirichlet pseudocounts instead of adding a separate model.
  • compete() adds an energy-threshold filter: models whose single-step log-likelihood is >20 nats worse than the best are flagged for fast elimination, avoiding wasted counterfactual computation. A sketch of both changes follows this list.
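
A minimal sketch assuming a simple SCM record; the actual arena.py types, method names, and the keying of log-likelihoods by structural signature are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class SCM:
    variables: frozenset[str]
    edges: frozenset[tuple[str, str]]  # directed (parent, child) pairs
    # Dirichlet pseudocounts per variable (hypothetical layout).
    pseudocounts: dict[str, list[float]] = field(default_factory=dict)

    def signature(self) -> tuple:
        # Structural identity: same variable set and same edge set.
        return (self.variables, self.edges)

class CausalArena:
    def __init__(self, energy_gap: float = 20.0):
        self.models: dict[tuple, SCM] = {}
        self.energy_gap = energy_gap  # nats behind the best before fast elimination

    def register_model(self, scm: SCM) -> SCM:
        # Merge structural duplicates instead of registering a new competitor.
        key = scm.signature()
        if key in self.models:
            existing = self.models[key]
            for var, counts in scm.pseudocounts.items():
                old = existing.pseudocounts.get(var, counts)
                existing.pseudocounts[var] = [(a + b) / 2 for a, b in zip(old, counts)]
            return existing
        self.models[key] = scm
        return scm

    def flag_for_elimination(self, loglik: dict[tuple, float]) -> set[tuple]:
        # compete()'s energy filter: models more than energy_gap nats behind
        # the best single-step log-likelihood are flagged so no counterfactual
        # budget is spent on them.
        best = max(loglik.values())
        return {k for k, ll in loglik.items() if best - ll > self.energy_gap}
```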

3. FHRR: sparse block coding + hierarchical temporal bundling (fhrr.py)

Before: bundle() added all complex phasors and normalized. With dense SBERT-grounded embeddings, bundling >20 vectors caused phase wrapping: the "superposition catastrophe," where constituent meanings wash out into noise.

After:

  • bundle() accepts an optional top_k parameter: when set, only the top_k largest-magnitude dimensions are preserved per vector before addition. This induces the sparsity that prevents phase wrapping.
  • encode_sequence() uses hierarchical temporal bundling: tokens are bundled within local windows (default 16), then the window summaries are bundled with sparse top_k. Short sequences still use direct bundling. This preserves high-resolution semantic detail within recent context while summarizing distant tokens. A sketch of both mechanisms follows this list.
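
A minimal sketch using complex-valued NumPy phasors. The names bundle, encode_sequence, top_k, and the window default of 16 come from the change above; the _sparsify helper and the top_k=256 default are hypothetical:

```python
import numpy as np

def _sparsify(v: np.ndarray, top_k: int) -> np.ndarray:
    # Keep only the top_k largest-magnitude dimensions; zero the rest.
    if top_k >= v.size:
        return v
    keep = np.argpartition(np.abs(v), -top_k)[-top_k:]
    out = np.zeros_like(v)
    out[keep] = v[keep]
    return out

def bundle(vectors: list[np.ndarray], top_k: int | None = None) -> np.ndarray:
    # Superpose complex phasors; optional per-vector top_k sparsification
    # limits how many constituents interfere in any single dimension,
    # which is what holds off phase wrapping.
    total = np.zeros_like(vectors[0])
    for v in vectors:
        total += _sparsify(v, top_k) if top_k else v
    # Renormalize nonzero dimensions back onto the unit circle.
    mags = np.abs(total)
    nz = mags > 1e-12
    total[nz] /= mags[nz]
    return total

def encode_sequence(tokens: list[np.ndarray], window: int = 16, top_k: int = 256) -> np.ndarray:
    # Hierarchical temporal bundling: dense bundling inside each local
    # window, then one sparse bundle over the window summaries.
    if len(tokens) <= window:
        return bundle(tokens)  # short sequences: direct dense bundling
    summaries = [bundle(tokens[i:i + window]) for i in range(0, len(tokens), window)]
    return bundle(summaries, top_k=top_k)
```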
