
AdapterRouter Integration Guide: Memory-Weighted Routing

Overview

This guide shows how to integrate Phase 2's MemoryWeighting into the actual AdapterRouter to enable adaptive adapter selection based on historical performance.

Current State: MemoryWeighting is built and wired into ForgeEngine, but not yet connected to AdapterRouter. This document bridges that gap.


Architecture: Where MemoryWeighting Fits

Query
  ↓
AdapterRouter.route()
  ├─ [Current] Keyword matching → base_result = RouteResult(primary, secondary, confidence)
  └─ [Phase 2] Memory-weighted boost → boosted_confidence = base_confidence * (1 + weight_modifier)
  ↓
ForgeEngine.forge_with_debate(primary=primary_adapter, secondary=secondary_adapters)
  ↓
Agents generate analyses → Conflicts detected → Stored in memory
  ↓
Next Query: Adapters with high historical coherence get up to a +50% confidence boost
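
The boost formula in the diagram can be sketched in isolation. This is a minimal illustration, not the actual `get_boosted_confidence()` implementation: it assumes weights lie in [0, 2.0] with 1.0 as neutral (as described under Known Limitations), that `weight_modifier` is a linear mapping of the weight onto ±`max_boost`, and that the result is clamped to [0, 1]. The function name `boosted_confidence` is hypothetical.

```python
def boosted_confidence(base_confidence: float, weight: float,
                       max_boost: float = 0.5) -> float:
    """Scale a keyword confidence by a memory weight (illustrative only).

    Assumes `weight` is in [0.0, 2.0], where 1.0 means "no history".
    """
    # Map weight [0, 2] linearly onto a modifier in [-max_boost, +max_boost];
    # a neutral weight of 1.0 yields a modifier of 0 (no change).
    weight_modifier = (weight - 1.0) * max_boost
    boosted = base_confidence * (1.0 + weight_modifier)
    # Clamp (an assumption) so callers can keep treating confidence as a [0, 1] score.
    return max(0.0, min(1.0, boosted))
```

Under these assumptions a neutral adapter keeps its keyword confidence, a top-weighted adapter gains at most 50%, and a bottom-weighted adapter loses at most 50%.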

Integration Steps

Step 1: Wire MemoryWeighting into AdapterRouter.__init__()

File: inference/adapter_router.py (lines ~50-80)

Current Code:

class AdapterRouter:
    def __init__(self, adapter_registry):
        self.adapter_registry = adapter_registry
        self.keyword_index = {}
        # ... initialize other components ...

Phase 2 Enhancement:

from reasoning_forge.memory_weighting import MemoryWeighting

class AdapterRouter:
    def __init__(self, adapter_registry, memory_weighting=None):
        self.adapter_registry = adapter_registry
        self.keyword_index = {}
        self.memory_weighting = memory_weighting  # NEW: optional memory weighting
        # ... initialize other components ...

Usage:

# In codette_session.py or app initialization:
from reasoning_forge.living_memory import LivingMemoryKernel
from reasoning_forge.memory_weighting import MemoryWeighting
from inference.adapter_router import AdapterRouter

memory = LivingMemoryKernel(max_memories=100)
weighting = MemoryWeighting(memory)
router = AdapterRouter(adapter_registry, memory_weighting=weighting)

Step 2: Modify AdapterRouter.route() for Memory-Weighted Boost

File: inference/adapter_router.py (lines ~200-250)

Current Code:

def route(self, query: str) -> RouteResult:
    """Route query to appropriate adapters."""
    # Keyword matching
    scores = self._route_keyword(query)
    # ... select best_adapter, top_secondary, and max_score from scores ...

    return RouteResult(
        primary=best_adapter,
        secondary=top_secondary,
        confidence=max_score
    )

Phase 2 Enhancement - SOFT BOOST:

def route(self, query: str, use_memory_boost: bool = True) -> RouteResult:
    """Route query to appropriate adapters with optional memory weighting.

    Args:
        query: User query text
        use_memory_boost: If True, boost confidence based on historical performance

    Returns:
        RouteResult with primary, secondary adapters and confidence
    """
    # Step 1: Keyword-based routing (existing logic)
    base_result = self._route_keyword(query)

    # Step 2: Apply memory-weighted boost (Phase 2)
    if use_memory_boost and self.memory_weighting:
        base_conf = base_result.confidence  # keep the pre-boost value for logging
        boosted_conf = self.memory_weighting.get_boosted_confidence(
            base_result.primary,
            base_conf
        )
        base_result.confidence = boosted_conf

        # Optional: Explain the boost for debugging (requires `import os` at module level)
        if os.environ.get("DEBUG_ADAPTER_ROUTING"):
            explanation = self.memory_weighting.explain_weight(base_result.primary)
            print(f"[ROUTING] {base_result.primary}: "
                  f"base={base_conf:.2f}, "
                  f"boosted={boosted_conf:.2f}, "
                  f"weight={explanation['final_weight']:.2f}")

    return base_result

Advanced Option - STRICT MEMORY-ONLY (optional, higher risk):

def route(self, query: str, strategy: str = "keyword") -> RouteResult:
    """Route query with pluggable strategy.

    Args:
        query: User query text
        strategy: "keyword" (default), "memory_weighted", or "memory_only"

    Returns:
        RouteResult with primary, secondary adapters and confidence
    """
    if strategy == "memory_only" and self.memory_weighting:
        # Pure learning approach: ignore keywords
        weights = self.memory_weighting.compute_weights()
        if weights:
            primary = max(weights.keys(), key=lambda a: weights[a])
            return RouteResult(
                primary=primary,
                secondary=[],  # No secondary adapters in memory-only mode
                confidence=weights[primary] / 2.0  # Normalize [0, 1]
            )
        else:
            # Fallback to keyword if no memory yet
            return self._route_keyword(query)

    elif strategy == "memory_weighted":
        # Soft boost approach: keyword routing + memory confidence boost
        base_result = self._route_keyword(query)
        if self.memory_weighting:
            boosted_conf = self.memory_weighting.get_boosted_confidence(
                base_result.primary,
                base_result.confidence
            )
            base_result.confidence = boosted_conf
        return base_result

    else:  # strategy == "keyword"
        # Pure keyword routing (existing behavior)
        return self._route_keyword(query)
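
In the strategy-based `route()` above, any unrecognized strategy string falls through to keyword routing, so a typo such as "memroy_only" would go unnoticed. One possible guard is to validate the name up front. The sketch below is an assumption, not part of the existing router: `RoutingStrategy` and `parse_strategy` are hypothetical names, and only the three strategy strings come from this guide.

```python
from enum import Enum


class RoutingStrategy(str, Enum):
    """The three strategy names used in this guide."""
    KEYWORD = "keyword"
    MEMORY_WEIGHTED = "memory_weighted"
    MEMORY_ONLY = "memory_only"


def parse_strategy(name: str) -> RoutingStrategy:
    """Convert a strategy string to an enum member, rejecting unknown names."""
    try:
        return RoutingStrategy(name.strip().lower())
    except ValueError:
        valid = ", ".join(s.value for s in RoutingStrategy)
        raise ValueError(
            f"Unknown routing strategy {name!r}; expected one of: {valid}"
        ) from None
```

Because `RoutingStrategy` subclasses `str`, comparisons such as `strategy == "memory_only"` in the existing code would keep working if `route()` called `parse_strategy(strategy)` first.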

Step 3: Pass MemoryWeighting Through Session/App

File: inference/codette_session.py (lines ~50-100)

Current Code:

class CodetteSession:
    def __init__(self):
        self.memory_kernel = LivingMemoryKernel(max_memories=100)
        self.router = AdapterRouter(adapter_registry)
        self.forge = ForgeEngine()

Phase 2 Enhancement:

from reasoning_forge.memory_weighting import MemoryWeighting

class CodetteSession:
    def __init__(self):
        self.memory_kernel = LivingMemoryKernel(max_memories=100)

        # NEW: Initialize memory weighting
        self.memory_weighting = MemoryWeighting(self.memory_kernel)

        # Wire into router
        self.router = AdapterRouter(
            adapter_registry,
            memory_weighting=self.memory_weighting
        )

        # Wire into forge (Phase 2)
        self.forge = ForgeEngine(
            living_memory=self.memory_kernel,
            enable_memory_weighting=True
        )

    def on_submit(self, query: str):
        """Process user query with memory-weighted routing."""
        # Route using memory weights
        route_result = self.router.route(query, use_memory_boost=True)

        # Run forge with memory enabled
        result = self.forge.forge_with_debate(query)

        # Conflicts automatically stored in memory
        response = result["metadata"]["synthesized"]

        return response

Testing the Integration

Unit Test: Memory Weighting + Router

def test_memory_weighted_routing():
    """Test that memory weights modulate router confidence."""
    from reasoning_forge.living_memory import LivingMemoryKernel, MemoryCocoon
    from reasoning_forge.memory_weighting import MemoryWeighting
    from inference.adapter_router import AdapterRouter

    # Setup
    memory = LivingMemoryKernel()

    # Seed memory with Newton performance (high coherence)
    newton_cocoon = MemoryCocoon(
        title="Newton analysis",
        content="Analytical approach",
        adapter_used="newton",
        coherence=0.9,
        emotional_tag="neutral",
    )
    memory.store(newton_cocoon)

    # Create weighting + router
    weighting = MemoryWeighting(memory)
    router = AdapterRouter(adapter_registry, memory_weighting=weighting)

    # Test
    query = "Analyze this algorithm"
    result = router.route(query, use_memory_boost=True)

    # If Newton scored high before, its confidence should be boosted
    assert result.confidence > 0.5  # Baseline
    print(f"✓ Routing test passed: {result.primary} @ {result.confidence:.2f}")
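
The test above depends on the real keyword scores, so its `> 0.5` assertion does not by itself show that the boost was applied. A complementary check, sketched here with stand-ins, compares the same query with and without the boost. Everything in this block is hypothetical: `RouteResult`, `StubWeighting`, and `StubRouter` only mimic the interfaces described in this guide, and the fixed 0.6 keyword confidence is made up.

```python
from dataclasses import dataclass, field


@dataclass
class RouteResult:
    """Hypothetical stand-in for the router's result type."""
    primary: str
    secondary: list = field(default_factory=list)
    confidence: float = 0.0


class StubWeighting:
    """Test double exposing only get_boosted_confidence()."""

    def __init__(self, factors):
        self.factors = factors  # adapter name -> multiplicative factor

    def get_boosted_confidence(self, adapter, base_confidence):
        return min(1.0, base_confidence * self.factors.get(adapter, 1.0))


class StubRouter:
    """Mimics the soft-boost route() from Step 2 with fixed keyword scores."""

    def __init__(self, memory_weighting=None):
        self.memory_weighting = memory_weighting

    def _route_keyword(self, query):
        # Always routes to "newton" with a fixed confidence (test fixture).
        return RouteResult(primary="newton", secondary=["quantum"], confidence=0.6)

    def route(self, query, use_memory_boost=True):
        result = self._route_keyword(query)
        if use_memory_boost and self.memory_weighting is not None:
            result.confidence = self.memory_weighting.get_boosted_confidence(
                result.primary, result.confidence
            )
        return result


def test_boost_raises_confidence_over_keyword_baseline():
    router = StubRouter(StubWeighting({"newton": 1.25}))
    baseline = router.route("Analyze this algorithm", use_memory_boost=False)
    boosted = router.route("Analyze this algorithm", use_memory_boost=True)
    assert boosted.primary == baseline.primary  # the boost never changes the pick
    assert boosted.confidence > baseline.confidence
```

The same with/without comparison could be applied to the real AdapterRouter once seeded memory is available.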

E2E Test: Full Loop

def test_memory_learning_loop():
    """Test that conflicts → memory → weights → better future routing."""
    from reasoning_forge.forge_engine import ForgeEngine
    from reasoning_forge.living_memory import LivingMemoryKernel
    from reasoning_forge.memory_weighting import MemoryWeighting
    from inference.adapter_router import AdapterRouter

    # Run 1: Initial debate (no memory history)
    memory = LivingMemoryKernel()
    forge = ForgeEngine(living_memory=memory, enable_memory_weighting=True)

    result1 = forge.forge_with_debate("Compare speed vs clarity", debate_rounds=1)
    conflicts1 = result1["metadata"]["conflicts_round_0_count"]
    print(f"Run 1: {conflicts1} conflicts detected, stored in memory")

    # Run 2: Same query with memory history
    # Adapters that resolved conflicts should get boosted
    weighting = MemoryWeighting(memory)  # Now has history
    weights = weighting.get_all_weights()

    print(f"\nAdapter weights after learning:")
    for adapter, w_dict in weights.items():
        print(f"  {adapter}: weight={w_dict['weight']:.3f}, coherence={w_dict['coherence']:.3f}")

    # Router should now boost high-performing adapters
    router = AdapterRouter(adapter_registry, memory_weighting=weighting)
    route_result = router.route("Compare speed vs clarity", use_memory_boost=True)
    print(f"\nRouting decision: {route_result.primary} @ {route_result.confidence:.2f}")

    # Run debate again (should use boosted adapters)
    result2 = forge.forge_with_debate("Compare speed vs clarity", debate_rounds=1)
    conflicts2 = result2["metadata"]["conflicts_round_0_count"]

    # Measure improvement
    improvement = (conflicts1 - conflicts2) / max(conflicts1, 1)
    print(f"Run 2: {conflicts2} conflicts (improvement: {improvement:.1%})")

Configuration: Tuning Parameters

Memory Weighting Parameters (in MemoryWeighting):

# Update frequency (hours)
update_interval_hours = 1.0  # Recompute weights every hour

# Weight formula contributions
base_coherence_weight = 0.5    # Contribution from mean coherence
conflict_success_weight = 0.3  # Contribution from conflict resolution
recency_weight = 0.2           # Contribution from recency decay

# Recency decay half-life (hours)
recency_half_life_hours = 168  # 7 days

# Boost modulation
max_boost = 0.5                # ±50% confidence modification
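
To make the coefficients above concrete, here is one way they could combine into a single weight. This is a sketch under stated assumptions, not the actual MemoryWeighting formula: it assumes each input is in [0, 1], that recency decays exponentially with the configured half-life, and that the weighted sum is scaled by 2 to reach the [0, 2.0] range mentioned under Monitoring & Debugging. The function names `recency_score` and `adapter_weight` are hypothetical.

```python
def recency_score(age_hours: float, half_life_hours: float = 168.0) -> float:
    """Exponential decay: 1.0 for a fresh memory, 0.5 after one half-life."""
    return 0.5 ** (max(age_hours, 0.0) / half_life_hours)


def adapter_weight(mean_coherence: float,
                   conflict_success_rate: float,
                   age_hours: float,
                   base_coherence_weight: float = 0.5,
                   conflict_success_weight: float = 0.3,
                   recency_weight: float = 0.2) -> float:
    """Combine the three signals into a weight in [0, 2.0] (assumed scaling)."""
    score = (base_coherence_weight * mean_coherence
             + conflict_success_weight * conflict_success_rate
             + recency_weight * recency_score(age_hours))
    # The coefficients sum to 1.0, so score is in [0, 1]; scale to [0, 2].
    return 2.0 * score
```

With this scaling, raising `recency_half_life_hours` makes old successes count for longer, while raising `conflict_success_weight` (and lowering another coefficient to keep the sum at 1.0) favors adapters that resolve conflicts over those that merely score high coherence.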

Router Integration Options:

# Memory boost enabled/disabled
router.route(query, use_memory_boost=True)   # Default: enabled
router.route(query, use_memory_boost=False)  # Keyword-only

# Strategy selection (advanced)
router.route(query, strategy="keyword")          # Pure keyword
router.route(query, strategy="memory_weighted")  # Soft boost (recommended)
router.route(query, strategy="memory_only")      # Pure learning (risky)

Production Deployment Checklist

  • Wire MemoryWeighting into AdapterRouter.__init__()
  • Modify route() method with use_memory_boost parameter
  • Update CodetteSession to initialize memory_weighting
  • Pass memory_weighting through all routing calls
  • Update app.py/Gradio interface to pass memory context
  • Add unit test for memory-weighted routing
  • Add E2E test for full learning loop
  • Monitor: Log adapter weights after each debate cycle
  • Tune: Adjust weight formula coefficients based on results
  • Document: User-facing explanation of why adapters were selected

Monitoring & Debugging

Enable Debug Logging

import os
import logging

# In app initialization:
if os.environ.get("DEBUG_ADAPTER_ROUTING"):
    logging.basicConfig(level=logging.DEBUG)

    # This will print weight explanations on each route call

Query Adapter Weight History

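from reasoning_forge.memory_weighting import MemoryWeighting

# `memory_weighting` is the MemoryWeighting instance created during session
# setup (e.g., CodetteSession.memory_weighting from Step 3)

# Get snapshot of adapter weights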
weights = memory_weighting.get_all_weights()
for adapter, w_dict in weights.items():
    print(f"{adapter}: weight={w_dict['weight']:.3f}")

# Explain a specific adapter's weight
explanation = memory_weighting.explain_weight("newton")
print(explanation["explanation"])
# Output: "Adapter 'newton' has used 15 times with 0.8 avg coherence,
#          73% conflict resolution rate, and 0.95 recency score.
#          Final weight: 1.45 (range [0, 2.0])"

Memory State

# List stored conflicts (cocoons tagged "tension") with their adapter and coherence
for cocoon in memory.memories:
    if cocoon.emotional_tag == "tension":
        print(f"Conflict: {cocoon.adapter_used}, coherence={cocoon.coherence}")

# Get emotional profile
profile = memory.emotional_profile()
print(f"Memory profile: {profile}")  # {'tension': 25, 'neutral': 10, ...}

Known Limitations & Future Work

  1. Adapter Naming: Memory currently stores agent pairs (e.g., "Newton,Quantum"). Pure adapter routing needs these mapped to actual adapter names.

  2. Cold Start: New adapters have neutral weights (1.0) until they accumulate history (~10-15 uses).

  3. Strict Mode Risk: Memory-only routing (no keywords) can ignore important query context. Test thoroughly before production.

  4. Memory Pruning: Automatic pruning at 100 memories may lose old patterns. Consider keeping high-importance conflicts longer.

  5. Next Phase: Multi-round conflict resolution tracking would enable learning across multiple debate cycles, not just single-round.
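
For the adapter-naming limitation (item 1), one possible mapping is to split each stored pair and normalize the parts. This is only a sketch: it assumes pairs are comma-separated, that adapter names are the lowercase form of agent names (as in the "newton" examples in this guide), and that the caller supplies the set of registered adapter names. The function name `split_agent_pair` is hypothetical.

```python
from typing import Iterable, List


def split_agent_pair(label: str, known_adapters: Iterable[str]) -> List[str]:
    """Map a stored label like "Newton,Quantum" to known adapter names.

    Parts that do not match a known adapter are dropped rather than guessed.
    """
    known = set(known_adapters)
    names = [part.strip().lower() for part in label.split(",")]
    return [name for name in names if name in known]
```

For example, `split_agent_pair("Newton,Quantum", {"newton", "quantum"})` yields `["newton", "quantum"]`, so a conflict record could then be credited to each adapter individually.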


Summary

To Enable Memory-Weighted Routing:

  1. Add memory_weighting parameter to AdapterRouter.__init__()
  2. Modify route() to apply get_boosted_confidence() soft boost
  3. Wire through CodetteSession / app initialization
  4. Test with unit + E2E test suite
  5. Monitor weights and tune formula if needed

Recommended Approach: Soft boost (preserve keyword intelligence) → can migrate to memory-only if results justify it.

Expected Outcome: Better adapter selection over time, converging to adapters that historically resolved more conflicts.