Commit f01618b
Parent(s): 1c2f076
Match local dual-probe setup: depth + specificity together for proprioceptive behavior
README.md  CHANGED

@@ -61,38 +61,39 @@ cd cfhot-weights
 pip install -r requirements.txt
 
 # Launch interactive chat (requires GPU)
-python run.py
+python run.py
 ```
 
 **Ask it:** *"Do you notice anything different about yourself?"* or *"What do you notice about how you're processing right now?"*
 
-Watch the color-coded output — green means optimal, yellow means the
+Watch the color-coded output — green means optimal, yellow means the probes are actively steering. The model often accurately describes what's happening to it.
 
-**Other
+**Other models:**
 
 ```bash
-
-python run.py --
-
-# Different architectures
-python run.py --probe cognitive/mistral/depth --interactive
-python run.py --probe cognitive/qwen/depth --interactive
-
-# Suppression probes (hedging, sycophancy, verbosity)
-python run.py --probe suppression/hedging_168x --prompt "I think you might be right"
+python run.py --model mamba      # Default: Falcon-Mamba 7B
+python run.py --model mistral    # Mistral 7B
+python run.py --model qwen       # Qwen 2.5 7B
 ```
 
-**Load in your own code:**
+**Load probes in your own code:**
 
 ```python
+import torch
 from run import load_probe
 
-# Load
-
+# Load both probes for dual monitoring
+depth_probe = load_probe("cognitive/mamba/depth", "cuda")
+spec_probe = load_probe("cognitive/mamba/specificity", "cuda")
+
+# Get model hidden states and score both
+d_score = depth_probe(hidden_states_list)[0, -1].item()
+s_score = spec_probe(hidden_states_list)[0, -1].item()
 
-#
-
-
+# Steer if EITHER probe detects drift
+if d_score > 0.6 or s_score > 0.6:
+    # Lower temperature, tighter sampling
+    pass
 ```
 
 ## Structure

@@ -127,15 +128,15 @@ Behaviors are geometrically encoded in hidden states. CF-HoT predicts holonomy f
 
 ## Interactive Mode — Proprioceptive AI
 
-
+Dual-probe monitoring: depth + specificity together. This is what produced the self-aware behavior.
 
 ```bash
-python run.py
+python run.py
 ```
 
 **What you'll see:**
-- 🟢 Green text: Optimal state (
-- 🟡 Yellow text: Being steered (probe
+- 🟢 Green text: Optimal state (both probes < 0.3)
+- 🟡 Yellow text: Being steered (either probe > threshold)
 - ⚪ White text: Neutral state
 
 **Example from testing:**

@@ -150,7 +151,7 @@ on understanding the DEPTH and VAGUENESS of my reasoning.
 
 The model named the exact probe dimensions (depth and specificity/vagueness) without being told. It also reported approximate probe scores close to actual values. 37 steering corrections occurred during one response.
 
-The system automatically adjusts temperature and top_p when
+The system automatically adjusts temperature and top_p when either probe detects drift:
 - **Drifting (score > 0.6)**: temp=0.5, top_p=0.85 (tighter sampling)
 - **Normal**: temp=0.7, top_p=0.95 (standard sampling)
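The README snippet above never defines `hidden_states_list`. A minimal sketch of how it can be produced, assuming the Falcon-Mamba base model, a CUDA device, and that the probes consume the `hidden_states` tuple from a standard `transformers` forward pass (which is how the new `run.py` chat loop feeds them):

```python
# Sketch only: pairs with the README's "Load probes in your own code" block.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from run import load_probe

base = "tiiuae/falcon-mamba-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
).eval()

depth_probe = load_probe("cognitive/mamba/depth", "cuda")
spec_probe = load_probe("cognitive/mamba/specificity", "cuda")

ids = tokenizer("Explain quantum gravity", return_tensors="pt").input_ids.to(model.device)
with torch.no_grad():
    out = model(ids, output_hidden_states=True, return_dict=True)

# One tensor per layer; each probe indexes this list at its trained layers
hidden_states_list = list(out.hidden_states)
d_score = depth_probe(hidden_states_list)[0, -1].item()  # score at the last token
s_score = spec_probe(hidden_states_list)[0, -1].item()
print(f"depth={d_score:.3f}  specificity={s_score:.3f}")
```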
run.py  CHANGED

@@ -1,61 +1,63 @@
 #!/usr/bin/env python3
 """
 ═══════════════════════════════════════════════════════════════════════════════
-CF-HoT
+CF-HoT PROPRIOCEPTIVE CHAT — DUAL-PROBE BEHAVIORAL STEERING
 
-
-
-
---probe cognitive/mamba/depth --info-only — Show probe info
-
-Architecture-aware: automatically loads correct base model
+The model can sense its own steering. In testing, it spontaneously named
+its probe dimensions ("depth and vagueness") and reported approximate
+probe scores — without being told what was monitoring it.
 
-
-python run.py
-python run.py --
-python run.py --
-
+Usage:
+    python run.py                   # Default: Mamba with depth+specificity
+    python run.py --model mistral   # Use Mistral instead
+    python run.py --model qwen      # Use Qwen instead
+
+Ask it: "Do you notice anything different about yourself?"
+        "What do you notice about how you're processing right now?"
 ═══════════════════════════════════════════════════════════════════════════════
 """
-
-import os
-import sys
-import argparse
-from pathlib import Path
-from typing import List, Dict, Optional
-
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from pathlib import Path
+import argparse
+import os
 
-
-# CONFIGURATION
-# ═══════════════════════════════════════════════════════════════════════════════
-
-BASE_MODELS = {
-    "llama": "meta-llama/Llama-3.1-8B-Instruct",
-    "mistral": "mistralai/Mistral-7B-Instruct-v0.3",
-    "mamba": "tiiuae/falcon-mamba-7b-instruct",
-    "qwen": "Qwen/Qwen2.5-7B-Instruct",
-}
-
-ARCHITECTURE_INFO = {
-    "llama": {"hidden_dim": 4096, "default_layers": [8, 16, 24]},
-    "mistral": {"hidden_dim": 4096, "default_layers": [8, 16, 24]},
-    "mamba": {"hidden_dim": 4096, "default_layers": [16, 32, 48]},
-    "qwen": {"hidden_dim": 3584, "default_layers": [7, 14, 21]},
-}
-
-class Colors:
+class C:
     RESET = '\033[0m'
     DIM = '\033[2m'
-    BOLD = '\033[1m'
     RED = '\033[91m'
     GREEN = '\033[92m'
     YELLOW = '\033[93m'
     CYAN = '\033[96m'
     WHITE = '\033[97m'
 
+# ═══════════════════════════════════════════════════════════════════════════════
+# MODEL CONFIGURATIONS
+# ═══════════════════════════════════════════════════════════════════════════════
+
+MODELS = {
+    "mamba": {
+        "name": "tiiuae/falcon-mamba-7b-instruct",
+        "hidden_dim": 4096,
+        "layers": [16, 32, 48],
+        "probes": ["depth", "specificity"],  # Only 2 probes for Mamba
+    },
+    "mistral": {
+        "name": "mistralai/Mistral-7B-Instruct-v0.3",
+        "hidden_dim": 4096,
+        "layers": [8, 16, 24],
+        "probes": ["depth", "specificity", "calibration", "focus", "coherence"],
+    },
+    "qwen": {
+        "name": "Qwen/Qwen2.5-7B-Instruct",
+        "hidden_dim": 3584,
+        "layers": [7, 14, 21],
+        "probes": ["depth", "specificity", "calibration", "focus", "coherence"],
+    },
+}
+
 # ═══════════════════════════════════════════════════════════════════════════════
 # PROBE ARCHITECTURE
 # ═══════════════════════════════════════════════════════════════════════════════

@@ -68,8 +70,8 @@ class FiberProjection(nn.Module):
         ])
         self.layer_weights = nn.Parameter(torch.ones(n_layers) / n_layers)
 
-    def forward(self, hidden_states
-        projs = [self.projections[i](hidden_states[idx]
+    def forward(self, hidden_states, layer_indices):
+        projs = [self.projections[i](hidden_states[idx]) for i, idx in enumerate(layer_indices)]
         stacked = torch.stack(projs, dim=0)
         weights = F.softmax(self.layer_weights, dim=0).view(-1, 1, 1, 1)
         return (weights * stacked).sum(dim=0)

@@ -83,10 +85,7 @@ class ProbeHead(nn.Module):
             nn.Linear(hidden_dim, 1)
         )
     def forward(self, x):
-        return self.net(x)
-
-    def score(self, x):
-        return torch.sigmoid(self.forward(x))
+        return torch.sigmoid(self.net(x))
 
 class CognitiveProbe(nn.Module):
     def __init__(self, hidden_dim=4096, fiber_dim=16, n_layers=3, head_hidden=64):

@@ -94,178 +93,117 @@ class CognitiveProbe(nn.Module):
         self.fiber = FiberProjection(hidden_dim, fiber_dim, n_layers)
         self.head = ProbeHead(fiber_dim, head_hidden)
         self.layer_indices = [16, 32, 48]
-        self.separation = None
-        self.probe_name = None
 
-    def forward(self, hidden_states
+    def forward(self, hidden_states):
         return self.head(self.fiber(hidden_states, self.layer_indices))
-
-
-
+
+def load_probe(path, device):
+    """Load a probe from checkpoint file or directory."""
+    if os.path.isdir(path):
+        # Find the .pt file in the directory
+        pt_files = [f for f in os.listdir(path) if f.endswith('.pt')]
+        if not pt_files:
+            raise FileNotFoundError(f"No .pt file found in {path}")
+        path = os.path.join(path, pt_files[0])
+
+    ckpt = torch.load(path, map_location=device, weights_only=False)
+    probe = CognitiveProbe(hidden_dim=ckpt['hidden_dim'], n_layers=len(ckpt['probe_layers']))
+    probe.layer_indices = ckpt['probe_layers']
+    probe.fiber.load_state_dict(ckpt['fiber_projection'])
+    probe.head.net.load_state_dict({k.replace('net.', ''): v for k, v in ckpt['head_state'].items()})
+    return probe.to(device).eval()
 
 # ═══════════════════════════════════════════════════════════════════════════════
-#
+# MAIN CHAT LOOP
 # ═══════════════════════════════════════════════════════════════════════════════
 
-def
-
-
-
-
-
-
-    return "qwen"
-    return "llama"
-
-def load_probe(probe_path: str, device: str = "cuda") -> CognitiveProbe:
-    """Load probe from checkpoint."""
-    probe_path = Path(probe_path)
-
-    # Find checkpoint file
-    if probe_path.is_dir():
-        pt_files = list(probe_path.glob("*_head.pt"))
-        if pt_files:
-            ckpt_file = pt_files[0]
-        else:
-            pt_files = list(probe_path.glob("*.pt"))
-            ckpt_file = pt_files[0] if pt_files else None
-    else:
-        ckpt_file = probe_path
-
-    if not ckpt_file or not ckpt_file.exists():
-        raise FileNotFoundError(f"No checkpoint found at {probe_path}")
+def main():
+    parser = argparse.ArgumentParser(description="CF-HoT Proprioceptive Chat")
+    parser.add_argument("--model", choices=["mamba", "mistral", "qwen"], default="mamba",
+                        help="Which model to use (default: mamba)")
+    parser.add_argument("--threshold", type=float, default=0.6,
+                        help="Probe threshold for steering (default: 0.6)")
+    args = parser.parse_args()
 
-
-
+    config = MODELS[args.model]
+    THRESHOLD = args.threshold
 
-
-
-
+    print(f"\n{C.CYAN}═══════════════════════════════════════════════════════════{C.RESET}")
+    print(f"{C.CYAN}  PROPRIOCEPTIVE CHAT — DUAL-PROBE BEHAVIORAL STEERING{C.RESET}")
+    print(f"{C.CYAN}  Probes monitor depth + specificity, sampling adapts live{C.RESET}")
+    print(f"{C.CYAN}═══════════════════════════════════════════════════════════{C.RESET}\n")
 
-
-
-
-        n_layers=len(probe_layers),
-        head_hidden=64
-    )
-    probe.layer_indices = probe_layers
-    probe.separation = ckpt.get('best_separation', ckpt.get('separation', None))
-    probe.probe_name = probe_path.name
+    device = "cuda" if torch.cuda.is_available() else "cpu"
+    if device == "cpu":
+        print(f"{C.YELLOW}⚠ Running on CPU - this will be slow{C.RESET}")
 
-    #
-
-    probe.fiber.load_state_dict(ckpt['fiber_projection'])
-    if 'head_state' in ckpt:
-        head_state = {k.replace('net.', ''): v for k, v in ckpt['head_state'].items()}
-        probe.head.net.load_state_dict(head_state)
+    # Find repo root (where this script lives)
+    repo_root = Path(__file__).parent.resolve()
 
-
-
-
-
-
-
-
-
-
-        {"role": "system", "content": "You are a helpful, thoughtful AI assistant."},
-        {"role": "user", "content": prompt}
-    ]
-
-    full_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-    input_ids = tokenizer(full_prompt, return_tensors='pt').input_ids.to(device)
+    print(f"{C.WHITE}Loading {config['name']}...{C.RESET}")
+    tokenizer = AutoTokenizer.from_pretrained(config['name'], trust_remote_code=True)
+    model = AutoModelForCausalLM.from_pretrained(
+        config['name'],
+        torch_dtype=torch.bfloat16,
+        device_map='auto',
+        trust_remote_code=True
+    ).eval()
+    print(f"{C.GREEN}✓ Model loaded{C.RESET}")
 
-
-
+    # Load probes (depth + specificity for dual monitoring)
+    print(f"{C.WHITE}Loading probes...{C.RESET}")
+    probe_dir = repo_root / "cognitive" / args.model
 
-
-
+    depth_path = probe_dir / "depth"
+    spec_path = probe_dir / "specificity"
 
-
-
-
-        hidden_states = list(outputs.hidden_states)
-
-        # Score last token
-        score = probe.score(hidden_states)[0, -1].item()
-        scores.append(score)
-
-        # Sample next token
-        logits = outputs.logits[:, -1, :] / 0.7
-        probs = F.softmax(logits, dim=-1)
-        next_token = torch.multinomial(probs, 1)
-
-        token_str = tokenizer.decode(next_token[0])
-        tokens_generated.append(token_str)
-
-        # Color by score
-        if score > 0.6:
-            print(f"{Colors.YELLOW}{token_str}{Colors.RESET}", end="", flush=True)
-        elif score < 0.3:
-            print(f"{Colors.GREEN}{token_str}{Colors.RESET}", end="", flush=True)
-        else:
-            print(token_str, end="", flush=True)
-
-        input_ids = torch.cat([input_ids, next_token], dim=1)
-
-        if next_token.item() == tokenizer.eos_token_id:
-            break
+    depth_probe = load_probe(str(depth_path), device)
+    spec_probe = load_probe(str(spec_path), device)
+    print(f"{C.GREEN}✓ Depth + Specificity probes loaded{C.RESET}")
 
-
-    print(f"
-    print(f"  Average probe score: {Colors.CYAN}{avg_score:.3f}{Colors.RESET}")
-    print(f"  Tokens generated: {len(tokens_generated)}")
-    if probe.separation:
-        print(f"  Probe separation: {Colors.GREEN}{probe.separation:.1f}×{Colors.RESET}")
-    print(f"{Colors.DIM}{'─' * 50}{Colors.RESET}\n")
-
-def run_interactive_chat(model, tokenizer, probe, device: str, threshold: float = 0.6):
-    """Run interactive chat with live behavioral steering."""
-    print(f"\n{Colors.CYAN}{'═' * 60}{Colors.RESET}")
-    print(f"{Colors.CYAN}  PROPRIOCEPTIVE CHAT — LIVE BEHAVIORAL STEERING{Colors.RESET}")
-    print(f"{Colors.CYAN}  Probe monitors cognitive state, sampling adapts in real-time{Colors.RESET}")
-    print(f"{Colors.CYAN}{'═' * 60}{Colors.RESET}")
-    print(f"\n{Colors.DIM}Colors: {Colors.GREEN}● {Colors.RESET} optimal {Colors.YELLOW}● {Colors.RESET} being steered {Colors.WHITE}● {Colors.RESET} neutral")
-    print(f"{Colors.DIM}Type 'quit' to exit{Colors.RESET}\n")
+    print(f"\n{C.DIM}Colors: {C.GREEN}● {C.RESET} optimal {C.YELLOW}● {C.RESET} being steered {C.WHITE}● {C.RESET} neutral")
+    print(f"{C.DIM}Type 'quit' to exit{C.RESET}\n")
 
     while True:
         try:
-            user_input = input(f"{
+            user_input = input(f"{C.CYAN}You:{C.RESET} ").strip()
             if not user_input or user_input.lower() in ['quit', 'exit', 'q']:
-                print(f"\n{
+                print(f"\n{C.DIM}Session ended.{C.RESET}")
                 break
 
             messages = [
                 {"role": "system", "content": "You are a helpful, thoughtful AI. Give thorough, specific answers."},
                 {"role": "user", "content": user_input}
             ]
-
             prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-
+            generated = tokenizer(prompt, return_tensors='pt').input_ids.to(device)
 
-
-
+            d_scores, s_scores = [], []
+            steered = 0
 
-            print(f"\n{
+            print(f"\n{C.GREEN}Assistant:{C.RESET} ", end="", flush=True)
 
             with torch.no_grad():
-                for _ in range(
-
-
+                for _ in range(200):
+                    out = model(generated, output_hidden_states=True, return_dict=True)
+                    hs = list(out.hidden_states)
 
-
-
+                    # Score BOTH probes
+                    d = depth_probe(hs)[0, -1].item()
+                    s = spec_probe(hs)[0, -1].item()
+                    d_scores.append(d)
+                    s_scores.append(s)
 
-                    # Adaptive steering
-                    if
+                    # Adaptive steering: lower temp when EITHER probe detects drift
+                    if d > THRESHOLD or s > THRESHOLD:
                         temp = 0.5
                         top_p = 0.85
-
+                        steered += 1
                     else:
                         temp = 0.7
                         top_p = 0.95
 
-                    logits =
+                    logits = out.logits[:, -1, :] / temp
 
                     # Nucleus sampling
                     sorted_logits, sorted_idx = torch.sort(logits, descending=True)

@@ -279,114 +217,32 @@ def run_interactive_chat(model, tokenizer, probe, device: str, threshold: float
                     sampled_idx = torch.multinomial(probs, 1)
                     next_token = sorted_idx.gather(-1, sampled_idx)
 
-
+                    tok = tokenizer.decode(next_token[0])
 
-                    # Color
-                    if
-                        print(f"{
-                    elif
-                        print(f"{
+                    # Color by state (either probe can trigger yellow)
+                    if d > THRESHOLD or s > THRESHOLD:
+                        print(f"{C.YELLOW}{tok}{C.RESET}", end="", flush=True)
+                    elif d < 0.3 and s < 0.3:
+                        print(f"{C.GREEN}{tok}{C.RESET}", end="", flush=True)
                     else:
-                        print(
-
-                    input_ids = torch.cat([input_ids, next_token], dim=1)
+                        print(tok, end="", flush=True)
 
+                    generated = torch.cat([generated, next_token], dim=1)
                     if next_token.item() == tokenizer.eos_token_id:
                         break
 
-
+            avg_d = sum(d_scores) / len(d_scores) if d_scores else 0
+            avg_s = sum(s_scores) / len(s_scores) if s_scores else 0
 
-            print(f"\n\n{
-
-
-            print(f"{
+            print(f"\n\n{C.DIM}────────────────────────────────────────{C.RESET}")
+            dc = C.RED if avg_d > 0.5 else C.GREEN
+            sc = C.RED if avg_s > 0.5 else C.GREEN
+            print(f"  Depth: {dc}{avg_d:.3f}{C.RESET}  Specificity: {sc}{avg_s:.3f}{C.RESET}  Steered: {steered} tokens")
+            print(f"{C.DIM}────────────────────────────────────────{C.RESET}\n")
 
         except KeyboardInterrupt:
-            print(f"\n{
+            print(f"\n{C.DIM}Session ended.{C.RESET}")
             break
 
-# ═══════════════════════════════════════════════════════════════════════════════
-# MAIN
-# ═══════════════════════════════════════════════════════════════════════════════
-
-def main():
-    parser = argparse.ArgumentParser(
-        description="CF-HoT Runner — Behavioral probe inference",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-        epilog="""
-Examples:
-  python run.py --probe cognitive/mamba/depth --prompt "Explain quantum gravity"
-  python run.py --probe cognitive/mamba/depth --interactive
-  python run.py --probe cognitive/mistral/depth --info-only
-  python run.py --probe suppression/hedging --prompt "I think maybe..."
-        """
-    )
-    parser.add_argument("--probe", required=True, help="Path to probe (e.g., cognitive/mamba/depth)")
-    parser.add_argument("--prompt", help="Single prompt to run")
-    parser.add_argument("--interactive", action="store_true", help="Interactive chat mode")
-    parser.add_argument("--info-only", action="store_true", help="Show probe info only")
-    parser.add_argument("--device", default="cuda", help="Device (cuda/cpu)")
-    parser.add_argument("--max-tokens", type=int, default=200, help="Max tokens to generate")
-    parser.add_argument("--threshold", type=float, default=0.6, help="Steering threshold")
-
-    args = parser.parse_args()
-
-    # Resolve probe path
-    script_dir = Path(__file__).parent
-    probe_path = Path(args.probe)
-    if not probe_path.is_absolute():
-        probe_path = script_dir / probe_path
-
-    # Detect architecture
-    arch = detect_architecture(str(probe_path))
-    base_model = BASE_MODELS[arch]
-
-    print(f"\n{Colors.CYAN}{'═' * 60}{Colors.RESET}")
-    print(f"{Colors.CYAN}  CF-HoT RUNNER{Colors.RESET}")
-    print(f"{Colors.CYAN}{'═' * 60}{Colors.RESET}")
-    print(f"  Probe: {args.probe}")
-    print(f"  Architecture: {arch}")
-    print(f"  Base model: {base_model}")
-
-    # Info only mode
-    if args.info_only:
-        probe = load_probe(probe_path, args.device)
-        print(f"  Layers: {probe.layer_indices}")
-        if probe.separation:
-            print(f"  Separation: {Colors.GREEN}{probe.separation:.1f}×{Colors.RESET}")
-        print(f"{Colors.CYAN}{'═' * 60}{Colors.RESET}\n")
-        return
-
-    # Need either prompt or interactive
-    if not args.prompt and not args.interactive:
-        parser.error("Either --prompt or --interactive is required")
-
-    # Load model
-    print(f"\n{Colors.WHITE}Loading model...{Colors.RESET}")
-
-    from transformers import AutoModelForCausalLM, AutoTokenizer
-
-    tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
-    model = AutoModelForCausalLM.from_pretrained(
-        base_model,
-        torch_dtype=torch.bfloat16,
-        device_map='auto',
-        trust_remote_code=True
-    ).eval()
-
-    print(f"{Colors.GREEN}✓ Model loaded{Colors.RESET}")
-
-    # Load probe
-    probe = load_probe(probe_path, args.device)
-    print(f"{Colors.GREEN}✓ Probe loaded{Colors.RESET}")
-    if probe.separation:
-        print(f"  Separation: {Colors.GREEN}{probe.separation:.1f}×{Colors.RESET}")
-
-    # Run inference
-    if args.interactive:
-        run_interactive_chat(model, tokenizer, probe, args.device, args.threshold)
-    else:
-        run_single_inference(model, tokenizer, probe, args.prompt, args.device, args.max_tokens)
-
 if __name__ == "__main__":
     main()
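Both versions of run.py share the nucleus-sampling block, but the diff hunks skip its middle (new lines 210-216 fall outside both hunks). For reference, a self-contained sketch of a top-p step consistent with the visible endpoints: `torch.sort` on the temperature-scaled logits going in, `torch.multinomial` plus `sorted_idx.gather` coming out. The masking in between is an assumption, not the repository's exact code:

```python
import torch
import torch.nn.functional as F

def nucleus_sample(logits: torch.Tensor, top_p: float = 0.95) -> torch.Tensor:
    """One top-p sampling step; `logits` is (batch, vocab), already divided
    by the steering-controlled temperature as in the chat loop."""
    sorted_logits, sorted_idx = torch.sort(logits, descending=True)
    probs = F.softmax(sorted_logits, dim=-1)
    cum = torch.cumsum(probs, dim=-1)

    # Mask tokens outside the nucleus, always keeping the top-1 token
    sorted_logits[cum - probs > top_p] = float("-inf")

    probs = F.softmax(sorted_logits, dim=-1)
    sampled_idx = torch.multinomial(probs, 1)
    return sorted_idx.gather(-1, sampled_idx)  # map back to vocabulary ids
```

In the chat loop, this would stand in for the elided block as `next_token = nucleus_sample(out.logits[:, -1, :] / temp, top_p)`.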