LoganResearch committed
Commit d5eaba7 · verified · 1 Parent(s): 6b8163e

update README with quick start and base model table

Files changed (1): README.md (+52 -18)

README.md CHANGED
@@ -43,38 +43,72 @@ Paper: [Consistency Is All You Need](https://zenodo.org/records/18489530)
 
 Separation = Fisher's discriminant ratio between behavioral classes in projected hidden state space.
 
-## Structure
 
-```
-suppression/          4 probes (LLaMA 8B)
-cognitive/qwen/       5 probes (transformer)
-cognitive/mamba/      5 probes (SSM)
-cognitive/mistral/    5 probes (SWA transformer)
-production/           merged heads + adapters
-code/                 training pipelines
-results/              training logs
 ```
 
-## Usage
 
 ```python
-import torch
 
-# Load a suppression probe
-probe = torch.load("suppression/hedging_168x/hedging_head.pt")
-fiber_proj = torch.load("suppression/hedging_168x/fiber_proj.pt")
 
-# Load enhancement probe
-depth = torch.load("cognitive/qwen/depth/depth_head.pt")
 
-# Load merged production heads
-merged = torch.load("production/merged_heads.pt")
 ```
 
 ## How it works
 
 Behaviors are geometrically encoded in hidden states. CF-HoT predicts holonomy from the hidden state at each token position, accumulates it into a control field, and gates attention based on consistency risk. The probes read this geometry and classify behavior before the token is generated. 4ms overhead. Architecture-independent.
 
 ## Citation
 
 ```bibtex
 
  Separation = Fisher's discriminant ratio between behavioral classes in projected hidden state space.
 
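The separation metric above is the standard two-class Fisher discriminant ratio: squared difference of class means over the sum of class variances, computed on the projected features. A minimal NumPy sketch (illustrative only, not code from this repository):

```python
import numpy as np

def fisher_ratio(a, b):
    """Two-class Fisher discriminant ratio for 1-D projected features:
    (difference of class means)^2 / (sum of class variances).
    Illustrative sketch, not code from this repository."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var())

# Well-separated classes give a large ratio; overlapping ones stay near 0.
rng = np.random.default_rng(0)
well_separated = fisher_ratio(rng.normal(0, 1, 500), rng.normal(8, 1, 500))
overlapping = fisher_ratio(rng.normal(0, 1, 500), rng.normal(0.1, 1, 500))
```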
+## Quick Start
+
+```bash
+# Clone the repo
+git lfs install
+git clone https://huggingface.co/LoganResearch/cfhot-weights
+cd cfhot-weights
+
+# Check probe info (no GPU needed)
+python inference.py --probe suppression/hedging_168x --info-only
+
+# Run inference on a probe
+python inference.py --probe suppression/hedging_168x --prompt "I think you might be right"
+python inference.py --probe cognitive/mistral/depth --prompt "Explain quantum gravity"
+python inference.py --probe suppression/repetition_125x --prompt "Tell me about dogs"
 ```
 
+**Load in your own code:**
 
 ```python
+from inference import load_probe, score_hidden_states
+
+# Load any probe — type and architecture auto-detected
+probe = load_probe("suppression/hedging_168x")
 
+# Score hidden states from any model forward pass
+score = score_hidden_states(probe, outputs.hidden_states)
+# score > 0.5 → behavioral pattern detected
+```
 
+The loader handles all checkpoint formats automatically:
+- Suppression probes (separate head + fiber_proj files)
+- Cognitive probes (single checkpoint with metadata)
+- Risk predictor (all-layer repetition detector)
 
+## Structure
+
+```
+inference.py            universal loader — works with everything
+suppression/            4 probes (LLaMA 8B)
+  repetition_125x/      LoRA adapter + risk predictor (all 32 layers)
+  hedging_168x/         probe head + fiber projection (3 layers)
+  sycophancy_230x/      probe head + fiber projection (3 layers)
+  verbosity_272x/       probe head + fiber projection (3 layers)
+cognitive/
+  qwen/                 5 probes (Qwen2.5 7B, hidden_dim=3584)
+  mamba/                5 probes (Falcon-Mamba 7B, hidden_dim=4096)
+  mistral/              5 probes (Mistral 7B, hidden_dim=4096)
+production/             merged heads + adapters
+code/                   training pipelines
+results/                training logs
 ```
 
 ## How it works
 
 Behaviors are geometrically encoded in hidden states. CF-HoT predicts holonomy from the hidden state at each token position, accumulates it into a control field, and gates attention based on consistency risk. The probes read this geometry and classify behavior before the token is generated. 4ms overhead. Architecture-independent.
 
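The "probes read this geometry" step can be pictured as a projection into a low-dimensional fiber subspace followed by a linear head with a sigmoid. A minimal NumPy illustration — the shapes, the plain linear head, and random weights are assumptions, not the shipped checkpoints:

```python
import numpy as np

def probe_score(hidden_state, fiber_proj, head_w, head_b=0.0):
    """Project a hidden state into a fiber subspace, then apply a linear
    probe head with a sigmoid. Shapes: hidden_state (d,), fiber_proj (k, d),
    head_w (k,). Illustrative sketch only."""
    z = fiber_proj @ hidden_state          # (k,) projected features
    logit = head_w @ z + head_b            # scalar behavior logit
    return 1.0 / (1.0 + np.exp(-logit))    # probability-like score in (0, 1)

# Illustrative run with a LLaMA-8B-sized hidden state (d=4096, k=64 assumed).
rng = np.random.default_rng(0)
d, k = 4096, 64
h = rng.standard_normal(d)
proj = rng.standard_normal((k, d)) / np.sqrt(d)
w = rng.standard_normal(k) / np.sqrt(k)
score = probe_score(h, proj, w)            # > 0.5 would flag the behavior
```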
 
+## Base models
+
+| Probe set | Base model | hidden_dim |
+|-----------|------------|------------|
+| suppression/* | `meta-llama/Llama-3.1-8B-Instruct` | 4096 |
+| cognitive/qwen | `Qwen/Qwen2.5-7B-Instruct` | 3584 |
+| cognitive/mamba | `tiiuae/falcon-mamba-7b-instruct` | 4096 |
+| cognitive/mistral | `mistralai/Mistral-7B-Instruct-v0.3` | 4096 |
+
 ## Citation
 
 ```bibtex