---
license: mit
library_name: transformers
tags:
- interpretability
- mechanistic-interpretability
- task-decomposition
- small-language-model
- transformer-lens
pipeline_tag: text-generation
---

# InterpGPT: ADHD Model (23M)

Part of the **InterpGPT** matched-pair release. This is the **ADHD** model;
its counterpart is
[`connaaa/interpgpt-standard-23M`](https://huggingface.co/connaaa/interpgpt-standard-23M).
Both models share an identical architecture and training recipe; only the
training data distribution differs.

**ADHD variant training data**: task decompositions broken into smaller steps
with interleaved micro-regulation actions ("sip water", "deep breath",
"close eyes briefly", "quick stretch", "pause").

| Property | Value |
|---|---|
| Parameters | 23,471,104 |
| Layers | 6 |
| Heads | 8 |
| d_model | 512 |
| d_head | 64 |
| d_mlp (SwiGLU) | 1408 |
| Vocab | 8192 (custom BPE) |
| Context length | 512 |
| Norm | RMSNorm (ε = 1e-6) |
| Position | RoPE (half-half, base 10,000) |
| Activation | SwiGLU |
| Biases | none |
| Tied input/output embeddings | yes |
| Training | ~25k steps on the ADHD-variant task-decomposition corpus |

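The parameter total can be cross-checked against the other rows of the table. The sketch below is a back-of-envelope count, not taken from the model code. It assumes a single final RMSNorm and, hypothetically, 5 embedding rows beyond the 8192 BPE entries (for example, for special tokens); the card states neither. Under those assumptions the sum reproduces the listed value.

```python
# Back-of-envelope parameter count from the table above.
# ASSUMPTIONS (not stated in the card):
#   - 5 extra embedding rows beyond the 8192 BPE entries (hypothetical)
#   - two RMSNorm gain vectors per block plus one final RMSNorm
D_MODEL, N_LAYERS, N_HEADS, D_HEAD, D_MLP = 512, 6, 8, 64, 1408
VOCAB_BPE = 8192
EXTRA_EMBED_ROWS = 5  # hypothetical; chosen because it makes the totals agree


def count_params(extra_rows: int = EXTRA_EMBED_ROWS) -> int:
    """Sum the weights of a bias-free, tied-embedding transformer."""
    embed = (VOCAB_BPE + extra_rows) * D_MODEL  # tied, so counted once
    attn = 4 * D_MODEL * (N_HEADS * D_HEAD)     # W_Q, W_K, W_V, W_O
    mlp = 3 * D_MODEL * D_MLP                   # SwiGLU: gate, up, down
    norms = 2 * D_MODEL                         # per-block RMSNorm gains
    final_norm = D_MODEL
    # RoPE adds no learned parameters.
    return embed + N_LAYERS * (attn + mlp + norms) + final_norm


total = count_params()
print(f"{total:,}")  # 23,471,104
```

With `extra_rows=0` the same formula gives 23,468,544, which is 2,560 (5 × 512) short of the listed total; that gap is the only reason for the extra-rows assumption.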
## Headline findings (Phase 1)

- **Structural head-position swap.** A step-layout-broadcast head lives at
  **L3H0** in the standard model and at **L3H5** in the ADHD model.
  The cross-model cosine similarity between per-position attention profiles
  is **0.997** for the matched pair, against a same-index baseline of about
  **0.66** (0.663 for one pair; 0.643 for another).
  Causal ablation confirms the functional identity: ablating L3H5 in the ADHD
  model lowers the Spearman correlation between task_complexity and
  step_count from 0.83 to 0.78 (median Δ = -0.055 across 5 seeds).
- **Block-2 content circuit.** P(regulation token) at step-onset positions
  jumps 17× between layer 1 and layer 2 (0.014 → 0.251). The standard model
  never crosses 1% at any layer.
- **High-specificity null-steering feature.** An ADHD-L2 SAE feature
  (feat 2504) fires at 59% of ADHD step-onsets vs 0.03% of standard step-onsets
  (~2000× cross-model asymmetry), yet **causal steering on its decoder
  direction produces a Δ within sampling noise under all four intervention
  variants** (inject-std, subtract-adhd, zero-ablate, inject-upstream).
  See the companion SAE repo
  [`connaaa/interpgpt-sae-phase5`](https://huggingface.co/connaaa/interpgpt-sae-phase5).

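The head-swap comparison in the first finding amounts to matching heads across models by the cosine similarity of their per-position attention profiles. The sketch below illustrates that matching step on made-up toy profiles; the function names and all numbers are hypothetical, and it is not the notebook's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_match(ref_profile, candidate_profiles):
    """Return (head_index, cosine) of the candidate closest to ref_profile."""
    scores = [cosine(ref_profile, p) for p in candidate_profiles]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy per-position attention profiles (made-up values, 6 positions each).
std_l3h0 = [0.9, 0.1, 0.8, 0.1, 0.7, 0.1]
adhd_l3_heads = [
    [0.3, 0.3, 0.3, 0.3, 0.3, 0.3],        # head 0: flat
    [0.1, 0.9, 0.1, 0.8, 0.1, 0.7],        # head 1: opposite phase
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],        # head 2: first position only
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],        # head 3: last position only
    [0.5, 0.5, 0.1, 0.1, 0.1, 0.1],        # head 4
    [0.88, 0.12, 0.79, 0.10, 0.72, 0.09],  # head 5: same layout as std_l3h0
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],        # head 6
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],        # head 7
]

head, score = best_match(std_l3h0, adhd_l3_heads)
same_index = cosine(std_l3h0, adhd_l3_heads[0])
print(f"best match: head {head}, cosine {score:.3f}")
print(f"same-index baseline (head 0): {same_index:.3f}")
```

In this toy setup the best match lands at a different head index than the reference, which is the pattern the finding describes for L3H0 and L3H5.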
## Loading

Identical to the standard variant. See
[`connaaa/interpgpt-standard-23M`](https://huggingface.co/connaaa/interpgpt-standard-23M)
for `AutoModel`, TransformerLens, and raw-TaskGPT examples, substituting the
repo id.

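As a minimal sketch of that substitution, assuming the `AutoModel` path goes through `from_pretrained` with `trust_remote_code=True` (the repo ships custom model code), loading the ADHD variant might look like the following. The helper name `load_adhd_model` is hypothetical, and the standard card remains the reference for the exact arguments.

```python
REPO_ID = "connaaa/interpgpt-adhd-23M"


def load_adhd_model(repo_id: str = REPO_ID):
    """Load the model via Transformers (downloads weights on first call).

    trust_remote_code=True executes Python code from the model repo,
    so review that code before enabling it.
    """
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import AutoModel

    return AutoModel.from_pretrained(repo_id, trust_remote_code=True)


# Usage (requires `pip install transformers torch` and network access):
# model = load_adhd_model()
```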
## Input format

```
<|task|>Clean the kitchen<|steps|>Step 1 text<|sep|>Step 2 text<|sep|>...<|end|>
```

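The format above can be built and parsed with a few string helpers. This is a sketch under assumptions: the helper names are hypothetical, the example steps are invented for illustration, and ending a generation prompt right after `<|steps|>` is a guess rather than something the card specifies.

```python
TASK, STEPS, SEP, END = "<|task|>", "<|steps|>", "<|sep|>", "<|end|>"


def format_example(task: str, steps: list[str]) -> str:
    """Render a full task/steps sequence in the card's input format."""
    return f"{TASK}{task}{STEPS}{SEP.join(steps)}{END}"


def format_prompt(task: str) -> str:
    """Render a prompt that stops after <|steps|> (assumed convention)."""
    return f"{TASK}{task}{STEPS}"


def parse_steps(text: str) -> list[str]:
    """Extract the step strings from a formatted or generated sequence."""
    body = text.split(STEPS, 1)[1]  # drop the task prefix
    body = body.split(END, 1)[0]    # drop anything after <|end|>
    return [s.strip() for s in body.split(SEP) if s.strip()]


# Hypothetical steps with one interleaved micro-regulation action.
example = format_example(
    "Clean the kitchen",
    ["Clear the counter", "sip water", "Wipe the counter"],
)
print(example)
print(parse_steps(example))
```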
## Reproduce the head-swap finding

Open the Colab notebook `notebooks/InterpGPT_HeadSwap.ipynb` from the
repository at https://github.com/cwklurks/interpgpt. It runs end to end on the
Colab free tier in under 15 minutes.

## License

MIT.

## Citation

See the standard model card.