# InterpGPT – Standard Model (23M)
Part of the InterpGPT matched-pair release. This is the standard model;
its counterpart is [connaaa/interpgpt-adhd-23M](https://huggingface.co/connaaa/interpgpt-adhd-23M).
Both models share identical architecture and training recipe; only the training
data distribution differs.
| Spec | Value |
|---|---|
| Parameters | 23,471,104 |
| Layers | 6 |
| Heads | 8 |
| d_model | 512 |
| d_head | 64 |
| d_mlp (SwiGLU) | 1408 |
| Vocab | 8192 (custom BPE) |
| Context length | 512 |
| Norm | RMSNorm (ε = 1e-6) |
| Position | RoPE (half-half, base 10,000) |
| Activation | SwiGLU |
| Biases | none |
| Tied input/output embeddings | yes |
| Training | ~25k steps on the task-decomposition corpus |
## What is this model for?
Given a task prompt, the model writes a step-by-step decomposition. The standard variant was trained on normal task decompositions (tasks → subtasks in straightforward order). The ADHD counterpart was trained on decompositions with smaller steps and interleaved micro-regulation actions (e.g. "sip water", "deep breath", "quick stretch").
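For a concrete feel for the difference, here are hypothetical corpus-style samples (illustrative only, not verbatim training data):

```
standard: <|task|>Do laundry<|steps|>Sort clothes<|sep|>Run the washer<|sep|>Dry and fold<|end|>
adhd:     <|task|>Do laundry<|steps|>Sort one basket<|sep|>deep breath<|sep|>Start the washer<|sep|>sip water<|sep|>Fold five items<|end|>
```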
The pair is the subject of a mechanistic-interpretability study. Phase 1 headline findings:
- Structural head-position swap. A step-layout-broadcast head lives at L3H0 in the standard model and at L3H5 in the ADHD model. Cross-model per-position attention-profile cosine similarity is 0.997 at the matched (different-index) pair vs a same-index baseline of 0.66 (see the sketch after this list).
- Block-2 content circuit. P(regulation token) at step-onset positions jumps 17× between layer 1 and layer 2 in the ADHD model (0.014 → 0.251); the standard model never crosses 1% at any layer.
- High-specificity null-steering SAE feature. See the companion SAE repo [connaaa/interpgpt-sae-phase5](https://huggingface.co/connaaa/interpgpt-sae-phase5).
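A minimal sketch of the matched-pair comparison, assuming both models are loaded as TransformerLens `HookedTransformer`s (`model_std`, `model_adhd`; see Loading below) and run on identical token ids. The study's exact metric (e.g. which prompts are averaged over) may differ:

```python
import torch
import torch.nn.functional as F

# `ids`: a [1, seq_len] tensor of token ids for some decomposition prompt.
_, cache_std = model_std.run_with_cache(ids)
_, cache_adhd = model_adhd.run_with_cache(ids)

# Layer-3 attention patterns: [batch, head, query_pos, key_pos].
p_std = cache_std["pattern", 3][0]
p_adhd = cache_adhd["pattern", 3][0]

def head_cosine(a: torch.Tensor, b: torch.Tensor) -> float:
    # Mean cosine similarity between per-query-position attention profiles.
    return F.cosine_similarity(a, b, dim=-1).mean().item()

print("matched pair  L3H0 vs L3H5:", head_cosine(p_std[0], p_adhd[5]))
print("same-index    L3H0 vs L3H0:", head_cosine(p_std[0], p_adhd[0]))
```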
## Input format

```
<|task|>Clean the kitchen<|steps|>Step 1 text<|sep|>Step 2 text<|sep|>...<|end|>
```
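A tiny helper for assembling prompts in this format (a hypothetical convenience, not part of the repo):

```python
def make_prompt(task: str, steps: list[str] | None = None) -> str:
    """Build a prompt in the model's input format. At inference time you
    usually stop after <|steps|> and let the model generate the steps."""
    prompt = f"<|task|>{task}<|steps|>"
    if steps:
        prompt += "<|sep|>".join(steps) + "<|end|>"
    return prompt

prompt = make_prompt("Clean the kitchen")  # "<|task|>Clean the kitchen<|steps|>"
```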
## Loading

### HuggingFace Transformers (custom code)
```python
from transformers import AutoModel, AutoTokenizer

# trust_remote_code is required: the model class ships with the repo.
model = AutoModel.from_pretrained(
    "connaaa/interpgpt-standard-23M", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "connaaa/interpgpt-standard-23M"
)
```
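A quick smoke test. The exact forward signature and output type come from the repo's custom remote code, which may differ from standard `transformers` models:

```python
import torch

ids = tokenizer("<|task|>Clean the kitchen<|steps|>", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids)  # inspect `out`; its fields depend on the custom code
```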
### TransformerLens (recommended for interpretability)

The repo ships a TransformerLens-compatible bundle at `hooked_transformer.pt`:
```python
import torch
from huggingface_hub import hf_hub_download
from transformer_lens import HookedTransformer, HookedTransformerConfig

path = hf_hub_download(
    "connaaa/interpgpt-standard-23M", "hooked_transformer.pt"
)
blob = torch.load(path, map_location="cpu", weights_only=False)

# Keep only keys that HookedTransformerConfig accepts, and drop values that
# were serialized as strings like "torch.float32", which the dataclass
# cannot take directly.
cfg_keep = {
    k: v for k, v in blob["config"].items()
    if k in HookedTransformerConfig.__dataclass_fields__
    and not (isinstance(v, str) and v.startswith("torch."))
}
cfg = HookedTransformerConfig(**cfg_keep)

model = HookedTransformer(cfg)
model.load_state_dict(blob["model_state_dict"])
model.eval()
```
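With the hooked model loaded, the head-swap finding is easy to poke at directly. A small example (it assumes the HF tokenizer's ids match the hooked model's vocabulary, which the bundle is built for):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("connaaa/interpgpt-standard-23M")
ids = torch.tensor([tokenizer.encode("<|task|>Clean the kitchen<|steps|>")])

_, cache = model.run_with_cache(ids)
pattern = cache["pattern", 3][0]  # layer-3 attention: [head, query_pos, key_pos]
print(pattern[0])                 # L3H0, the step-layout-broadcast head here
```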
### Raw PyTorch / original TaskGPT class
```python
# Pairs with gpt_model.py from https://github.com/cwklurks/interpgpt
import torch
from huggingface_hub import hf_hub_download
from gpt_model import GPTConfig, TaskGPT

path = hf_hub_download(
    "connaaa/interpgpt-standard-23M", "pytorch_model.pt"
)
blob = torch.load(path, map_location="cpu", weights_only=False)

model = TaskGPT(GPTConfig(**blob["config"]))
model.load_state_dict(blob["model_state_dict"])
model.eval()
```
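If `TaskGPT` follows the usual minGPT-style interface (an assumption; confirm against `gpt_model.py` in the repo), a forward pass looks like:

```python
ids = torch.tensor([[5, 17, 42]])  # placeholder ids; use the repo's BPE tokenizer
with torch.no_grad():
    logits = model(ids)  # assumed minGPT-style call; check gpt_model.py
```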
## Reproduce the head-swap finding
Open the companion Colab notebook, `notebooks/InterpGPT_HeadSwap.ipynb`, in
[github.com/cwklurks/interpgpt](https://github.com/cwklurks/interpgpt).
An end-to-end run on the Colab free tier reproduces the 0.997 vs 0.66
comparison in under 15 minutes.
## Training data
Custom task-decomposition corpus, two variants (standard vs ADHD) generated
with the same task pool. Detailed dataset notes and generation scripts live in
the main repo (`preprocess.py`, `merge_data.py`, `rebuild_data.py`,
`fix_adhd_data.py`, `shorten_adhd_steps.py`).
## License
MIT.
## Intended use
Interpretability research. The model is intentionally small and domain-specific; it is not intended as a general-purpose chatbot.
## Citation
```bibtex
@misc{interpgpt2026,
  title  = {{InterpGPT}: A matched-pair interpretability study of task-decomposition models},
  author = {Klann, Connor},
  year   = {2026},
  url    = {https://github.com/cwklurks/interpgpt}
}
```