tmax-9b brain atlas activation census, OV-circuits, and capability fence (32 layers)
allenai/tmax-9b Brain Atlas — The Sweet Spot of the Hybrid Family
Cross-post: I ran a brain atlas on the mid-size tmax. Sub-Zero coverage is concentrated in layers 16–30, so read the surgical headroom numbers as a late-layer snapshot.
model: allenai/tmax-9b
atlas type: activation census + Sub-Zero brain atlas + OV-circuit SVD
corpus: 8,965 prompts
layers: 32
attention layers: 3, 7, 11, 15, 19, 23, 27, 31
hybrid layers: everything else
sacred (fully probed) layers: 16–31
datasets: juiceb0xc0de/tmax-9b-atlas
What this is
TMax-9B is the mid-size member of the tmax family. Same 25%-attention cadence as the smaller sizes, but wider. The activation census is complete across all 32 layers, while the Sub-Zero surgery pass is concentrated in layers 16–30.
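The layer layout in the header follows a simple pattern: the listed attention layers are every fourth layer, starting at index 3. A minimal sketch that reproduces those indices, assuming 0-indexed layers (the constant names are illustrative and not taken from the model config):

```python
# Reproduce the attention-layer indices listed in the header.
# Assumption: "every 4th layer, 0-indexed, starting at 3". This matches the
# listed layers but is inferred from them, not read from the model config.
NUM_LAYERS = 32
ATTENTION_EVERY = 4

attention_layers = [
    i for i in range(NUM_LAYERS) if i % ATTENTION_EVERY == ATTENTION_EVERY - 1
]
hybrid_layers = [i for i in range(NUM_LAYERS) if i not in attention_layers]
attention_share = len(attention_layers) / NUM_LAYERS

print(attention_layers)  # [3, 7, 11, 15, 19, 23, 27, 31]
print(attention_share)   # 0.25
```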
What was run
- Activation census over 8,965 prompts.
- Feature taxonomy, per-head analysis, OV-circuit SVD.
- Logit-lens pass.
- Coactivation and code-analysis.
- Sub-Zero surgery + capability fence across code, math, reasoning, factual, multilingual.
The shape of the thing
| Property | Value |
|---|---|
| Layers | 32 |
| Attention layers | 8 |
| KV heads | 4 |
| Head dim | 256 |
| d_model | 4096 implied |
| Sacred region | 16–31 (50% of depth) |
| OV spectral concentration | 0.040 |
| OV effective rank | ~94 |
What the numbers suggest
Attention gets even more distributed
OV spectral concentration is 0.040 with an effective rank of around 94, so the heads spread their output across many directions of the 256-dimensional head space instead of a few dominant ones. This is not memorized copy-paste attention; it is high-dimensional weighted computation.
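For readers who want to compute comparable statistics on their own OV matrices, here is a rough sketch. The definitions are assumptions, since the atlas does not spell them out here: spectral concentration is taken as the share of squared spectral energy in the largest singular value, and effective rank as the exponential of the Shannon entropy of the normalized singular values. Both function names are hypothetical.

```python
import math

def spectral_concentration(singular_values):
    """Assumed definition: fraction of squared energy in the top singular value."""
    energy = [s * s for s in singular_values]
    return max(energy) / sum(energy)

def effective_rank(singular_values):
    """Assumed definition: exp(H(p)) with p_i = s_i / sum(s), skipping zeros."""
    total = sum(singular_values)
    probs = [s / total for s in singular_values if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)

# Toy spectra for illustration only; these are not atlas data.
flat = [1.0] * 256                  # energy spread evenly over 256 directions
peaked = [10.0] + [0.01] * 255      # one dominant direction

print(spectral_concentration(flat), effective_rank(flat))
print(spectral_concentration(peaked), effective_rank(peaked))
```

Under these assumed definitions a perfectly flat 256-value spectrum gives a concentration of 1/256 and an effective rank of 256, while a single dominant direction pushes concentration toward 1 and effective rank toward 1. The singular values themselves would come from an SVD of each head's OV matrix, for example via `numpy.linalg.svd`.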
Feature taxonomy is broad, not hyperspecific
| Class | Share |
|---|---|
| partial_shared | 34.9% |
| non_activated | 24.6% |
| broadly_shared | 26.4% |
| all_shared | 14.1% |
| specific_* | <0.03% |
The 9B is using its extra capacity to make more directions partially responsive across many prompts, not to build narrow specialists. The specific_* tail is still tiny: 312 rows out of 1,720,320.
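As a quick sanity check on the shares quoted above, the arithmetic can be reproduced directly. The variable names are illustrative; the numbers are the ones given in the text and table.

```python
# Counts quoted in the text above.
specific_rows = 312
total_rows = 1_720_320

specific_share_pct = 100 * specific_rows / total_rows
print(f"{specific_share_pct:.4f}%")  # 0.0181%, consistent with "<0.03%"

# Shares of the four main classes, in percent, as listed in the table.
main_classes = {
    "partial_shared": 34.9,
    "non_activated": 24.6,
    "broadly_shared": 26.4,
    "all_shared": 14.1,
}
main_total = round(sum(main_classes.values()), 1)
print(main_total)  # 100.0, so the specific_* tail is lost in rounding
```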
Late gate features dominate the logit lens
Top logit-lens peaks are all late gate features:
- layer 31, gate 9995: F=721.7
- layer 23, gate 5067: F=678.9
- layer 28, gate 4316: F=674.4
These are the strongest output-vocabulary predictors in the atlas. The hybrid MLP/SSM gates are doing serious output-vocabulary routing in the late half of the model.
Sacred region is half the network
Layers 16–31 are flagged sacred and get full Sub-Zero treatment. That is 50% of depth, which is deeper proportionally than most dense transformers of similar size. The hybrid architecture seems to need a bigger late-layer "structured transformation" region because each layer is mixing attention and SSM computation.
Surgical headroom looks good in the covered region
| Metric | Value |
|---|---|
| Tested axes | 16 (80 domain rows) |
| Fence pass rate | 93.8% |
| Worst axis damage | 0.43 (layer 26 gate) |
| Avg damage | 0.031 |
The worst failure in the covered region is a late gate direction, not an early projection. Classifier accuracy in the covered layers is 0.967–1.000, average 0.981.
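The table figures fit together as 16 axes times 5 domains for 80 rows, and a 93.8% pass rate matches 15 of 16 axes passing. The sketch below shows one way a fence check like this could be scored. It is an assumption-heavy illustration: the pass rule (no domain above a threshold), the threshold of 0.1, and the function names are all hypothetical, and the zero damage values are placeholders, not measurements.

```python
DOMAINS = ["code", "math", "reasoning", "factual", "multilingual"]

def fence_passes(damage_by_domain, threshold):
    """Hypothetical rule: an axis passes if no domain exceeds the threshold."""
    return max(damage_by_domain.values()) <= threshold

def fence_pass_rate(axes, threshold):
    """Fraction of axes that pass the fence."""
    passed = sum(fence_passes(damage, threshold) for damage in axes.values())
    return passed / len(axes)

# Toy input: 16 axes x 5 domains = 80 rows. Only the 0.432 multilingual value
# comes from the write-up; every 0.0 is a placeholder, not a measured result.
axes = {f"axis_{i}": {d: 0.0 for d in DOMAINS} for i in range(16)}
axes["axis_0"]["multilingual"] = 0.432

row_count = sum(len(damage) for damage in axes.values())
rate = fence_pass_rate(axes, threshold=0.1)  # threshold is an assumption
print(row_count)  # 80
print(rate)       # 0.9375, which rounds to the 93.8% in the table
```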
Multilingual is the most fragile capability in the snapshot
The worst Sub-Zero damage is layer 26 gate_proj axis 0, failing the fence with 0.432 damage to multilingual. A lot of cross-lingual capability is packed into a single late gate direction.
Compliance/behavior directions stay partly entangled
Peak compliance-behaviour SV fraction is around 1.2%. Style/behavior directions remain partially entangled with capability directions at this scale.
What is novel vs a dense transformer
- Half the depth is sacred. Dense models usually reserve the deepest 25–35% for the heavy subspaces; this one needs 50%.
- gate drives the logit lens, not attention. The strongest output-token predictors are hybrid MLP gates, not late attention heads.
- Distributed attention + structured SSM/MLP. Attention gets more distributed with scale while the MLP/SSM side carries the concentrated transformations.
- Late multilingual fragility. One gate_proj direction at layer 26 is critical for multilingual output.
The stuff I deliberately skipped
Same caveat as the rest of the family: hybrid SSM layers that do not tokenize language were not measured because the numbers would be noise. I am working on a way to capture whatever those hybrid layers are actually doing, but it is not in this atlas yet.
Caveats
- Sub-Zero SV table is intentionally sparse; only meaningful singular values are kept.
- Sub-Zero coverage is concentrated in layers 16–30. Treat surgical headroom numbers as a late-layer snapshot, not a full-model average.
- No Qwen3.5-9B base comparison here.
Bottom line
TMax-9B has a complete activation census, the strongest logit-lens signal in the covered layers, a deep sacred region, and good surgical headroom in the late-layer snapshot. If you want to understand how hybrid SSM-Mamba-transformers behave at practical scale, the covered half of this model is a clean place to start.