arxiv:2605.12770

WriteSAE: Sparse Autoencoders for Recurrent State

Published on May 12 · Submitted by Jack Young on May 14

Abstract

WriteSAE enables sparse autoencoder decomposition and editing of the matrix cache writes in state-space and hybrid recurrent language models, outperforming matched-norm ablation baselines on token-level interventions.

AI-generated summary

We introduce WriteSAE, the first sparse autoencoder that decomposes and edits the matrix cache write of state-space and hybrid recurrent language models, a site residual-stream SAEs cannot reach. Existing SAEs read residual streams, but Gated DeltaNet, Mamba-2, and RWKV-7 write to a d_k × d_v cache through rank-1 updates k_t v_t^T that no single vector atom can replace. WriteSAE factors each decoder atom into the native write shape, exposes a closed form for the per-token logit shift, and trains under matched Frobenius norm so that atoms swap one cache slot at a time. Atom substitution beats matched-norm ablation on 92.4% of n = 4,851 firings at Qwen3.5-0.8B L9 H4; the 87-atom population test holds at 89.8%; the closed form predicts measured effects at R^2 = 0.98; and Mamba-2-370M substitutes at 88.1% over 2,500 firings. Sustained three-position installs at 3× scale lift midrank target-in-continuation from 33.3% to 100% under greedy decoding, the first behavioral install at the matrix-recurrent write site.
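
The write-site decomposition is easy to picture in code. Below is a minimal sketch of an SAE whose decoder atoms are factored into the native d_k × d_v write shape as rank-1 outer products, trained with a Frobenius reconstruction loss. All names, dimensions, and the L1 penalty weight are illustrative assumptions, not the authors' released implementation, and the paper's exact matched-norm scheme may differ from the plain loss shown here.

    import torch
    import torch.nn as nn

    class WriteSAESketch(nn.Module):
        """Sketch: SAE over the rank-1 cache write k_t v_t^T (assumed layout)."""
        def __init__(self, d_k: int, d_v: int, n_atoms: int):
            super().__init__()
            # Encoder reads the flattened d_k x d_v cache write.
            self.encoder = nn.Linear(d_k * d_v, n_atoms)
            # Each decoder atom is factored as u_i b_i^T, so every atom
            # is itself a valid rank-1 cache write in the native shape.
            self.U = nn.Parameter(torch.randn(n_atoms, d_k) / d_k ** 0.5)
            self.B = nn.Parameter(torch.randn(n_atoms, d_v) / d_v ** 0.5)

        def forward(self, k: torch.Tensor, v: torch.Tensor):
            # k: (batch, d_k), v: (batch, d_v); the token's write is k v^T.
            write = torch.einsum("bi,bj->bij", k, v)            # (batch, d_k, d_v)
            acts = torch.relu(self.encoder(write.flatten(1)))   # sparse codes
            # Reconstruction is a sparse sum of rank-1 atoms.
            recon = torch.einsum("bn,ni,nj->bij", acts, self.U, self.B)
            return recon, acts

    def sae_loss(recon, write, acts, l1=1e-3):
        # Frobenius reconstruction plus an L1 sparsity penalty (weight assumed).
        frob = (recon - write).pow(2).flatten(1).sum(-1).mean()
        return frob + l1 * acts.abs().sum(-1).mean()

Because the reconstruction is a sparse sum of rank-1 atoms, editing one active atom swaps one coherent cache write rather than perturbing an unstructured d_k·d_v vector, which is the property the substitution experiments exploit.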

Community

Paper author · Paper submitter

WriteSAE extends sparse autoencoders to the matrix-recurrent write site by making decoder atoms rank-1 outer products, matching the native k_t v_t^T cache update. The main results are 92.4% atom-substitution win rate over matched-norm ablation at Qwen3.5-0.8B L9 H4, R^2 = 0.98 closed-form logit-shift prediction, 88.1% transfer to Mamba-2-370M, and a sustained install that moves midrank target-in-continuation from 33.3% to 100%.
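
For intuition on why a closed-form logit-shift prediction exists, here is a back-of-envelope version under a deliberately simplified readout: assume the layer reads the state as y = S^T q with no decay, gating, or normalization (which the real models do have), and collapse the output projection and unembedding into a single map W. Substituting the write k v^T with an atom u b^T then shifts the readout by (q·u) b − (q·k) v. This is an illustrative assumption for exposition, not the paper's exact derivation.

    import torch

    def predicted_logit_shift(q, k, v, u, b, W):
        # q, k, u: (d_k,); v, b: (d_v,); W: (vocab, d_v), collapsing the
        # output projection and unembedding into one map (an assumption).
        # The state is linear in each write, so swapping k v^T for u b^T
        # changes what q reads out by (q.u) b - (q.k) v.
        delta_read = q.dot(u) * b - q.dot(k) * v
        return W @ delta_read   # (vocab,) predicted logit shift

The paper reports R^2 = 0.98 between its closed form and measured effects; the sketch above only conveys the underlying reason such a form exists: the state is linear in each write, so a rank-1 swap produces a logit shift that is linear in the atom.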


Get this paper in your agent:

    hf papers read 2605.12770

Don't have the latest CLI? Install it with:

    curl -LsSf https://hf.co/cli/install.sh | bash
