STRIDE - Training Data Attribution
Trained STRIDE steering operators, together with the nanochat base checkpoints they attribute, for the four pre-training models in the paper. STRIDE traces a model's prediction back to the pre-training examples that shaped it by learning a small activation-steering operator rather than retraining the model.
| depth | params | base ckpt step | operator | LDS (Spearman) |
|---|---|---|---|---|
| d12 | 286M | 1680 | layer 8, rank 16 | 0.156 |
| d16 | 537M | 3584 | layer 10, rank 16 | 0.177 |
| d20 | 897M | 3320 | layer 12, rank 16 | 0.158 |
| d24 | 1.38B | 5568 | layer 15, rank 16 | 0.165 |
base_checkpoints/<tag>/model_<step>.pt # nanochat base checkpoint
base_checkpoints/<tag>/meta_<step>.json # nanochat config
operators/<tag>/operator.pt # trained SteeringOperator state dict
operators/<tag>/subsets.npy # subset membership (K=1000, d=10)
operators/<tag>/meta.json # training config + n_train + operator dims
tokenizer/ # shared nanochat tokenizer
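The layout above can be resolved programmatically. The following is a minimal sketch, not part of the `stride` package: `operator_files` and `checkpoint_files` are hypothetical helper names, and `root` is assumed to be a local copy of this repository. Because the exact formatting of `<step>` in the checkpoint filenames (for example, zero padding) is not stated here, the checkpoint helper globs for matching files instead of building the name.

```python
from pathlib import Path


def operator_files(root, tag):
    """Build the expected operator paths for one model tag (e.g. "d12").

    Hypothetical helper: it only mirrors the layout listed above and does
    not check that the files exist.
    """
    op_dir = Path(root) / "operators" / tag
    return {
        "operator": op_dir / "operator.pt",   # trained SteeringOperator state dict
        "subsets": op_dir / "subsets.npy",    # subset membership
        "meta": op_dir / "meta.json",         # training config and operator dims
    }


def checkpoint_files(root, tag):
    """Find base checkpoint and config files for one tag by globbing.

    Returns two sorted lists (models, metas); both are empty when the
    directory is missing.
    """
    ckpt_dir = Path(root) / "base_checkpoints" / tag
    if not ckpt_dir.is_dir():
        return [], []
    models = sorted(ckpt_dir.glob("model_*.pt"))
    metas = sorted(ckpt_dir.glob("meta_*.json"))
    return models, metas


if __name__ == "__main__":
    # "stride-repo" is a placeholder for wherever the files were downloaded.
    for name, path in operator_files("stride-repo", "d12").items():
        print(name, path)
    print(checkpoint_files("stride-repo", "d12"))
```

The resulting paths can then be handed to `torch.load` and `numpy.load` if you want to inspect the raw artifacts instead of going through `Stride.from_pretrained`.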
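from stride.inference import Stride
# Load the trained operator and base checkpoint for the d12 model.
attr = Stride.from_pretrained("d12")
# my_queries is a placeholder for your own query inputs.
result = attr.attribute(my_queries)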
The LDS ground-truth losses and the held-out test set live in the dataset repo rishitdagli/stride-lds.
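For readers unfamiliar with the LDS column in the table, here is a minimal sketch of the linear datamodeling score as it is commonly defined in the attribution literature: for each training subset, sum the attribution scores of its members to get a predicted output, then take the Spearman rank correlation between those predictions and the ground-truth outputs of models trained on the subsets. Whether the paper's evaluation matches this in every detail is an assumption. The function names are hypothetical, subsets are assumed to be lists of training-example indices (the actual format of `subsets.npy` may differ), and the numbers in the demo are made-up toy values.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lds(scores, subsets, true_outputs):
    """Spearman correlation between summed scores and ground-truth outputs.

    scores:       one attribution score per training example (for one query)
    subsets:      list of index lists, one per training subset (assumed format)
    true_outputs: ground-truth output for the same query, one per subset
    """
    predicted = [sum(scores[i] for i in subset) for subset in subsets]
    return pearson(average_ranks(predicted), average_ranks(true_outputs))


if __name__ == "__main__":
    # Toy data for illustration only.
    toy_scores = [1.0, 2.0, 3.0, 4.0]
    toy_subsets = [[0, 1], [1, 2], [2, 3], [0, 3]]
    toy_truth = [0.3, 0.5, 0.7, 0.5]
    print(lds(toy_scores, toy_subsets, toy_truth))
```

Spearman correlation is a Pearson correlation computed on ranks, so the score only rewards getting the ordering of subsets right, not the scale of the predictions.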
@misc{dagli2026stridetrainingdataattribution,
title={STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations},
author={Rishit Dagli and Abir Harrasse and Luke Zhang and Florent Draye and Amirali Abdullah and Bernhard Schölkopf and Zhijing Jin},
year={2026},
eprint={2606.05165},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2606.05165},
}