These are the models and datasets used in "White box control: Evaluating Probes in a Research Sabotage Setting" [arxiv link].