Small Vectors, Big Effects: A Mechanistic Study of RL-Induced Reasoning via Steering Vectors Paper • 2509.06608 • Published Sep 8, 2025
Train One Sparse Autoencoder Across Multiple Sparsity Budgets to Preserve Interpretability and Accuracy Paper • 2505.24473 • Published May 30, 2025
Teach Old SAEs New Domain Tricks with Boosting Paper • 2507.12990 • Published Jul 17, 2025 • 12
Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders Paper • 2606.12138 • Published 16 days ago • 8
Interpreting and Steering a Text-to-Speech Language Model with Sparse Autoencoders Paper • 2606.10029 • Published 18 days ago • 12
Chasing the Counting Manifold in Open LLMs 📚 Space • Running • Featured • 25 • Counting manifolds in open LLMs from behavior to SAEs.
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare Paper • 2602.06717 • Published Feb 6 • 75
gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_1376 Updated Aug 25, 2025
gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.15.hook_resid_post_16384_batchtopk_64_0.001_9715 Updated Aug 25, 2025
gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_3866 Updated Aug 25, 2025