Gleb Gerasimov

gudleifrr

humdinger-g

AI & ML interests

NLP, interpretability

Organizations

None yet

authored 4 papers 10 days ago

Small Vectors, Big Effects: A Mechanistic Study of RL-Induced Reasoning via Steering Vectors

Paper • 2509.06608 • Published Sep 8, 2025

Train One Sparse Autoencoder Across Multiple Sparsity Budgets to Preserve Interpretability and Accuracy

Paper • 2505.24473 • Published May 30, 2025

Teach Old SAEs New Domain Tricks with Boosting

Paper • 2507.12990 • Published Jul 17, 2025 • 12

Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders

Paper • 2606.12138 • Published 16 days ago • 8

upvoted a paper 10 days ago

Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders

Paper • 2606.12138 • Published 16 days ago • 8

upvoted a paper 16 days ago

Interpreting and Steering a Text-to-Speech Language Model with Sparse Autoencoders

Paper • 2606.10029 • Published 18 days ago • 12

liked a Space 4 months ago

Chasing the Counting Manifold in Open LLMs

📚

Counting manifolds in open LLMs from behavior to SAEs.

upvoted a paper 5 months ago

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published Feb 6 • 75

updated a model 6 months ago

gudleifrr/gpt2_saes

Updated Jan 9

published a model 6 months ago

gudleifrr/gpt2_saes

Updated Jan 9

published a dataset 6 months ago

gudleifrr/gpt2_saes

Updated Jan 9 • 3

updated a dataset 7 months ago

gudleifrr/interpretations

Updated Dec 4, 2025 • 5

published a dataset 7 months ago

gudleifrr/interpretations

Updated Dec 4, 2025 • 5

updated a dataset 10 months ago

gudleifrr/OpenThoughts-114k-full-fix

Viewer • Updated Sep 9, 2025 • 114k • 7

published a dataset 10 months ago

gudleifrr/OpenThoughts-114k-full-fix

Viewer • Updated Sep 9, 2025 • 114k • 7

updated a model 10 months ago

gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_1376

Updated Aug 25, 2025

published 2 models 10 months ago

gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.15.hook_resid_post_16384_batchtopk_64_0.001_9715

Updated Aug 25, 2025

gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_1376

Updated Aug 25, 2025

updated a model 10 months ago

gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_3866

Updated Aug 25, 2025

published a model 10 months ago

gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_3866

Updated Aug 25, 2025
