- Resa: Transparent Reasoning Models via SAEs
  Paper • 2506.09967 • Published • 21
- Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
  Paper • 2408.05147 • Published • 40
- Train Sparse Autoencoders Efficiently by Utilizing Features Correlation
  Paper • 2505.22255 • Published • 24
- I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
  Paper • 2503.18878 • Published • 119
Collections
Collections including paper arxiv:2506.09967
- s3: You Don't Need That Much Data to Train a Search Agent via RL
  Paper • 2505.14146 • Published • 19
- Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI
  Paper • 2505.19443 • Published • 15
- ARM: Adaptive Reasoning Model
  Paper • 2505.20258 • Published • 45
- Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
  Paper • 2505.19914 • Published • 43

- I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
  Paper • 2503.18878 • Published • 119
- Truth Neurons
  Paper • 2505.12182 • Published • 8
- Resa: Transparent Reasoning Models via SAEs
  Paper • 2506.09967 • Published • 21
- Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls
  Paper • 2510.00184 • Published • 16

- Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training
  Paper • 2502.03460 • Published
- LLM-Pruner: On the Structural Pruning of Large Language Models
  Paper • 2305.11627 • Published • 3
- Pruning as a Domain-specific LLM Extractor
  Paper • 2405.06275 • Published • 1
- KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
  Paper • 2402.11176 • Published • 2

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 14
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 60
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 48