Faithfulness Metrics Don't Measure Faithfulness: A Meta-Evaluation with Ground Truth Paper • 2605.25052 • Published 3 days ago • 7
From Directions to Regions: Decomposing Activations in Language Models via Local Geometry Paper • 2602.02464 • Published Feb 2 • 3
Mixing Mechanisms: How Language Models Retrieve Bound Entities In-Context Paper • 2510.06182 • Published Oct 7, 2025 • 9
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations Paper • 2509.03405 • Published Sep 3, 2025 • 24
🔍 Interpretability & Analysis of LMs Collection Outstanding research in LM interpretability and evaluation, summarized • 136 items • Updated about 14 hours ago • 119