Geometric Stability: The Missing Axis of Representations
Abstract
Geometric stability measures representational robustness under perturbation, offering complementary insights to similarity metrics in analyzing learned representations across diverse domains.
Analysis of learned representations has a blind spot: it focuses on similarity, measuring how closely embeddings align with external references, but similarity reveals only what is represented, not whether that structure is robust. We introduce geometric stability, a distinct dimension that quantifies how reliably representational geometry holds under perturbation, and present Shesha, a framework for measuring it. Across 2,463 configurations in seven domains, we show that stability and similarity are empirically uncorrelated (ρ ≈ 0.01) and mechanistically distinct: similarity metrics collapse after removing the top principal components, while stability retains sensitivity to fine-grained manifold structure. This distinction yields actionable insights: for safety monitoring, stability acts as a functional geometric canary, detecting structural drift nearly 2× more sensitively than CKA while filtering out the non-functional noise that triggers false alarms in rigid distance metrics; for controllability, supervised stability predicts linear steerability (ρ = 0.89-0.96); for model selection, stability dissociates from transferability, revealing a geometric tax that transfer optimization incurs. Beyond machine learning, stability predicts CRISPR perturbation coherence and neural-behavioral coupling. By quantifying how reliably systems maintain structure, geometric stability provides a necessary complement to similarity for auditing representations across biological and computational systems.
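To make "geometry that holds under perturbation" concrete, here is a minimal sketch of one way such a quantity could be probed: split the feature dimensions into two random halves, compute pairwise distances among the same items in each half, and rank-correlate the two distance profiles. This is an illustrative assumption, not Shesha's definition or API. The names `split_half_stability` and `make_embeddings` are hypothetical, the data is synthetic, and only the Python standard library is used.

```python
import math
import random


def pairwise_distances(rows):
    """Euclidean distances for every unordered pair of rows (condensed form)."""
    n = len(rows)
    return [math.dist(rows[i], rows[j]) for i in range(n) for j in range(i + 1, n)]


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(va * vb)


def split_half_stability(embeddings, n_splits=20, seed=0):
    """Hypothetical stability proxy: mean rank agreement between the
    distance structure seen by two disjoint random halves of the features."""
    rng = random.Random(seed)
    dims = list(range(len(embeddings[0])))
    half = len(dims) // 2
    scores = []
    for _ in range(n_splits):
        rng.shuffle(dims)
        part_a = [[row[d] for d in dims[:half]] for row in embeddings]
        part_b = [[row[d] for d in dims[half:]] for row in embeddings]
        scores.append(spearman(pairwise_distances(part_a), pairwise_distances(part_b)))
    return sum(scores) / n_splits


def make_embeddings(n_items=30, n_dims=32, latent_dims=3, noise=0.1, seed=0):
    """Synthetic data: a low-dimensional latent spread redundantly over all features."""
    rng = random.Random(seed)
    proj = [[rng.gauss(0, 1) for _ in range(n_dims)] for _ in range(latent_dims)]
    data = []
    for _ in range(n_items):
        z = [rng.gauss(0, 1) for _ in range(latent_dims)]
        data.append([sum(z[k] * proj[k][d] for k in range(latent_dims)) + rng.gauss(0, noise)
                     for d in range(n_dims)])
    return data


structured = make_embeddings()
unstructured = make_embeddings(latent_dims=0, noise=1.0)  # pure noise, no shared geometry
print("structured:", round(split_half_stability(structured), 3))
print("unstructured:", round(split_half_stability(unstructured), 3))
```

In this toy setting the structured embeddings should score well above the pure-noise ones, because both feature halves see the same underlying latent geometry. The score needs no external reference, which is what separates this kind of measurement from a similarity metric.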
Community
DeepSeek got it half right with their mHC paper: stability matters for scaling. But they only measure stability DURING training.
What about the stability of what models LEARN?
I built Shesha to measure this: a geometric stability metric with SOTA results across AI safety, Constitutional AI, model selection, and CRISPR perturbation analysis.
The core insight: Most evals check external similarity (does output X match Y?). But imagine a massive library where someone reshuffled all the books. A content-based audit would say that nothing's wrong since the inventory is identical. But the library is useless since nothing can be found. That's the gap Shesha fills.
The implications are broad with SOTA results across 4 domains:
AI Safety - Shesha is the best canary in the coal mine. It outperforms CKA and Procrustes on drift detection: it detects nearly 2x more drift than CKA, triggers earlier 73% of the time, and catches subtle LoRA shifts at 90% sensitivity (5% FPR), with only 7% false alarms vs Procrustes' 44%.
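For readers unfamiliar with the CKA baseline named above, here is a minimal sketch of linear CKA used as a drift score between a reference representation and an updated one. It shows only the similarity-style baseline, not Shesha. The helpers `drift_score` and `perturb` are hypothetical names, and the data is synthetic.

```python
import math
import random


def centered_gram(rows):
    """Linear-kernel Gram matrix X X^T, double-centered."""
    n = len(rows)
    gram = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]
    row_means = [sum(r) / n for r in gram]
    grand_mean = sum(row_means) / n
    # The Gram matrix is symmetric, so column means equal row means.
    return [[gram[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
            for i in range(n)]


def frobenius_inner(a, b):
    """Sum of elementwise products of two equally sized matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def linear_cka(x, y):
    """Linear CKA between two representations of the same n items."""
    gram_x, gram_y = centered_gram(x), centered_gram(y)
    norm = math.sqrt(frobenius_inner(gram_x, gram_x) * frobenius_inner(gram_y, gram_y))
    return frobenius_inner(gram_x, gram_y) / norm


def drift_score(reference, updated):
    """Hypothetical drift measure: 0 means identical up to rotation and scale."""
    return 1.0 - linear_cka(reference, updated)


def perturb(rows, scale, seed):
    """Add independent Gaussian noise of the given scale to every entry."""
    noise_rng = random.Random(seed)
    return [[v + noise_rng.gauss(0, scale) for v in row] for row in rows]


rng = random.Random(0)
reference = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(30)]
rescaled = [[2.0 * v for v in row] for row in reference]

print("rescaled:", round(drift_score(reference, rescaled), 6))
print("small noise:", round(drift_score(reference, perturb(reference, 0.1, 1)), 4))
print("large noise:", round(drift_score(reference, perturb(reference, 2.0, 1)), 4))
```

Linear CKA is invariant to isotropic rescaling, so the rescaled copy registers essentially zero drift, while added noise pushes the score up as its scale grows.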
Constitutional AI - Shesha provides the best steering prediction. Constitutional AI needs models you can actually steer. Most metrics ask: "Are classes separable?" Wrong question! Shesha asks: "Is that separation STABLE under perturbation?" Tested on 35-69 embedding models across 3 experiments. Shesha outperforms Fisher discriminant, silhouette score, Procrustes, and anisotropy. The correlations with intervention success are rho=0.89-0.96, and the partial correlations after controlling for separability are rho=0.67-0.76. Stability ≠ separability.
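The partial correlations quoted above ask whether a metric still predicts steering success once separability is accounted for. Here is a minimal sketch of a first-order partial Spearman correlation, under stated assumptions: the scores are synthetic and not the paper's data, and `partial_spearman` and all variable names are hypothetical.

```python
import math
import random


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(va * vb)


def partial_spearman(x, y, control):
    """Rank correlation of x and y after removing what `control` explains in both."""
    r_xy = spearman(x, y)
    r_xc = spearman(x, control)
    r_yc = spearman(y, control)
    return (r_xy - r_xc * r_yc) / math.sqrt((1 - r_xc ** 2) * (1 - r_yc ** 2))


# Synthetic per-model scores (illustrative only).
rng = random.Random(0)
n_models = 200
separability = [rng.gauss(0, 1) for _ in range(n_models)]
extra_signal = [rng.gauss(0, 1) for _ in range(n_models)]

# A metric that only re-measures separability, plus noise.
redundant_metric = [s + 0.5 * rng.gauss(0, 1) for s in separability]
# A metric that also captures signal separability does not see.
informative_metric = [s + e for s, e in zip(separability, extra_signal)]
# Steering success depends on both separability and the extra signal.
steering_success = [s + e + 0.5 * rng.gauss(0, 1)
                    for s, e in zip(separability, extra_signal)]

for name, metric in [("redundant", redundant_metric), ("informative", informative_metric)]:
    raw = spearman(metric, steering_success)
    partial = partial_spearman(metric, steering_success, separability)
    print(f"{name}: raw={raw:.2f} partial={partial:.2f}")
```

In this toy setup both metrics correlate with steering success, but only the informative one keeps a substantial correlation after controlling for separability. That is the pattern the "Stability ≠ separability" claim rests on.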
Model selection - Shesha exposes what LogME misses. The DINO Paradox - best transfer scores, worst geometric stability. Tested 94 vision models on 6 datasets: DINOv2 ranked #1 in LogME on 4/6 datasets but last or near last in stability on 5/6. SOTA transfer incurs a "geometric tax."
CRISPR perturbations - Shesha serves as a new filter for target selection. CRISPR screens find hits by magnitude, but magnitude alone can't distinguish clean lineage drivers from promiscuous regulators. Shesha adds precision. Tested on 811 perturbations, Shesha showed uniformly positive magnitude-stability correlations ranging from rho=0.746 in high-variance screens to rho=0.963 in cleaner activation settings. Notably, in discordant cases, Shesha separates KLF1 (stable, specific) from CEBPA (strong but messy) purely from geometry.
Additional validation in neuroscience: Geometric stability predicted neural-behavioral coupling (rho=0.18, p=0.005) in Neuropixels data. Centroid drift showed no relationship (rho=0.00). Stability ≠ consistency.
Try it yourself:
PyPI: pip install shesha-geometry
Tutorials: https://github.com/prashantcraju/shesha?tab=readme-ov-file#tutorials
Preprint: https://arxiv.org/abs/2601.09173
Code: https://github.com/prashantcraju/geometric-stability