---
library_name: engram
tags:
- kv-cache
- fingerprinting
- fourier
- retrieval
- hnsw
- session-memory
- cross-model
- inference
- mcp
- llm-memory
license: apache-2.0
language:
- en
pipeline_tag: feature-extraction
---
# ENGRAM: KV Cache Fingerprinting Protocol
**You Don't Need Adapters: Cross-Model Document Retrieval via Intrinsic KV Cache Geometry**
ENGRAM extracts Fourier fingerprints from LLM KV caches, stores them as compact binary certificates (`.eng` files, ~800 bytes), and retrieves them via HNSW approximate nearest neighbor search. This enables **persistent cross-session memory** for large language models with zero training.
> *By ENIGMA*
## Key Results
| Metric | Value |
|---|---|
| Recall@1 (N=200) | **100.0%** (post Stage-4) |
| Raw Fourier recall | **98.0%** (f0+f1 DFT) |
| HNSW search latency | **51.8 µs** |
| HNSW speedup | **5.7x** vs brute-force |
| Cross-model transfer | **+0.124 margin** (FCDB, no adapter) |
| CKA isomorphism | **0.975** within-family, **0.927** cross-family |
| Certificate size | **~800 bytes** per document |
| Architectures | llama, gemma, gemma4/ISWA, phi, qwen, mistral |
| Tests | **220 passing** |
## How It Works
```
KV cache blob --> layer key extraction --> DFT(f0+f1) --> fingerprint (~800 bytes)
                                                               |
                                                               v
Query fingerprint --> HNSW search --> geodesic retrieval --> matched session/document
```
### The Fourier Fingerprint
ENGRAM decomposes per-layer key trajectories using a 2-component DFT:
- **f0** (DC component): captures the mean activation level per layer
- **f1** (first harmonic): captures the dominant oscillation pattern
The resulting fingerprint is a compact, deterministic signature of the KV cache state that is:
- **Model-intrinsic**: derived from the model's own geometry, not learned embeddings
- **Cross-model transferable**: via Frechet Cross-Domain Bridge (FCDB)
- **Compression-robust**: 0.99998 cosine similarity after INT8 quantization
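A minimal NumPy sketch of the f0+f1 decomposition (the shapes and the `fourier_fingerprint` name are illustrative, not the library's actual API), including the kind of INT8 round-trip check behind the compression-robustness claim:

```python
import numpy as np

def fourier_fingerprint(keys: np.ndarray) -> np.ndarray:
    """Toy f0+f1 fingerprint of per-layer key trajectories.

    keys: shape (n_layers, n_tokens, head_dim).
    """
    spectrum = np.fft.fft(keys, axis=1)            # DFT along the token axis
    f0 = spectrum[:, 0, :].real / keys.shape[1]    # DC bin == per-layer/dim mean
    f1 = np.abs(spectrum[:, 1, :])                 # first-harmonic magnitude
    return np.concatenate([f0.ravel(), f1.ravel()]).astype(np.float32)

rng = np.random.default_rng(0)
keys = rng.normal(size=(16, 128, 8))               # 16 layers, 128 tokens, dim 8
fp = fourier_fingerprint(keys)
print(fp.shape)                                    # (256,): 16 layers * 8 dims * 2

# INT8 round trip barely perturbs the fingerprint (cosine stays near 1).
scale = np.abs(fp).max() / 127
dq = np.round(fp / scale).astype(np.int8).astype(np.float32) * scale
cos = float(dq @ fp / (np.linalg.norm(dq) * np.linalg.norm(fp)))
print(cos > 0.999)                                 # True
```

Keeping only two DFT bins is what makes the certificate both compact and deterministic: the same cache always yields the same bytes.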
### 4-Stage Geodesic Retrieval
```
Stage 0: Prior preemption (IndexC chronic failure -> skip HNSW)
Stage 1: HNSW search -> HIGH / MEDIUM confidence
Stage 2: Trajectory correction -> MEDIUM (interpolation w=0.3)
Stage 3: Negative constraints -> LOW (apophatic layer)
Stage 4: Metadata disambig -> LOW + stage4_used=True
```
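The cascade above can be sketched as a simple fallback chain. Everything below is illustrative (the thresholds, helper names, and a brute-force stand-in for HNSW are assumptions, and Stage 0's chronic-failure preemption is omitted for brevity):

```python
import numpy as np

HIGH, MEDIUM = 0.90, 0.75          # illustrative confidence thresholds

def search(db, q):
    """Brute-force stand-in for the HNSW index: returns (doc_id, score)."""
    scores = {doc: float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
              for doc, v in db.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

def retrieve(db, q, negatives=frozenset(), metadata_pick=None):
    doc, score = search(db, q)                    # Stage 1: ANN search
    if score >= HIGH:
        return doc, "stage1"
    corrected = 0.7 * q + 0.3 * db[doc]           # Stage 2: interpolation, w=0.3
    doc, score = search(db, corrected)
    if score >= MEDIUM and doc not in negatives:  # Stage 3: negative constraints
        return doc, "stage2"
    if metadata_pick is not None:                 # Stage 4: metadata disambiguation
        return metadata_pick(db, q), "stage4"
    return doc, "low"

rng = np.random.default_rng(1)
db = {f"doc{i}": rng.normal(size=8) for i in range(5)}
query = db["doc3"] + 0.05 * rng.normal(size=8)    # noisy copy of doc3
print(retrieve(db, query))
```

A near-duplicate query resolves at Stage 1; the later stages only fire as confidence degrades, which is what keeps the common case fast.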
## Install
```bash
# Python (core library)
pip install engram-kv
# Node.js (MCP client)
npm install engram-kv-mcp
```
### From source
```bash
git clone https://github.com/infraax/engram.git
cd engram
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
# Run tests
KMP_DUPLICATE_LIB_OK=TRUE OMP_NUM_THREADS=1 PYTHONPATH=. pytest tests/ -x -q
```
## Architecture Support
| Architecture | Attention Type | Status |
|---|---|---|
| Llama (1B-70B) | Standard MHA | Fully supported |
| Gemma (2B-27B) | Standard MHA | Fully supported |
| Gemma 4 (26B) | ISWA (sliding + global) | Fully supported |
| Phi (3.8B) | Standard MHA | Fully supported |
| Qwen (1.8B-72B) | GQA | Fully supported |
| Mistral (7B) | GQA + sliding window | Fully supported |
## Cross-Model Transfer
9 strategies evaluated. **FCDB (Frechet Cross-Domain Bridge)** wins:
| Strategy | Margin | Method |
|---|---|---|
| FCDB | **+0.124** | Frechet mean of cross-model fingerprints |
| TruncAlign | +0.098 | Truncate to min shared layers |
| ZeroPad | +0.067 | Pad shorter fingerprint with zeros |
| SpectralInterp | +0.045 | Interpolate in frequency domain |
No adapter training required. The geometry is intrinsic.
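As a sketch of the idea behind FCDB (not the library's implementation), a Frechet mean of unit-normalized fingerprints can be computed by iterating the spherical log/exp maps; for two vectors it lands on the angular midpoint:

```python
import numpy as np

def frechet_mean_unit(vectors: np.ndarray, iters: int = 32) -> np.ndarray:
    """Frechet (Karcher) mean of unit vectors on the sphere.

    Iterates: log-map all points to the tangent space at the current
    estimate, average, exp-map the average back onto the sphere.
    """
    mu = vectors.mean(axis=0)
    mu /= np.linalg.norm(mu)
    for _ in range(iters):
        tangents = []
        for v in vectors:
            cos_t = np.clip(v @ mu, -1.0, 1.0)
            theta = np.arccos(cos_t)
            if theta < 1e-9:
                tangents.append(np.zeros_like(v))
            else:                                  # log map at mu
                tangents.append(theta * (v - cos_t * mu) / np.sin(theta))
        step = np.mean(tangents, axis=0)
        norm = np.linalg.norm(step)
        if norm < 1e-9:                            # converged
            break
        mu = np.cos(norm) * mu + np.sin(norm) * step / norm  # exp map
    return mu

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
mu = frechet_mean_unit(np.stack([a, b]))
print(np.round(mu, 3))                             # ~[0.707, 0.707, 0.]
```

The appeal of the mean as a bridge is that it needs no training: it is computed directly from the fingerprints each model already produces.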
## MCP Server (Claude Code Integration)
ENGRAM includes an MCP server for persistent session memory in Claude Code:
```bash
claude mcp add --global engram-memory \
-e ENGRAM_SESSIONS_DIR=~/.engram/sessions \
-- python3 mcp/engram_memory.py
```
**Tools**: `write_session_engram`, `get_last_session`, `retrieve_relevant_sessions`, `get_relevant_context`, `list_indexed`, `index_knowledge`
## EIGENGRAM Binary Format (v1.2)
Compact, versioned binary certificates:
```
Header: magic(4B) + version(2B) + flags(2B) + dimensions
Vectors: vec_perdoc + vec_fcdb + joint_center + vec_fourier + vec_fourier_v2
Meta: corpus_hash + model_id + metrics + task_description
```
~800 bytes per document. Deterministic encoding. Cross-platform portable.
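The encoding can be illustrated with a struct-packing sketch. The magic bytes, field layout, and sizes below are assumptions for illustration only; the authoritative spec is the format code in the repo:

```python
import hashlib
import struct

import numpy as np

MAGIC = b"ENGR"                    # hypothetical magic bytes
VERSION, FLAGS = 0x0102, 0x0000    # "v1.2", no flags set

def encode_certificate(fp: np.ndarray, model_id: str, corpus: bytes) -> bytes:
    """Deterministic little-endian encoding: header + vector + metadata."""
    vec = fp.astype("<f4").tobytes()               # fixed byte order => portable
    meta = hashlib.sha256(corpus).digest() + model_id.encode("utf-8")
    header = MAGIC + struct.pack("<HHI", VERSION, FLAGS, fp.size)
    return header + vec + struct.pack("<H", len(meta)) + meta

fp = np.zeros(180, dtype=np.float32)
cert = encode_certificate(fp, "llama-3.2-1b", b"corpus bytes")
print(len(cert))   # 12-byte header + 720-byte vector + 2 + 44-byte meta = 778
```

Fixing the byte order and hashing the corpus is what makes the certificate deterministic and cross-platform: the same inputs produce the same bytes on any machine.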
## Theoretical Contributions
1. **Margin Power Law**: margin ~ A * N^alpha where alpha = -0.207 (graceful degradation, no cliff)
2. **CKA Manifold Isomorphism**: within-family 0.975, cross-family 0.927 (geometry is intrinsic)
3. **Frequency Ablation**: f0+f1 is the sweet spot (f0-only: -23% recall, f0+f1+f2: -0.3% margin)
4. **FCDB Scaling Law**: cross-model recall drops from 100% (N<=20) to 0% (N=200) -- adapter-free has limits
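The power-law claim is easy to sanity-check: on synthetic margins generated with alpha = -0.207, a log-log least-squares fit recovers the exponent (the amplitude and corpus sizes here are illustrative):

```python
import math

A, alpha = 0.9, -0.207                       # illustrative A; the paper's exponent
Ns = [10, 25, 50, 100, 200]
margins = [A * n ** alpha for n in Ns]       # margin ~ A * N^alpha

# The slope of log(margin) vs log(N) is the power-law exponent.
xs = [math.log(n) for n in Ns]
ys = [math.log(m) for m in margins]
xm, ym = sum(xs) / len(xs), sum(ys) / len(ys)
slope = (sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
         / sum((x - xm) ** 2 for x in xs))
print(round(slope, 3))                       # -0.207
```

A shallow negative exponent like this is why the margin degrades gracefully as the corpus grows instead of collapsing at a cliff.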
## Citation
```bibtex
@article{enigma2026engram,
title={You Don't Need Adapters: Cross-Model Document Retrieval
via Intrinsic KV Cache Geometry},
author={ENIGMA},
year={2026},
url={https://github.com/infraax/engram}
}
```
## Links
- [GitHub](https://github.com/infraax/engram)
- [PyPI: engram-kv](https://pypi.org/project/engram-kv/)
- [npm: engram-kv-mcp](https://www.npmjs.com/package/engram-kv-mcp)
## License
Apache-2.0