---
library_name: engram
tags:
  - kv-cache
  - fingerprinting
  - fourier
  - retrieval
  - hnsw
  - session-memory
  - cross-model
  - inference
  - mcp
  - llm-memory
license: apache-2.0
language:
  - en
pipeline_tag: feature-extraction
---

# ENGRAM: KV Cache Fingerprinting Protocol

**You Don't Need Adapters: Cross-Model Document Retrieval via Intrinsic KV Cache Geometry**

ENGRAM extracts Fourier fingerprints from LLM KV caches, stores them as compact binary certificates (`.eng` files, ~800 bytes), and retrieves them via HNSW approximate nearest neighbor search. This enables **persistent cross-session memory** for large language models with zero training.

> *By ENIGMA*

## Key Results

| Metric | Value |
|---|---|
| Recall@1 (N=200) | **100.0%** (post Stage-4) |
| Raw Fourier recall | **98.0%** (f0+f1 DFT) |
| HNSW search latency | **51.8 µs** |
| HNSW speedup | **5.7x** vs brute-force |
| Cross-model transfer | **+0.124 margin** (FCDB, no adapter) |
| CKA isomorphism | **0.975** within-family, **0.927** cross-family |
| Certificate size | **~800 bytes** per document |
| Architectures | llama, gemma, gemma4/ISWA, phi, qwen, mistral |
| Tests | **220 passing** |

## How It Works

```
KV cache blob --> layer key extraction --> DFT(f0+f1) --> fingerprint (~800 bytes)
                                                              |
Query fingerprint --> HNSW search --> geodesic retrieval --> matched session/document
```

### The Fourier Fingerprint

ENGRAM decomposes per-layer key trajectories using a 2-component DFT:
- **f0** (DC component): captures the mean activation level per layer
- **f1** (first harmonic): captures the dominant oscillation pattern

The resulting fingerprint is a compact, deterministic signature of the KV cache state that is:
- **Model-intrinsic**: derived from the model's own geometry, not learned embeddings
- **Cross-model transferable**: via Fréchet Cross-Domain Bridge (FCDB)
- **Compression-robust**: 0.99998 cosine similarity after INT8 quantization
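As a concrete sketch, both components can be read directly off a real DFT of each layer's key trajectory. The shapes, the head/dim reduction step, and the function name below are illustrative assumptions, not the `engram-kv` API:

```python
import numpy as np

def fourier_fingerprint(layer_keys):
    """Sketch: 2-component DFT fingerprint (f0 + f1) per layer.

    layer_keys: one array per layer, each shaped (seq_len, head_dim),
    standing in for a per-layer key trajectory. Shapes and the
    reduction below are illustrative; the real extraction depends on
    the model's KV layout.
    """
    components = []
    for keys in layer_keys:
        # Reduce feature dims to one scalar trajectory along the sequence.
        traj = keys.mean(axis=-1)       # (seq_len,)
        spectrum = np.fft.rfft(traj)    # one-sided complex spectrum
        f0 = spectrum[0].real           # DC: mean activation level
        f1 = spectrum[1]                # first harmonic: dominant oscillation
        components.extend([f0, f1.real, f1.imag])
    fp = np.asarray(components, dtype=np.float32)
    return fp / (np.linalg.norm(fp) + 1e-12)  # unit norm for cosine search
```

Three floats per layer keeps the fingerprint small enough that even a deep model fits comfortably in a ~800-byte certificate.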

### 4-Stage Geodesic Retrieval

```
Stage 0: Prior preemption     (IndexC chronic failure -> skip HNSW)
Stage 1: HNSW search          -> HIGH / MEDIUM confidence
Stage 2: Trajectory correction -> MEDIUM (interpolation w=0.3)
Stage 3: Negative constraints  -> LOW (apophatic layer)
Stage 4: Metadata disambig     -> LOW + stage4_used=True
```
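A minimal routing sketch of the stages above, operating on pre-scored HNSW candidates. The confidence thresholds, the stand-in margin test at Stage 2, and all names here are assumptions; only the stage order and the w=0.3 weight come from the table:

```python
def geodesic_route(candidates, high=0.90, medium=0.75,
                   prior_failed=False, metadata=None):
    """Route a query through the four stages above.

    candidates: (doc_id, cosine_score) pairs, sorted best-first
    (e.g. an HNSW query result). Thresholds are illustrative.
    """
    # Stage 0: prior preemption -- a chronically failing index
    # skips HNSW entirely.
    if prior_failed or not candidates:
        return None, "stage0_skipped"

    best_id, best_score = candidates[0]

    # Stage 1: accept outright on HIGH confidence.
    if best_score >= high:
        return best_id, "stage1_high"

    # Stage 2: MEDIUM confidence. The real stage applies trajectory
    # correction (interpolation, w=0.3); a margin test over the
    # runner-up stands in for it here.
    if best_score >= medium:
        runner_up = candidates[1][1] if len(candidates) > 1 else 0.0
        if best_score - runner_up >= 0.3 * (1.0 - best_score):
            return best_id, "stage2_medium"

    # Stage 3: LOW confidence -- the apophatic layer would prune
    # candidates violating negative constraints (pass-through here).
    # Stage 4: metadata disambiguation as the last resort.
    if metadata is not None and best_id in metadata:
        return best_id, "stage4_metadata"
    return best_id, "stage3_low"
```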

## Install

```bash
# Python (core library)
pip install engram-kv

# Node.js (MCP client)
npm install engram-kv-mcp
```

### From source

```bash
git clone https://github.com/infraax/engram.git
cd engram
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# Run tests
KMP_DUPLICATE_LIB_OK=TRUE OMP_NUM_THREADS=1 PYTHONPATH=. pytest tests/ -x -q
```

## Architecture Support

| Architecture | Attention Type | Status |
|---|---|---|
| Llama (1B-70B) | Standard MHA | Fully supported |
| Gemma (2B-27B) | Standard MHA | Fully supported |
| Gemma 4 (26B) | ISWA (sliding + global) | Fully supported |
| Phi (3.8B) | Standard MHA | Fully supported |
| Qwen (1.8B-72B) | GQA | Fully supported |
| Mistral (7B) | GQA + sliding window | Fully supported |

## Cross-Model Transfer

Nine transfer strategies were evaluated; **FCDB (Fréchet Cross-Domain Bridge)** performs best:

| Strategy | Margin | Method |
|---|---|---|
| FCDB | **+0.124** | Fréchet mean of cross-model fingerprints |
| TruncAlign | +0.098 | Truncate to min shared layers |
| ZeroPad | +0.067 | Pad shorter fingerprint with zeros |
| SpectralInterp | +0.045 | Interpolate in frequency domain |

No adapter training required. The geometry is intrinsic.
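For unit-normalised fingerprints of a common length (i.e. after layer alignment along the lines of TruncAlign), a simple stand-in for the FCDB bridge vector is the renormalised Euclidean mean, which approximates the Fréchet mean of a tight cluster on the unit sphere. A sketch of the idea, not the library's implementation:

```python
import numpy as np

def fcdb_bridge(fingerprints):
    """Approximate Frechet mean of per-model fingerprints.

    Assumes all fingerprints share one length (after alignment).
    For points clustered on the unit sphere, the Frechet mean is
    well approximated by the renormalised Euclidean mean.
    """
    unit = np.stack([f / np.linalg.norm(f) for f in fingerprints])
    mean = unit.mean(axis=0)
    return mean / np.linalg.norm(mean)
```

A query fingerprint from either model is then matched against the bridge vector by cosine similarity, with no adapter anywhere in the loop.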

## MCP Server (Claude Code Integration)

ENGRAM includes an MCP server for persistent session memory in Claude Code:

```bash
claude mcp add --global engram-memory \
  -e ENGRAM_SESSIONS_DIR=~/.engram/sessions \
  -- python3 mcp/engram_memory.py
```

**Tools**: `write_session_engram`, `get_last_session`, `retrieve_relevant_sessions`, `get_relevant_context`, `list_indexed`, `index_knowledge`

## EIGENGRAM Binary Format (v1.2)

Compact, versioned binary certificates:

```
Header:  magic(4B) + version(2B) + flags(2B) + dimensions
Vectors: vec_perdoc + vec_fcdb + joint_center + vec_fourier + vec_fourier_v2
Meta:    corpus_hash + model_id + metrics + task_description
```

~800 bytes per document. Deterministic encoding. Cross-platform portable.
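A header-packing sketch under the layout above. The magic bytes, the field widths inside "dimensions", and little-endian byte order are assumptions; only the 4B/2B/2B header split comes from the layout sketch:

```python
import struct

MAGIC = b"ENGM"  # assumed magic bytes; the real value is not given above

def pack_header(major=1, minor=2, flags=0, dims=(5, 40)):
    """Pack magic(4B) + version(2B, as major/minor bytes) + flags(2B)
    + dimensions. Here 'dimensions' is assumed to be two uint32s:
    vector count and per-vector length."""
    n_vecs, vec_len = dims
    return struct.pack("<4sBBHII", MAGIC, major, minor, flags,
                       n_vecs, vec_len)
```

Fixed-width fields with an explicit byte order are what make the encoding deterministic and cross-platform portable.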

## Theoretical Contributions

1. **Margin Power Law**: margin ~ A * N^alpha where alpha = -0.207 (graceful degradation, no cliff)
2. **CKA Manifold Isomorphism**: within-family 0.975, cross-family 0.927 (geometry is intrinsic)
3. **Frequency Ablation**: f0+f1 is the sweet spot (f0-only: -23% recall, f0+f1+f2: -0.3% margin)
4. **FCDB Scaling Law**: cross-model recall drops from 100% (N<=20) to 0% (N=200) -- adapter-free has limits

## Citation

```bibtex
@article{enigma2026engram,
  title={You Don't Need Adapters: Cross-Model Document Retrieval
         via Intrinsic KV Cache Geometry},
  author={ENIGMA},
  year={2026},
  url={https://github.com/infraax/engram}
}
```

## Links

- [GitHub](https://github.com/infraax/engram)
- [PyPI: engram-kv](https://pypi.org/project/engram-kv/)
- [npm: engram-kv-mcp](https://www.npmjs.com/package/engram-kv-mcp)

## License

Apache-2.0