---
library_name: engram
tags:
- kv-cache
- fingerprinting
- fourier
- retrieval
- hnsw
- session-memory
- cross-model
- inference
- mcp
- llm-memory
license: apache-2.0
language:
- en
pipeline_tag: feature-extraction
---

# ENGRAM: KV Cache Fingerprinting Protocol

**You Don't Need Adapters: Cross-Model Document Retrieval via Intrinsic KV Cache Geometry**

ENGRAM extracts Fourier fingerprints from LLM KV caches, stores them as compact binary certificates (`.eng` files, ~800 bytes each), and retrieves them via HNSW approximate nearest-neighbor search. This enables **persistent cross-session memory** for large language models with zero training.

> *By ENIGMA*

## Key Results

| Metric | Value |
|---|---|
| Recall@1 (N=200) | **100.0%** (after Stage 4) |
| Raw Fourier recall | **98.0%** (f0+f1 DFT) |
| HNSW search latency | **51.8 µs** |
| HNSW speedup | **5.7x** vs. brute force |
| Cross-model transfer | **+0.124 margin** (FCDB, no adapter) |
| CKA isomorphism | **0.975** within-family, **0.927** cross-family |
| Certificate size | **~800 bytes** per document |
| Architectures | llama, gemma, gemma4/ISWA, phi, qwen, mistral |
| Tests | **220 passing** |

## How It Works

```
KV cache blob --> layer key extraction --> DFT(f0+f1) --> fingerprint (~800 bytes)

Query fingerprint --> HNSW search --> geodesic retrieval --> matched session/document
```

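As a rough illustration of the retrieval step in the diagram above, the sketch below ranks stored fingerprints by cosine similarity with an exact brute-force scan. This is the baseline that an HNSW index approximates at lower latency; it is not HNSW itself. The function names (`cosine`, `top_k`), the dict-based index, and the choice of cosine similarity are assumptions made for illustration, not ENGRAM's actual API or metric.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, index, k=1):
    """Exact nearest-neighbor search over a {doc_id: fingerprint} mapping.

    Hypothetical helper: scans every stored fingerprint, so the cost grows
    linearly with the number of documents.
    """
    scored = [(cosine(query, fp), doc_id) for doc_id, fp in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy index with made-up 2-dimensional fingerprints.
index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
matches = top_k([0.9, 0.1], index, k=2)
```

Here `matches` lists document `a` first and `c` second, since the query points mostly along the first axis.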
### The Fourier Fingerprint

ENGRAM decomposes per-layer key trajectories using a 2-component DFT:
- **f0** (DC component): captures the mean activation level per layer
- **f1** (first harmonic): captures the lowest-frequency (slowest) oscillation in the trajectory

The resulting fingerprint is a compact, deterministic signature of the KV cache state that is:
- **Model-intrinsic**: derived from the model's own geometry, not learned embeddings
- **Cross-model transferable**: via the Frechet Cross-Domain Bridge (FCDB)
- **Compression-robust**: 0.99998 cosine similarity after INT8 quantization

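A minimal sketch of the f0+f1 idea, assuming the DFT is taken along the token axis of each layer's key matrix, that f0 is kept as the signed mean, and that f1 is kept as a magnitude. None of these choices is confirmed by the description above, and `dft_bin`, `layer_fingerprint`, and `fingerprint` are hypothetical names.

```python
import cmath
import math


def dft_bin(signal, k):
    """k-th DFT coefficient of a real sequence, divided by its length."""
    n = len(signal)
    total = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
    return total / n


def layer_fingerprint(keys):
    """keys: list of T token vectors (each of length D) for one layer.

    Returns 2*D numbers: the per-dimension mean (f0) followed by the
    per-dimension magnitude of the first harmonic (f1).
    """
    dims = len(keys[0])
    columns = [[row[d] for row in keys] for d in range(dims)]
    f0 = [dft_bin(col, 0).real for col in columns]
    f1 = [abs(dft_bin(col, 1)) for col in columns]
    return f0 + f1


def fingerprint(layers):
    """Concatenate the per-layer fingerprints into one flat vector."""
    flat = []
    for keys in layers:
        flat.extend(layer_fingerprint(keys))
    return flat


# One layer, four tokens, one key dimension: mean 1.0, first-harmonic magnitude 0.5.
fp = fingerprint([[[1.0], [2.0], [1.0], [0.0]]])
```

Keeping only two frequency bins per dimension means the fingerprint length depends on the number of layers and key dimensions, not on the sequence length.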
### 4-Stage Geodesic Retrieval

```
Stage 0: Prior preemption      (IndexC chronic failure -> skip HNSW)
Stage 1: HNSW search           -> HIGH / MEDIUM confidence
Stage 2: Trajectory correction -> MEDIUM (interpolation w=0.3)
Stage 3: Negative constraints  -> LOW (apophatic layer)
Stage 4: Metadata disambig     -> LOW + stage4_used=True
```

## Install

```bash
# Python (core library)
pip install engram-kv

# Node.js (MCP client)
npm install engram-kv-mcp
```

### From source

```bash
git clone https://github.com/infraax/engram.git
cd engram
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# Run tests
KMP_DUPLICATE_LIB_OK=TRUE OMP_NUM_THREADS=1 PYTHONPATH=. pytest tests/ -x -q
```

## Architecture Support

| Architecture | Attention Type | Status |
|---|---|---|
| Llama (1B-70B) | Standard MHA | Fully supported |
| Gemma (2B-27B) | Standard MHA | Fully supported |
| Gemma 4 (26B) | ISWA (sliding + global) | Fully supported |
| Phi (3.8B) | Standard MHA | Fully supported |
| Qwen (1.8B-72B) | GQA | Fully supported |
| Mistral (7B) | GQA + sliding window | Fully supported |

## Cross-Model Transfer

Nine strategies were evaluated; four are listed below. **FCDB (Frechet Cross-Domain Bridge)** achieves the largest margin:

| Strategy | Margin | Method |
|---|---|---|
| FCDB | **+0.124** | Frechet mean of cross-model fingerprints |
| TruncAlign | +0.098 | Truncate to min shared layers |
| ZeroPad | +0.067 | Pad shorter fingerprint with zeros |
| SpectralInterp | +0.045 | Interpolate in frequency domain |

No adapter training required. The geometry is intrinsic.

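The two simplest strategies in the table can be sketched as below, assuming a fingerprint is stored as a list of per-layer vectors, that "shared layers" means the first layers of each model, and that both models use the same per-layer width. The function names are hypothetical. `euclidean_mean` shows the one case where a Frechet mean has a closed form (under the Euclidean metric it is the arithmetic mean); whether FCDB uses that metric is not stated here.

```python
def trunc_align(fp_a, fp_b):
    """Keep only the first min(len) layers of each fingerprint."""
    shared = min(len(fp_a), len(fp_b))
    return fp_a[:shared], fp_b[:shared]


def zero_pad(fp_a, fp_b):
    """Pad the shorter fingerprint with all-zero layers up to the longer one."""
    target = max(len(fp_a), len(fp_b))
    width = len(fp_a[0])  # assumption: both models share the per-layer width

    def pad(fp):
        return fp + [[0.0] * width for _ in range(target - len(fp))]

    return pad(fp_a), pad(fp_b)


def euclidean_mean(vectors):
    """Arithmetic mean, i.e. the Frechet mean under the Euclidean metric."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


# Toy fingerprints: a 3-layer model and a 2-layer model, 2 values per layer.
deep = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
shallow = [[7.0, 8.0], [9.0, 10.0]]
truncated = trunc_align(deep, shallow)
padded = zero_pad(deep, shallow)
```

Both helpers return fingerprints of equal depth, so the pair can be compared layer by layer afterwards.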
## MCP Server (Claude Code Integration)

ENGRAM includes an MCP server for persistent session memory in Claude Code:

```bash
claude mcp add --global engram-memory \
  -e ENGRAM_SESSIONS_DIR="$HOME/.engram/sessions" \
  -- python3 mcp/engram_memory.py
```

**7 tools**, including: `write_session_engram`, `get_last_session`, `retrieve_relevant_sessions`, `get_relevant_context`, `list_indexed`, `index_knowledge`

## EIGENGRAM Binary Format (v1.2)

Compact, versioned binary certificates:

```
Header:  magic(4B) + version(2B) + flags(2B) + dimensions
Vectors: vec_perdoc + vec_fcdb + joint_center + vec_fourier + vec_fourier_v2
Meta:    corpus_hash + model_id + metrics + task_description
```

~800 bytes per document. Deterministic encoding. Cross-platform portable.

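To make the fixed part of the header concrete, here is a hedged sketch that packs and unpacks the first 8 bytes (magic, version, flags) with Python's `struct` module. The byte order (little-endian), the split of the 2-byte version into major and minor bytes, and the placeholder magic value are all assumptions; the `dimensions` field is left out because its layout is not described above.

```python
import struct

# Assumed layout: little-endian, 4-byte magic, version as (major, minor) bytes, 16-bit flags.
HEADER = struct.Struct("<4sBBH")
PLACEHOLDER_MAGIC = b"????"  # the real magic bytes are not given in this README


def pack_header(major, minor, flags=0, magic=PLACEHOLDER_MAGIC):
    """Serialize the fixed 8-byte prefix of a certificate header."""
    if len(magic) != 4:
        raise ValueError("magic must be exactly 4 bytes")
    return HEADER.pack(magic, major, minor, flags)


def unpack_header(blob):
    """Parse the fixed 8-byte prefix back into its fields."""
    magic, major, minor, flags = HEADER.unpack_from(blob, 0)
    return {"magic": magic, "version": (major, minor), "flags": flags}


blob = pack_header(1, 2)
header = unpack_header(blob)
```

Using an explicit byte-order prefix in the format string avoids platform-dependent padding, which is one common way to keep a binary format portable across machines.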
## Theoretical Contributions

1. **Margin Power Law**: margin ~ A * N^alpha with alpha = -0.207 (graceful degradation, no cliff)
2. **CKA Manifold Isomorphism**: within-family 0.975, cross-family 0.927 (geometry is intrinsic)
3. **Frequency Ablation**: f0+f1 is the sweet spot (f0-only: -23% recall, f0+f1+f2: -0.3% margin)
4. **FCDB Scaling Law**: cross-model recall drops from 100% (N<=20) to 0% (N=200); adapter-free transfer has limits

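Because the constant A cancels in a ratio, the power law in item 1 can be read as the fraction of margin that survives when the corpus grows. The helper name `margin_ratio` is hypothetical, and the calculation only restates the fitted exponent; it adds no new measurements.

```python
ALPHA = -0.207  # fitted exponent quoted in the list above


def margin_ratio(n_from, n_to, alpha=ALPHA):
    """Factor by which the margin changes when the corpus grows from n_from to n_to.

    Under margin ~ A * N**alpha the prefactor A cancels, leaving (n_to / n_from) ** alpha.
    """
    return (n_to / n_from) ** alpha


doubling = margin_ratio(100, 200)
tenfold = margin_ratio(20, 200)
```

With alpha = -0.207, doubling N keeps roughly 87% of the margin and a tenfold increase keeps roughly 62%, which is what "graceful degradation" means quantitatively.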
## Citation

```bibtex
@article{enigma2026engram,
  title={You Don't Need Adapters: Cross-Model Document Retrieval
         via Intrinsic KV Cache Geometry},
  author={ENIGMA},
  year={2026},
  url={https://github.com/infraax/engram}
}
```

## Links

- [GitHub](https://github.com/infraax/engram)
- [PyPI: engram-kv](https://pypi.org/project/engram-kv/)
- [npm: engram-kv-mcp](https://www.npmjs.com/package/engram-kv-mcp)

## License

Apache-2.0