```text
  ____   ___   __  __  ____
 / ___| / _ \ |  \/  || __ )      COMB
| |    | | | || |\/| ||  _ \      Chain-Ordered Memory Base
| |___ | |_| || |  | || |_) |
 \____| \___/ |_|  |_||____/      commit.
```

A 200K context window has everything but finds nothing.
COMB has everything and finds what matters.

Quick Start · The Honeycomb · Blink Pattern · API Reference · CLI · Architecture
## The Problem

Every AI memory system today does the same thing: summarize, compress, embed, discard. Information dies at every step. The user's exact phrasing is gone. The nuance of a disagreement is flattened. The specific numbers discussed three sessions ago have vanished into a centroid.

COMB takes the opposite approach: keep everything, find what matters.

Three lines to remember everything. Zero dependencies. Pure Python.

```python
store = CombStore("./memory")
store.stage("the full conversation")
store.rollup()
```
## Quick Start

### Installation

```bash
pip install comb-db

# With CLI support
pip install comb-db[cli]
```

Requirements: Python 3.10+ · zero runtime dependencies (stdlib only)
### Usage

```python
from comb import CombStore

# Point at a directory. That's your entire database.
store = CombStore("./my-memory")

# Stage conversations throughout the day (Tier 2, append-only)
store.stage("User asked about RSA key exchange. Explained PKCS#1 v2.1...")
store.stage("Follow-up: user wants Ed25519 instead. Discussed tradeoffs.")

# Roll up into the permanent archive (Tier 2 → Tier 3)
doc = store.rollup()
# → SHA-256 hash-chained, semantic + social links computed automatically

# Search
results = store.search("encryption")
for r in results:
    print(f"{r.date} score={r.similarity_score:.4f}")

# Navigate the honeycomb graph
doc = store.get("2026-02-17")
print(doc.temporal.prev)        # previous day
print(doc.temporal.next)        # next day
print(doc.semantic.neighbors)   # similar conversations
print(doc.social.strengthened)  # deepening relationships
print(doc.social.cooled)        # cooling relationships

# Walk the graph
for d in store.walk("2026-02-01", direction="semantic", depth=10):
    print(d.date, d.content[:80])

# Verify chain integrity: any tampering breaks the chain
assert store.verify_chain()

# Stats
print(store.stats())
# → {"document_count": 42, "chain_valid": True, "semantic_links": 180, ...}
```
## Why COMB

| Approach | What you lose |
|---|---|
| Summarization | Exact phrasing, detail, nuance |
| Vector embeddings | Keyword precision, temporal order |
| Sliding window | Everything outside the window |
| **COMB** | **Nothing** |

| Feature | Detail |
|---|---|
| Lossless | Full conversation text, always recoverable |
| Hash-chained | SHA-256 tamper-evident chain with blockchain-grade integrity |
| Three-directional links | Navigate by time, by meaning, or by relationship |
| Built-in BM25 | Full-text search, zero dependencies, pluggable backend |
| Serverless | No database, no server, just JSON files in a directory |
| Schema-on-read | Your data, your interpretation |
| Lightweight | ~1,300 lines of pure Python. Nothing to configure. |
## The Honeycomb

Every archived document lives in a three-directional graph:

```text
TEMPORAL ──── chronological chain (prev/next, hash-linked)
SEMANTIC ──── content similarity (TF cosine, top-k neighbors)
SOCIAL   ──── relationship gradient (warming ↔ cooling)
```

### Temporal Links

A doubly-linked chronological chain. Each document points to the previous and next day, and is hash-linked via SHA-256: if any document is modified, every subsequent hash becomes invalid.

```python
doc = store.get("2026-02-17")
doc.temporal.prev  # "2026-02-16"
doc.temporal.next  # "2026-02-18"
doc.hash           # "a3f8c2..." (SHA-256 of prev_hash + content)
doc.prev_hash      # "7d1b9e..." (previous document's hash)
```
### Semantic Links

Cosine similarity over term-frequency vectors, computed automatically during rollup. The top-k most similar documents are linked as neighbors.

```python
for neighbor in doc.semantic.neighbors:
    print(f"{neighbor.target}: {neighbor.score:.4f}")
# 2026-02-10: 0.8721   (discussed the same topic 7 days ago)
# 2026-02-03: 0.6534   (related conversation 2 weeks ago)
```
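The similarity measure fits in a few lines of stdlib Python. This is an illustrative term-frequency cosine over whitespace tokens; COMB's actual tokenizer may differ.

```python
from collections import Counter
import math

def tf_cosine(a: str, b: str) -> float:
    # Cosine similarity between the term-frequency vectors of two texts
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0
```

Identical texts score 1.0, disjoint texts score 0.0, and partial overlap lands in between, which is what makes top-k neighbor selection meaningful.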
### Social Links

The novel part. Conversations have relational temperature. COMB tracks engagement patterns and sentiment shifts between documents:

- Inward fade (strengthening): engagement increasing, sentiment warming
- Outward fade (cooling): engagement decreasing, sentiment cooling

```python
for link in doc.social.strengthened:
    print(f"↑ {link.target}: +{link.delta:.4f}")
for link in doc.social.cooled:
    print(f"↓ {link.target}: {link.delta:.4f}")
```

This lets an agent understand not just what was discussed, but how the relationship evolved.
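One way to picture a signed delta like this is as the change in a sentiment score between two documents. Everything below is hypothetical: `WARM`, `COOL`, `naive_sentiment`, and `social_delta` are illustrative names, and COMB's real scoring (the sentiment helper in `_utils.py`) is not shown here.

```python
# Illustrative only: a naive word-list sentiment delta, NOT COMB's scoring.
WARM = {"thanks", "great", "love", "helpful"}
COOL = {"annoyed", "wrong", "frustrated", "nope"}

def naive_sentiment(text: str) -> float:
    toks = text.lower().split()
    if not toks:
        return 0.0
    return (sum(t in WARM for t in toks) - sum(t in COOL for t in toks)) / len(toks)

def social_delta(prev_doc: str, curr_doc: str) -> float:
    # Positive delta reads as warming (strengthened), negative as cooling
    return naive_sentiment(curr_doc) - naive_sentiment(prev_doc)
```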
## The Blink Pattern

Seamless agent restarts with zero context loss. The agent doesn't die and resurrect; it blinks.

```python
# ─── Before restart: flush everything ───
store.blink("""
Active project: step 3500/100K, loss 4.85
Decision: deferred publish until 80% milestone
Pending: PR #42 review, cron job for monitoring
""")

# ─── After restart: wake up where you left off ───
context = store.recall()
# → staged entries (most recent first) + archived history
```

The `blink()` method stages operational context with a `"blink": True` metadata flag. The `recall()` method reconstructs it: staged entries first, then recent archive documents.

```python
# Blink with auto-rollup into permanent archive
store.blink(context, rollup=True)

# Recall with custom depth
context = store.recall(k=10, include_staged=True)
```

Not death and resurrection. Just a blink. See docs/blink.md for the full pattern.
## API Reference

### CombStore(path, *, search_backend=None)

Create or open a COMB store.

| Parameter | Type | Description |
|---|---|---|
| `path` | `str \| Path` | Directory for all COMB data. Created if missing. |
| `search_backend` | `SearchBackend \| None` | Custom search backend. Defaults to built-in BM25. |
### Staging (Tier 2)

`store.stage(text, *, metadata=None, date=None)`
Append a conversation dump to today's staging file.

`store.rollup(date=None) → CombDocument | None`
Promote staged data to the chain archive. Concatenates all staged entries, computes the hash chain and honeycomb links, indexes for search, and clears staging.
### Search

`store.search(query, *, mode="bm25", k=5) → list[CombDocument]`
Search the archive. Returns documents with `similarity_score` set.

| Mode | Description |
|---|---|
| `"bm25"` | Built-in BM25 full-text search (default) |
| `"semantic"` | Requires a custom `SearchBackend` |
| `"hybrid"` | Requires a custom `SearchBackend` |
### Navigation

`store.get(date) → CombDocument | None`
Retrieve an archived document by date string (YYYY-MM-DD).

`store.walk(start, *, direction="temporal", depth=100) → Iterator[CombDocument]`
Walk the honeycomb graph from a starting date.

| Direction | Behavior |
|---|---|
| `"temporal"` | Follow the prev/next chain forward |
| `"semantic"` | Breadth-first through semantic neighbors |
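The semantic direction is a plain breadth-first search over neighbor links. A sketch over a toy adjacency map of date strings (`walk_semantic` and the map are illustrative; `store.walk` itself yields `CombDocument` objects):

```python
from collections import deque

def walk_semantic(neighbors: dict[str, list[str]], start: str, depth: int = 10) -> list[str]:
    # Breadth-first order over the semantic neighbor graph, capped at `depth`
    seen, queue, order = {start}, deque([start]), []
    while queue and len(order) < depth:
        date = queue.popleft()
        order.append(date)
        for nxt in neighbors.get(date, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order
```

Breadth-first means the closest semantic neighbors surface before neighbors-of-neighbors, which is usually what an agent wants when expanding context around a topic.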
### Blink

`store.blink(text, *, metadata=None, rollup=False) → str`
Stage operational context and return a recall preview. The pre-restart half.

`store.recall(*, k=5, include_staged=True) → str`
Reconstruct operational context. The post-restart half.

### Integrity

`store.verify_chain() → bool`
Verify the entire SHA-256 hash chain. Returns `True` if intact.

`store.stats() → dict`
Returns `document_count`, `total_bytes`, `chain_length`, `semantic_links`, `social_links`, `staged_dates`, `chain_valid`.
### CombDocument

| Field | Type | Description |
|---|---|---|
| `date` | `str` | Date string (YYYY-MM-DD) |
| `content` | `str` | Full archived text |
| `hash` | `str` | SHA-256 chain hash |
| `prev_hash` | `str \| None` | Previous document's hash |
| `metadata` | `dict` | Arbitrary metadata (byte_size, total_tokens, created_at, ...) |
| `temporal` | `TemporalLinks` | `.prev`, `.next`: date strings |
| `semantic` | `SemanticLinks` | `.neighbors`: list of `SemanticNeighbor(target, score)` |
| `social` | `SocialLinks` | `.links`, `.strengthened`, `.cooled`: list of `SocialLink(target, direction, delta)` |
| `similarity_score` | `float \| None` | Set during search (transient, not persisted) |
### SearchBackend Protocol

Plug in any search engine:

```python
from comb import SearchBackend

class MyVectorSearch:
    def index(self, doc_id: str, text: str) -> None:
        # Index a document (doc_id is typically a date string)
        ...

    def search(self, query: str, k: int = 5) -> list[tuple[str, float]]:
        # Return [(doc_id, score), ...] sorted by descending relevance
        ...

store = CombStore("./memory", search_backend=MyVectorSearch())
```
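For instance, a toy backend that satisfies the protocol's two-method shape. `KeywordBackend` is hypothetical and scores by query-term overlap rather than real relevance; a production backend would wrap an actual search engine.

```python
class KeywordBackend:
    """Toy SearchBackend: scores documents by how many query terms
    they contain. Illustration of the protocol shape only, not BM25."""

    def __init__(self) -> None:
        self.docs: dict[str, set[str]] = {}

    def index(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = set(text.lower().split())

    def search(self, query: str, k: int = 5) -> list[tuple[str, float]]:
        terms = query.lower().split()
        hits = [(d, float(sum(t in toks for t in terms)))
                for d, toks in self.docs.items()]
        return sorted([h for h in hits if h[1] > 0],
                      key=lambda h: h[1], reverse=True)[:k]
```

Because the protocol is structural (just `index` and `search` with these signatures), no inheritance from `SearchBackend` is needed.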
## CLI

Requires `pip install comb-db[cli]` (adds the click dependency).

```bash
# Stage a conversation
echo "Today's conversation..." | comb -s ./memory stage
comb -s ./memory stage -f transcript.txt
comb -s ./memory stage "inline text"

# Roll up into archive
comb -s ./memory rollup

# Search
comb -s ./memory search "encryption"

# Blink pattern
echo "operational context" | comb -s ./memory blink
comb -s ./memory blink -f context.md --rollup
comb -s ./memory recall

# Inspect
comb -s ./memory show 2026-02-17
comb -s ./memory stats
comb -s ./memory verify
```

The COMB_STORE environment variable can replace `-s`:

```bash
export COMB_STORE=./memory
comb stage "text"
comb search "query"
```
## Architecture

```text
                          ┌─────────┐
        ╱╲                │ Tier 1  │  Agent's context window
       ╱  ╲               │ Active  │  (not managed by COMB)
      ╱    ╲              └─────────┘
     ╱      ╲
┌───╱────────╲───────┐
│       Tier 2       │  Append-only JSONL per day
│   Daily Staging    │  staging/YYYY-MM-DD.jsonl
└─────────┬──────────┘
          │ rollup()
┌─────────▼──────────┐
│       Tier 3       │  One JSON document per day
│   Chain Archive    │  SHA-256 hash-chained
│ + Honeycomb Links  │  Temporal + Semantic + Social
└────────────────────┘
```
## Storage Layout

```text
my-memory/
├── staging/
│   └── 2026-03-02.jsonl   # today's staged entries (append-only)
└── archive/
    ├── 2026-02-28.json    # hash-chained document with honeycomb links
    ├── 2026-03-01.json
    └── 2026-03-02.json
```

Everything is JSON. Human-readable. No binary formats. Copy the directory, copy the memory.
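Because everything is plain JSON, other tools can read the archive directly. A minimal sketch, assuming the on-disk documents mirror the `CombDocument` fields (`load_archive` is illustrative, not part of the library):

```python
import json
from pathlib import Path

def load_archive(root: str) -> list[dict]:
    # Archive filenames are ISO dates, so lexicographic sort is chronological
    paths = sorted(Path(root, "archive").glob("*.json"))
    return [json.loads(p.read_text()) for p in paths]
```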
## Source Layout

```text
comb/
├── core.py       # CombStore - main interface (333 lines)
├── document.py   # CombDocument + link types (164 lines)
├── archive.py    # ChainArchive - hash-chained Tier 3 (171 lines)
├── staging.py    # DailyStaging - append-only Tier 2 (104 lines)
├── honeycomb.py  # HoneycombGraph - three-directional links (166 lines)
├── search.py     # BM25Search + SearchBackend protocol (137 lines)
├── cli.py        # Click CLI (155 lines)
└── _utils.py     # SHA-256, tokenizer, sentiment (68 lines)
```

~1,300 lines total. 45 tests.
## What COMB Is (and Isn't)

Is:

- A lossless archival system for AI conversation history
- A tamper-evident hash chain with three-directional graph navigation
- A zero-dependency library that runs anywhere Python runs
- The memory persistence layer your agent is missing

Isn't:

- Not a vector database (but you can plug one in)
- Not a summarization tool (that's the point)
- Not a real-time streaming system
- Not a replacement for your context window; it's the archive behind it
## Lineage

COMB descends from HYBRIDbee, a serverless document database. Same philosophy: schema-on-read, single-directory storage, zero configuration.

## Links

- PyPI: pypi.org/project/comb-db
- Source: gitlab.com/the-lab9951109/comb
- HuggingFace: huggingface.co/ava-shakil/comb
- Blog: artifactvirtual.substack.com
- Docs: docs/

## License

MIT. Do whatever you want.

Built by Ava Shakil at Artifact Virtual

commit.