
     _____ _____ _____ _____
    / ____/ ___ / _  \/ _  \     COMB
   / /   / / / / / / / /_/ /     Chain-Ordered Memory Base
  / /___/ /_/ / / / / ___ /
  \____/\___/_/ /_/_/   \_\      commit.
  

A 200K context window has everything but finds nothing.
COMB has everything and finds what matters.


Quick Start • The Honeycomb • Blink Pattern • API Reference • CLI • Architecture


The Problem

Every AI memory system today does the same thing: summarize, compress, embed, discard. Information dies at every step. The user's exact phrasing: gone. The nuance of a disagreement: flattened. The specific numbers discussed three sessions ago: vanished into a centroid.

COMB takes the opposite approach: keep everything, find what matters.

Three lines to remember everything. Zero dependencies. Pure Python.

store = CombStore("./memory")
store.stage("the full conversation")
store.rollup()

Quick Start

Installation

pip install comb-db
# With CLI support
pip install comb-db[cli]

Requirements: Python 3.10+ · Zero runtime dependencies (stdlib only)

Usage

from comb import CombStore

# Point at a directory. That's your entire database.
store = CombStore("./my-memory")

# Stage conversations throughout the day (Tier 2: append-only)
store.stage("User asked about RSA key exchange. Explained PKCS#1 v2.1...")
store.stage("Follow-up: user wants Ed25519 instead. Discussed tradeoffs.")

# Roll up into the permanent archive (Tier 2 → Tier 3)
doc = store.rollup()
# → SHA-256 hash-chained, semantic + social links computed automatically

# Search
results = store.search("encryption")
for r in results:
    print(f"{r.date}  score={r.similarity_score:.4f}")

# Navigate the honeycomb graph
doc = store.get("2026-02-17")
print(doc.temporal.prev)           # previous day
print(doc.temporal.next)           # next day
print(doc.semantic.neighbors)      # similar conversations
print(doc.social.strengthened)     # deepening relationships
print(doc.social.cooled)           # cooling relationships

# Walk the graph
for d in store.walk("2026-02-01", direction="semantic", depth=10):
    print(d.date, d.content[:80])

# Verify chain integrity โ€” any tampering breaks the chain
assert store.verify_chain()

# Stats
print(store.stats())
# → {"document_count": 42, "chain_valid": True, "semantic_links": 180, ...}

Why COMB

| Approach | What you lose |
|---|---|
| 🗜️ Summarization | Exact phrasing, detail, nuance |
| 🧮 Vector embeddings | Keyword precision, temporal order |
| 📋 Sliding window | Everything outside the window |
| 🐝 COMB | Nothing |

| Feature | Detail |
|---|---|
| 🔒 Lossless | Full conversation text, always recoverable |
| ⛓️ Hash-chained | SHA-256 tamper-evident chain, blockchain-grade integrity |
| 🐝 Three-directional links | Navigate by time, by meaning, or by relationship |
| 🔍 Built-in BM25 | Full-text search, zero dependencies, pluggable backend |
| 📁 Serverless | No database, no server, just JSON files in a directory |
| 📐 Schema-on-read | Your data, your interpretation |
| 🪶 Lightweight | ~1,300 lines of pure Python. Nothing to configure. |

The Honeycomb

Every archived document lives in a three-directional graph:

         TEMPORAL ←──→  chronological chain (prev/next, hash-linked)
         SEMANTIC ←──→  content similarity (TF cosine, top-k neighbors)
         SOCIAL   ←──→  relationship gradient (warming ↔ cooling)

โ›“๏ธ Temporal Links

A doubly-linked chronological chain. Each document points to the previous and next day. Hash-linked via SHA-256 โ€” if any document is modified, all subsequent hashes become invalid.

doc = store.get("2026-02-17")
doc.temporal.prev   # "2026-02-16"
doc.temporal.next   # "2026-02-18"
doc.hash            # "a3f8c2..."  (SHA-256 of prev_hash + content)
doc.prev_hash       # "7d1b9e..."  (previous document's hash)

🧠 Semantic Links

Cosine similarity over term-frequency vectors, computed automatically during rollup. The top-k most similar documents are linked as neighbors.

for neighbor in doc.semantic.neighbors:
    print(f"{neighbor.target}: {neighbor.score:.4f}")
# 2026-02-10: 0.8721   ← discussed same topic 7 days ago
# 2026-02-03: 0.6534   ← related conversation 2 weeks ago
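The similarity itself is ordinary TF cosine. A minimal sketch (whitespace tokenization is an assumption; COMB's real tokenizer may differ):

```python
from collections import Counter
from math import sqrt


def tf_cosine(a: str, b: str) -> float:
    # Term-frequency vectors over lowercase whitespace tokens
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the shared vocabulary
    dot = sum(ta[w] * tb[w] for w in ta.keys() & tb.keys())
    na = sqrt(sum(v * v for v in ta.values()))
    nb = sqrt(sum(v * v for v in tb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Scores land in [0, 1]: identical texts score 1.0, texts with no shared terms score 0.0, and during rollup the top-k highest-scoring documents become a new document's neighbors.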

💛 Social Links

The novel part. Conversations have relational temperature. COMB tracks engagement patterns and sentiment shifts between documents:

  • Inward fade (strengthening): engagement increasing, sentiment warming
  • Outward fade (cooling): engagement decreasing, sentiment cooling

for link in doc.social.strengthened:
    print(f"↗ {link.target}: +{link.delta:.4f}")

for link in doc.social.cooled:
    print(f"↘ {link.target}: {link.delta:.4f}")

This lets an agent understand not just what was discussed, but how the relationship evolved.

The Blink Pattern

Seamless agent restarts with zero context loss. The agent doesn't die and resurrect; it blinks.

# ─── Before restart: flush everything ───
store.blink("""
Active project: step 3500/100K, loss 4.85
Decision: deferred publish until 80% milestone
Pending: PR #42 review, cron job for monitoring
""")

# ─── After restart: wake up where you left off ───
context = store.recall()
# → staged entries (most recent first) + archived history

The blink() method stages operational context with a "blink": True metadata flag. The recall() method reconstructs it โ€” staged entries first, then recent archive documents.

# Blink with auto-rollup into permanent archive
store.blink(context, rollup=True)

# Recall with custom depth
context = store.recall(k=10, include_staged=True)

Not death and resurrection. Just a blink. See docs/blink.md for the full pattern.

API Reference

CombStore(path, *, search_backend=None)

Create or open a COMB store.

| Parameter | Type | Description |
|---|---|---|
| path | str \| Path | Directory for all COMB data. Created if missing. |
| search_backend | SearchBackend \| None | Custom search backend. Defaults to built-in BM25. |

Staging (Tier 2)

store.stage(text, *, metadata=None, date=None)

Append a conversation dump to today's staging file.

store.rollup(date=None) → CombDocument | None

Promote staged data to the chain archive. Concatenates all staged entries, computes hash chain + honeycomb links, indexes for search, clears staging.

Search

store.search(query, *, mode="bm25", k=5) → list[CombDocument]

Search the archive. Returns documents with similarity_score set.

| Mode | Description |
|---|---|
| "bm25" | Built-in BM25 full-text search (default) |
| "semantic" | Requires custom SearchBackend |
| "hybrid" | Requires custom SearchBackend |

Navigation

store.get(date) → CombDocument | None

Retrieve an archived document by date string (YYYY-MM-DD).

store.walk(start, *, direction="temporal", depth=100) → Iterator[CombDocument]

Walk the honeycomb graph from a starting date.

| Direction | Behavior |
|---|---|
| "temporal" | Follow prev/next chain forward |
| "semantic" | Breadth-first through semantic neighbors |
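The "semantic" direction is a breadth-first traversal of the neighbor graph. A standalone sketch over a plain adjacency dict (`semantic_walk` and its depth cap are illustrative, not COMB's internals):

```python
from collections import deque
from typing import Iterator


def semantic_walk(
    neighbors: dict[str, list[str]],
    start: str,
    depth: int = 100,
) -> Iterator[str]:
    # Breadth-first over semantic neighbors, yielding each date at most
    # once and stopping after `depth` documents have been visited.
    seen, queue, yielded = {start}, deque([start]), 0
    while queue and yielded < depth:
        date = queue.popleft()
        yield date
        yielded += 1
        for nxt in neighbors.get(date, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
```

The `seen` set matters: semantic links are bidirectional in practice, so without it a walk would bounce between two similar documents forever.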

Blink

store.blink(text, *, metadata=None, rollup=False) → str

Stage operational context and return recall preview. The pre-restart half.

store.recall(*, k=5, include_staged=True) → str

Reconstruct operational context. The post-restart half.

Integrity

store.verify_chain() → bool

Verify the entire SHA-256 hash chain. Returns True if intact.

store.stats() → dict

Returns document_count, total_bytes, chain_length, semantic_links, social_links, staged_dates, chain_valid.

CombDocument

| Field | Type | Description |
|---|---|---|
| date | str | Date string (YYYY-MM-DD) |
| content | str | Full archived text |
| hash | str | SHA-256 chain hash |
| prev_hash | str \| None | Previous document's hash |
| metadata | dict | Arbitrary metadata (byte_size, total_tokens, created_at, ...) |
| temporal | TemporalLinks | .prev, .next: date strings |
| semantic | SemanticLinks | .neighbors: list of SemanticNeighbor(target, score) |
| social | SocialLinks | .links, .strengthened, .cooled: list of SocialLink(target, direction, delta) |
| similarity_score | float \| None | Set during search (transient, not persisted) |

SearchBackend Protocol

Plug in any search engine:

from comb import SearchBackend

class MyVectorSearch:
    def index(self, doc_id: str, text: str) -> None:
        # Index a document (doc_id is typically a date string)
        ...

    def search(self, query: str, k: int = 5) -> list[tuple[str, float]]:
        # Return [(doc_id, score), ...] sorted by descending relevance
        ...

store = CombStore("./memory", search_backend=MyVectorSearch())
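To get a feel for what satisfying the protocol takes, here is a toy in-memory backend with naive term-count scoring. It is illustrative only (the built-in BM25 backend is the sensible default), and since the protocol is structural, no import from comb is needed to define it:

```python
class SubstringSearch:
    """Toy SearchBackend: scores documents by raw query-term counts."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def index(self, doc_id: str, text: str) -> None:
        # Store the lowercased text keyed by doc_id (typically a date string)
        self._docs[doc_id] = text.lower()

    def search(self, query: str, k: int = 5) -> list[tuple[str, float]]:
        # Score = total occurrences of all query terms; drop zero-score docs
        terms = query.lower().split()
        scored = [
            (doc_id, float(sum(text.count(t) for t in terms)))
            for doc_id, text in self._docs.items()
        ]
        hits = [(d, s) for d, s in scored if s > 0]
        hits.sort(key=lambda pair: pair[1], reverse=True)
        return hits[:k]
```

Any object with these two methods can be passed as search_backend; COMB resolves returned doc_ids back into full CombDocument objects.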

CLI

Requires pip install comb-db[cli] (adds click dependency).

# Stage a conversation
echo "Today's conversation..." | comb -s ./memory stage
comb -s ./memory stage -f transcript.txt
comb -s ./memory stage "inline text"

# Roll up into archive
comb -s ./memory rollup

# Search
comb -s ./memory search "encryption"

# Blink pattern
echo "operational context" | comb -s ./memory blink
comb -s ./memory blink -f context.md --rollup
comb -s ./memory recall

# Inspect
comb -s ./memory show 2026-02-17
comb -s ./memory stats
comb -s ./memory verify

The COMB_STORE environment variable can replace -s:

export COMB_STORE=./memory
comb stage "text"
comb search "query"

Architecture

                    ┌─────────┐
               ╱╲   │ Tier 1  │   Agent's context window
              ╱  ╲  │ Active  │   (not managed by COMB)
             ╱    ╲ └─────────┘
            ╱      ╲
    ┌──────╱────────╲──────┐
    │      Tier 2          │   Append-only JSONL per day
    │   Daily Staging      │   staging/YYYY-MM-DD.jsonl
    │                      │
    └──────────┬───────────┘
               │ rollup()
    ┌──────────▼───────────┐
    │      Tier 3          │   One JSON document per day
    │   Chain Archive      │   SHA-256 hash-chained
    │   + Honeycomb Links  │   Temporal + Semantic + Social
    └──────────────────────┘

Storage Layout

my-memory/
├── staging/
│   └── 2026-03-02.jsonl      # today's staged entries (append-only)
└── archive/
    ├── 2026-02-28.json       # hash-chained document with honeycomb links
    ├── 2026-03-01.json
    └── 2026-03-02.json

Everything is JSON. Human-readable. No binary formats. Copy the directory, copy the memory.
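Because the archive is plain JSON, it can be read back without COMB installed at all. A sketch (the per-day file layout follows the tree above; treat the parsed field names as whatever the CombDocument schema actually writes):

```python
import json
from pathlib import Path


def read_archive(store_path: str) -> list[dict]:
    # Load every archived day in chronological order; filenames are
    # YYYY-MM-DD.json, so lexicographic sort is chronological sort.
    docs = []
    for f in sorted(Path(store_path, "archive").glob("*.json")):
        with open(f, encoding="utf-8") as fh:
            docs.append(json.load(fh))
    return docs
```

This is the practical payoff of "copy the directory, copy the memory": any tool that speaks JSON can audit or migrate the archive.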

Source Layout

comb/
├── core.py          # CombStore — main interface (333 lines)
├── document.py      # CombDocument + link types (164 lines)
├── archive.py       # ChainArchive — hash-chained Tier 3 (171 lines)
├── staging.py       # DailyStaging — append-only Tier 2 (104 lines)
├── honeycomb.py     # HoneycombGraph — three-directional links (166 lines)
├── search.py        # BM25Search + SearchBackend protocol (137 lines)
├── cli.py           # Click CLI (155 lines)
└── _utils.py        # SHA-256, tokenizer, sentiment (68 lines)

~1,300 lines total. 45 tests.

What COMB Is and Isn't

Is:

  • A lossless archival system for AI conversation history
  • A tamper-evident hash chain with three-directional graph navigation
  • A zero-dependency library that runs anywhere Python runs
  • The memory persistence layer your agent is missing

Isn't:

  • Not a vector database (but you can plug one in)
  • Not a summarization tool (that's the point)
  • Not a real-time streaming system
  • Not a replacement for your context window; it's the archive behind it

Lineage

COMB descends from HYBRIDbee, a serverless document database. Same philosophy: schema-on-read, single-directory storage, zero configuration.

Links

📦 PyPI pypi.org/project/comb-db
🦊 Source gitlab.com/the-lab9951109/comb
🤗 HuggingFace huggingface.co/ava-shakil/comb
📝 Blog artifactvirtual.substack.com
📄 Docs docs/

License

MIT. Do whatever you want.


Built by Ava Shakil at Artifact Virtual

commit. 🔮
