Spaces:
Paused
Paused
File size: 9,241 Bytes
caea1dc | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 | ---
summary: "Research notes: offline memory system for Clawd workspaces (Markdown source-of-truth + derived index)"
read_when:
- Designing workspace memory (~/.openclaw/workspace) beyond daily Markdown logs
- Deciding: standalone CLI vs deep OpenClaw integration
- Adding offline recall + reflection (retain/recall/reflect)
title: "Workspace Memory Research"
---
# Workspace Memory v2 (offline): research notes
Target: Clawd-style workspace (`agents.defaults.workspace`, default `~/.openclaw/workspace`) where “memory” is stored as one Markdown file per day (`memory/YYYY-MM-DD.md`) plus a small set of stable files (e.g. `memory.md`, `SOUL.md`).
This doc proposes an **offline-first** memory architecture that keeps Markdown as the canonical, reviewable source of truth, but adds **structured recall** (search, entity summaries, confidence updates) via a derived index.
## Why change?
The current setup (one file per day) is excellent for:
- “append-only” journaling
- human editing
- git-backed durability + auditability
- low-friction capture (“just write it down”)
It’s weak for:
- high-recall retrieval (“what did we decide about X?”, “last time we tried Y?”)
- entity-centric answers (“tell me about Alice / The Castle / warelay”) without rereading many files
- opinion/preference stability (and evidence when it changes)
- time constraints (“what was true during Nov 2025?”) and conflict resolution
## Design goals
- **Offline**: works without network; can run on laptop/Castle; no cloud dependency.
- **Explainable**: retrieved items should be attributable (file + location) and separable from inference.
- **Low ceremony**: daily logging stays Markdown, no heavy schema work.
- **Incremental**: v1 is useful with FTS only; semantic/vector and graphs are optional upgrades.
- **Agent-friendly**: makes “recall within token budgets” easy (return small bundles of facts).
## North star model (Hindsight × Letta)
Two pieces to blend:
1. **Letta/MemGPT-style control loop**
- keep a small “core” always in context (persona + key user facts)
- everything else is out-of-context and retrieved via tools
- memory writes are explicit tool calls (append/replace/insert), persisted, then re-injected next turn
2. **Hindsight-style memory substrate**
- separate what’s observed vs what’s believed vs what’s summarized
- support retain/recall/reflect
- confidence-bearing opinions that can evolve with evidence
- entity-aware retrieval + temporal queries (even without full knowledge graphs)
## Proposed architecture (Markdown source-of-truth + derived index)
### Canonical store (git-friendly)
Keep `~/.openclaw/workspace` as canonical human-readable memory.
Suggested workspace layout:
```
~/.openclaw/workspace/
memory.md # small: durable facts + preferences (core-ish)
memory/
YYYY-MM-DD.md # daily log (append; narrative)
bank/ # “typed” memory pages (stable, reviewable)
world.md # objective facts about the world
experience.md # what the agent did (first-person)
opinions.md # subjective prefs/judgments + confidence + evidence pointers
entities/
Peter.md
The-Castle.md
warelay.md
...
```
Notes:
- **Daily log stays daily log**. No need to turn it into JSON.
- The `bank/` files are **curated**, produced by reflection jobs, and can still be edited by hand.
- `memory.md` remains “small + core-ish”: the things you want Clawd to see every session.
### Derived store (machine recall)
Add a derived index under the workspace (not necessarily git tracked):
```
~/.openclaw/workspace/.memory/index.sqlite
```
Back it with:
- SQLite schema for facts + entity links + opinion metadata
- SQLite **FTS5** for lexical recall (fast, tiny, offline)
- optional embeddings table for semantic recall (still offline)
The index is always **rebuildable from Markdown**.
## Retain / Recall / Reflect (operational loop)
### Retain: normalize daily logs into “facts”
Hindsight’s key insight that matters here: store **narrative, self-contained facts**, not tiny snippets.
Practical rule for `memory/YYYY-MM-DD.md`:
- at end of day (or during), add a `## Retain` section with 2–5 bullets that are:
- narrative (cross-turn context preserved)
- self-contained (standalone makes sense later)
- tagged with type + entity mentions
Example:
```
## Retain
- W @Peter: Currently in Marrakech (Nov 27–Dec 1, 2025) for Andy’s birthday.
- B @warelay: I fixed the Baileys WS crash by wrapping connection.update handlers in try/catch (see memory/2025-11-27.md).
- O(c=0.95) @Peter: Prefers concise replies (<1500 chars) on WhatsApp; long content goes into files.
```
Minimal parsing:
- Type prefix: `W` (world), `B` (experience/biographical), `O` (opinion), `S` (observation/summary; usually generated)
- Entities: `@Peter`, `@warelay`, etc (slugs map to `bank/entities/*.md`)
- Opinion confidence: `O(c=0.0..1.0)` optional
If you don’t want authors to think about it: the reflect job can infer these bullets from the rest of the log, but having an explicit `## Retain` section is the easiest “quality lever”.
### Recall: queries over the derived index
Recall should support:
- **lexical**: “find exact terms / names / commands” (FTS5)
- **entity**: “tell me about X” (entity pages + entity-linked facts)
- **temporal**: “what happened around Nov 27” / “since last week”
- **opinion**: “what does Peter prefer?” (with confidence + evidence)
Return format should be agent-friendly and cite sources:
- `kind` (`world|experience|opinion|observation`)
- `timestamp` (source day, or extracted time range if present)
- `entities` (`["Peter","warelay"]`)
- `content` (the narrative fact)
- `source` (`memory/2025-11-27.md#L12` etc)
### Reflect: produce stable pages + update beliefs
Reflection is a scheduled job (daily or heartbeat `ultrathink`) that:
- updates `bank/entities/*.md` from recent facts (entity summaries)
- updates `bank/opinions.md` confidence based on reinforcement/contradiction
- optionally proposes edits to `memory.md` (“core-ish” durable facts)
Opinion evolution (simple, explainable):
- each opinion has:
- statement
- confidence `c ∈ [0,1]`
- last_updated
- evidence links (supporting + contradicting fact IDs)
- when new facts arrive:
- find candidate opinions by entity overlap + similarity (FTS first, embeddings later)
- update confidence by small deltas; big jumps require strong contradiction + repeated evidence
## CLI integration: standalone vs deep integration
Recommendation: **deep integration in OpenClaw**, but keep a separable core library.
### Why integrate into OpenClaw?
- OpenClaw already knows:
- the workspace path (`agents.defaults.workspace`)
- the session model + heartbeats
- logging + troubleshooting patterns
- You want the agent itself to call the tools:
- `openclaw memory recall "…" --k 25 --since 30d`
- `openclaw memory reflect --since 7d`
### Why still split a library?
- keep memory logic testable without gateway/runtime
- reuse from other contexts (local scripts, future desktop app, etc.)
Shape:
The memory tooling is intended to be a small CLI + library layer, but this is exploratory only.
## “S-Collide” / SuCo: when to use it (research)
If “S-Collide” refers to **SuCo (Subspace Collision)**: it’s an ANN retrieval approach that targets strong recall/latency tradeoffs by using learned/structured collisions in subspaces (paper: arXiv 2411.14754, 2024).
Pragmatic take for `~/.openclaw/workspace`:
- **don’t start** with SuCo.
- start with SQLite FTS + (optional) simple embeddings; you’ll get most UX wins immediately.
- consider SuCo/HNSW/ScaNN-class solutions only once:
- corpus is big (tens/hundreds of thousands of chunks)
- brute-force embedding search becomes too slow
- recall quality is meaningfully bottlenecked by lexical search
Offline-friendly alternatives (in increasing complexity):
- SQLite FTS5 + metadata filters (zero ML)
- Embeddings + brute force (works surprisingly far if chunk count is low)
- HNSW index (common, robust; needs a library binding)
- SuCo (research-grade; attractive if there’s a solid implementation you can embed)
Open question:
- what’s the **best** offline embedding model for “personal assistant memory” on your machines (laptop + desktop)?
- if you already have Ollama: embed with a local model; otherwise ship a small embedding model in the toolchain.
## Smallest useful pilot
If you want a minimal, still-useful version:
- Add `bank/` entity pages and a `## Retain` section in daily logs.
- Use SQLite FTS for recall with citations (path + line numbers).
- Add embeddings only if recall quality or scale demands it.
## References
- Letta / MemGPT concepts: “core memory blocks” + “archival memory” + tool-driven self-editing memory.
- Hindsight Technical Report: “retain / recall / reflect”, four-network memory, narrative fact extraction, opinion confidence evolution.
- SuCo: arXiv 2411.14754 (2024): “Subspace Collision” approximate nearest neighbor retrieval.
|