| --- |
| title: "Memory" |
| summary: "How OpenClaw memory works (workspace files + automatic memory flush)" |
| read_when: |
| - You want the memory file layout and workflow |
| - You want to tune the automatic pre-compaction memory flush |
| --- |
| |
| # Memory |
|
|
| OpenClaw memory is **plain Markdown in the agent workspace**. The files are the |
| source of truth; the model only "remembers" what gets written to disk. |
|
|
| Memory search tools are provided by the active memory plugin (default: |
| `memory-core`). Disable memory plugins with `plugins.slots.memory = "none"`. |
|
|
| ## Memory files (Markdown) |
|
|
| The default workspace layout uses two memory layers: |
|
|
| - `memory/YYYY-MM-DD.md` |
| - Daily log (append-only). |
| - Read today + yesterday at session start. |
| - `MEMORY.md` (optional) |
| - Curated long-term memory. |
| - **Only load in the main, private session** (never in group contexts). |
|
|
| These files live under the workspace (`agents.defaults.workspace`, default |
| `~/.openclaw/workspace`). See [Agent workspace](/concepts/agent-workspace) for the full layout. |
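|
| For a default setup, the resulting layout looks like this (dates are |
| illustrative): |
|
| ``` |
| ~/.openclaw/workspace/ |
| ├── MEMORY.md            # curated long-term memory (optional) |
| └── memory/ |
|     ├── 2026-02-09.md    # yesterday's daily log |
|     └── 2026-02-10.md    # today's daily log (append-only) |
| ``` |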
|
|
| ## Memory tools |
|
|
| OpenClaw exposes two agent-facing tools for these Markdown files: |
|
|
| - `memory_search` — semantic recall over indexed snippets. |
| - `memory_get` — targeted read of a specific Markdown file/line range. |
|
|
| `memory_get` now **degrades gracefully when a file doesn't exist** (for example, |
| today's daily log before the first write). Both the builtin manager and the QMD |
| backend return `{ text: "", path }` instead of throwing `ENOENT`, so agents can |
| handle "nothing recorded yet" and continue their workflow without wrapping the |
| tool call in try/catch logic. |
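|
| A sketch of that contract in TypeScript (the function name and argument shape |
| here are illustrative, not the exact tool wiring): |
|
| ```ts |
| // Illustrative shapes only; the real tools are model-facing. |
| type MemoryGetResult = { text: string; path: string }; |
|
| declare function memoryGet(args: { |
|   path: string; |
|   from?: number;  // optional starting line |
|   lines?: number; // optional line count |
| }): Promise<MemoryGetResult>; |
|
| // Reading today's log before the first write of the day: |
| const today = await memoryGet({ path: "memory/2026-02-10.md" }); |
| if (today.text === "") { |
|   // File may not exist yet: treated as "nothing recorded yet", no ENOENT. |
| } |
| ``` |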
|
|
| ## When to write memory |
|
|
| - Decisions, preferences, and durable facts go to `MEMORY.md`. |
| - Day-to-day notes and running context go to `memory/YYYY-MM-DD.md`. |
| - If someone says "remember this," write it down (do not keep it in RAM). |
| - This area is still evolving. A brief reminder to store memories is usually enough; the model knows what to do. |
| - If you want something to stick, **ask the bot to write it** into memory. |
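|
| For example, a daily log entry might look like this (contents are |
| illustrative): |
|
| ``` |
| ## 2026-02-10 |
|
| - User prefers concise answers, no emoji (durable: promote to MEMORY.md). |
| - Debugging gateway timeouts; DNS suspected (running context). |
| - Remembered per request: team demo is on Friday. |
| ``` |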
|
|
| ## Automatic memory flush (pre-compaction ping) |
|
|
| When a session is **close to auto-compaction**, OpenClaw triggers a **silent, |
| agentic turn** that reminds the model to write durable memory **before** the |
| context is compacted. The default prompts explicitly say the model _may reply_, |
| but usually `NO_REPLY` is the correct response so the user never sees this turn. |
|
|
| This is controlled by `agents.defaults.compaction.memoryFlush`: |
|
|
| ```json5 |
| { |
| agents: { |
| defaults: { |
| compaction: { |
| reserveTokensFloor: 20000, |
| memoryFlush: { |
| enabled: true, |
| softThresholdTokens: 4000, |
| systemPrompt: "Session nearing compaction. Store durable memories now.", |
| prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store.", |
| }, |
| }, |
| }, |
| }, |
| } |
| ``` |
|
|
| Details: |
|
|
| - **Soft threshold**: flush triggers when the session token estimate crosses |
| `contextWindow - reserveTokensFloor - softThresholdTokens` (worked example |
| below). |
| - **Silent** by default: prompts include `NO_REPLY` so nothing is delivered. |
| - **Two prompts**: the reminder is delivered as a user prompt plus a system-prompt append. |
| - **One flush per compaction cycle** (tracked in `sessions.json`). |
| - **Workspace must be writable**: if the session runs sandboxed with |
| `workspaceAccess: "ro"` or `"none"`, the flush is skipped. |
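|
| For example, with the config above and an assumed 200k-token context window: |
|
| ``` |
| flushAt = contextWindow - reserveTokensFloor - softThresholdTokens |
|         = 200000 - 20000 - 4000 |
|         = 176000 tokens |
| ``` |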
|
|
| For the full compaction lifecycle, see |
| [Session management + compaction](/reference/session-management-compaction). |
|
|
| ## Vector memory search |
|
|
| OpenClaw can build a small vector index over `MEMORY.md` and `memory/*.md` so |
| semantic queries can find related notes even when wording differs. |
|
|
| Defaults: |
|
|
| - Enabled by default. |
| - Watches memory files for changes (debounced). |
| - Configure memory search under `agents.defaults.memorySearch` (not top-level |
| `memorySearch`). |
| - Uses remote embeddings by default. If `memorySearch.provider` is not set, OpenClaw auto-selects: |
| 1. `local` if a `memorySearch.local.modelPath` is configured and the file exists. |
| 2. `openai` if an OpenAI key can be resolved. |
| 3. `gemini` if a Gemini key can be resolved. |
| 4. `voyage` if a Voyage key can be resolved. |
| 5. `mistral` if a Mistral key can be resolved. |
| 6. Otherwise memory search stays disabled until configured. |
| - Local mode uses node-llama-cpp and may require `pnpm approve-builds`. |
| - Uses sqlite-vec (when available) to accelerate vector search inside SQLite. |
| - `memorySearch.provider = "ollama"` is also supported for local/self-hosted |
| Ollama embeddings (`/api/embeddings`), but it is not auto-selected. |
|
|
| Remote embeddings **require** an API key for the embedding provider. OpenClaw |
| resolves keys from auth profiles, `models.providers.*.apiKey`, or environment |
| variables. Codex OAuth only covers chat/completions and does **not** satisfy |
| embeddings for memory search. Per provider: |
|
| - Gemini: `GEMINI_API_KEY` or `models.providers.google.apiKey`. |
| - Voyage: `VOYAGE_API_KEY` or `models.providers.voyage.apiKey`. |
| - Mistral: `MISTRAL_API_KEY` or `models.providers.mistral.apiKey`. |
| - Ollama: typically no real key is needed; a placeholder like |
| `OLLAMA_API_KEY=ollama-local` is enough when required by local policy. |
| - Custom OpenAI-compatible endpoints: set `memorySearch.remote.apiKey` (and |
| optional `memorySearch.remote.headers`). |
|
|
| ### QMD backend (experimental) |
|
|
| Set `memory.backend = "qmd"` to swap the built-in SQLite indexer for |
| [QMD](https://github.com/tobi/qmd): a local-first search sidecar that combines |
| BM25 + vectors + reranking. Markdown stays the source of truth; OpenClaw shells |
| out to QMD for retrieval. Key points: |
|
|
| **Prereqs** |
|
|
| - Disabled by default. Opt in per-config (`memory.backend = "qmd"`). |
| - Install the QMD CLI separately (`bun install -g https://github.com/tobi/qmd` or grab |
| a release) and make sure the `qmd` binary is on the gateway’s `PATH`. |
| - QMD needs an SQLite build that allows extensions (`brew install sqlite` on |
| macOS). |
| - QMD runs fully locally via Bun + `node-llama-cpp` and auto-downloads GGUF |
| models from HuggingFace on first use (no separate Ollama daemon required). |
| - The gateway runs QMD in a self-contained XDG home under |
| `~/.openclaw/agents/<agentId>/qmd/` by setting `XDG_CONFIG_HOME` and |
| `XDG_CACHE_HOME`. |
| - OS support: macOS and Linux work out of the box once Bun + SQLite are |
| installed. Windows is best supported via WSL2. |
|
|
| **How the sidecar runs** |
|
|
| - The gateway writes a self-contained QMD home under |
| `~/.openclaw/agents/<agentId>/qmd/` (config + cache + sqlite DB). |
| - Collections are created via `qmd collection add` from `memory.qmd.paths` |
| (plus default workspace memory files), then `qmd update` + `qmd embed` run |
| on boot and on a configurable interval (`memory.qmd.update.interval`, |
| default `5m`). |
| - The gateway now initializes the QMD manager on startup, so periodic update |
| timers are armed even before the first `memory_search` call. |
| - Boot refresh now runs in the background by default so chat startup is not |
| blocked; set `memory.qmd.update.waitForBootSync = true` to keep the previous |
| blocking behavior. |
| - Searches run via `memory.qmd.searchMode` (default `qmd search --json`; also |
| supports `vsearch` and `query`). If the selected mode rejects flags on your |
| QMD build, OpenClaw retries with `qmd query`. If QMD fails or the binary is |
| missing, OpenClaw automatically falls back to the builtin SQLite manager so |
| memory tools keep working. |
| - OpenClaw does not expose QMD embed batch-size tuning today; batch behavior is |
| controlled by QMD itself. |
| - **First search may be slow**: QMD may download local GGUF models (reranker/query |
| expansion) on the first `qmd query` run. |
| - OpenClaw sets `XDG_CONFIG_HOME`/`XDG_CACHE_HOME` automatically when it runs QMD. |
| - If you want to pre-download models manually (and warm the same index OpenClaw |
| uses), run a one-off query with the agent’s XDG dirs. |
| |
| OpenClaw’s QMD state lives under your **state dir** (defaults to `~/.openclaw`). |
| You can point `qmd` at the exact same index by exporting the same XDG vars |
| OpenClaw uses: |
| |
| ```bash |
| # Pick the same state dir OpenClaw uses |
| STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}" |
| |
| export XDG_CONFIG_HOME="$STATE_DIR/agents/main/qmd/xdg-config" |
| export XDG_CACHE_HOME="$STATE_DIR/agents/main/qmd/xdg-cache" |
| |
| # (Optional) force an index refresh + embeddings |
| qmd update |
| qmd embed |
| |
| # Warm up / trigger first-time model downloads |
| qmd query "test" -c memory-root --json >/dev/null 2>&1 |
| ``` |
| |
| **Config surface (`memory.qmd.*`)** |
| |
| - `command` (default `qmd`): override the executable path. |
| - `searchMode` (default `search`): pick which QMD command backs |
| `memory_search` (`search`, `vsearch`, `query`). |
| - `includeDefaultMemory` (default `true`): auto-index `MEMORY.md` + `memory/**/*.md`. |
| - `paths[]`: add extra directories/files (`path`, optional `pattern`, optional |
| stable `name`). |
| - `sessions`: opt into session JSONL indexing (`enabled`, `retentionDays`, |
| `exportDir`). |
| - `update`: controls refresh cadence and maintenance execution (`interval`, |
| `debounceMs`, `onBoot`, `waitForBootSync`, `embedInterval`, |
| `commandTimeoutMs`, `updateTimeoutMs`, `embedTimeoutMs`). |
| - `limits`: clamp recall payload (`maxResults`, `maxSnippetChars`, |
| `maxInjectedChars`, `timeoutMs`). |
| - `scope`: same schema as [`session.sendPolicy`](/gateway/configuration#session). |
| Default is DM-only (`deny` all, `allow` direct chats); loosen it to surface QMD |
| hits in groups/channels. |
| - `match.keyPrefix` matches the **normalized** session key (lowercased, with any |
| leading `agent:<id>:` stripped). Example: `discord:channel:`. |
| - `match.rawKeyPrefix` matches the **raw** session key (lowercased), including |
| `agent:<id>:`. Example: `agent:main:discord:`. |
| - Legacy: `match.keyPrefix: "agent:..."` is still treated as a raw-key prefix, |
| but prefer `rawKeyPrefix` for clarity. |
| - When `scope` denies a search, OpenClaw logs a warning with the derived |
| `channel`/`chatType` so empty results are easier to debug. |
| - Snippets sourced outside the workspace show up as |
| `qmd/<collection>/<relative-path>` in `memory_search` results; `memory_get` |
| understands that prefix and reads from the configured QMD collection root. |
| - When `memory.qmd.sessions.enabled = true`, OpenClaw exports sanitized session |
| transcripts (User/Assistant turns) into a dedicated QMD collection under |
| `~/.openclaw/agents/<id>/qmd/sessions/`, so `memory_search` can recall recent |
| conversations without touching the builtin SQLite index. |
| - `memory_search` snippets now include a `Source: <path#line>` footer when |
| `memory.citations` is `auto`/`on`; set `memory.citations = "off"` to keep |
| the path metadata internal (the agent still receives the path for |
| `memory_get`, but the snippet text omits the footer and the system prompt |
| warns the agent not to cite it). |
| |
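| With citations enabled, a returned snippet might end like this (path and line |
| are illustrative, following the `Source: <path#line>` template): |
|
| ``` |
| Configured Omada router, set VLAN 10 for IoT devices. |
| Source: memory/2026-02-10.md#12 |
| ``` |
|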
| **Example** |
|
|
| ```json5 |
| memory: { |
| backend: "qmd", |
| citations: "auto", |
| qmd: { |
| includeDefaultMemory: true, |
| update: { interval: "5m", debounceMs: 15000 }, |
| limits: { maxResults: 6, timeoutMs: 4000 }, |
| scope: { |
| default: "deny", |
| rules: [ |
| { action: "allow", match: { chatType: "direct" } }, |
| // Normalized session-key prefix (strips `agent:<id>:`). |
| { action: "deny", match: { keyPrefix: "discord:channel:" } }, |
| // Raw session-key prefix (includes `agent:<id>:`). |
| { action: "deny", match: { rawKeyPrefix: "agent:main:discord:" } }, |
| ] |
| }, |
| paths: [ |
| { name: "docs", path: "~/notes", pattern: "**/*.md" } |
| ] |
| } |
| } |
| ``` |
|
|
| **Citations & fallback** |
|
|
| - `memory.citations` applies regardless of backend (`auto`/`on`/`off`). |
| - When `qmd` runs, we tag `status().backend = "qmd"` so diagnostics show which |
| engine served the results. If the QMD subprocess exits or its JSON output can't |
| be parsed, the search manager logs a warning and falls back to the builtin |
| provider (existing Markdown embeddings) until QMD recovers. |
|
|
| ### Additional memory paths |
|
|
| If you want to index Markdown files outside the default workspace layout, add |
| explicit paths: |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"] |
| } |
| } |
| } |
| ``` |
|
|
| Notes: |
|
|
| - Paths can be absolute or workspace-relative. |
| - Directories are scanned recursively for `.md` files. |
| - By default, only Markdown files are indexed. |
| - If `memorySearch.multimodal.enabled = true`, OpenClaw also indexes supported image/audio files under `extraPaths` only. Default memory roots (`MEMORY.md`, `memory.md`, `memory/**/*.md`) stay Markdown-only. |
| - Symlinks are ignored (files or directories). |
|
|
| ### Multimodal memory files (Gemini image + audio) |
|
|
| OpenClaw can index image and audio files from `memorySearch.extraPaths` when using Gemini embedding 2: |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| provider: "gemini", |
| model: "gemini-embedding-2-preview", |
| extraPaths: ["assets/reference", "voice-notes"], |
| multimodal: { |
| enabled: true, |
| modalities: ["image", "audio"], // or ["all"] |
| maxFileBytes: 10000000 |
| }, |
| remote: { |
| apiKey: "YOUR_GEMINI_API_KEY" |
| } |
| } |
| } |
| } |
| ``` |
|
|
| Notes: |
|
|
| - Multimodal memory is currently supported only for `gemini-embedding-2-preview`. |
| - Multimodal indexing applies only to files discovered through `memorySearch.extraPaths`. |
| - Supported modalities in this phase: image and audio. |
| - `memorySearch.fallback` must stay `"none"` while multimodal memory is enabled. |
| - Matching image/audio file bytes are uploaded to the configured Gemini embedding endpoint during indexing. |
| - Supported image extensions: `.jpg`, `.jpeg`, `.png`, `.webp`, `.gif`, `.heic`, `.heif`. |
| - Supported audio extensions: `.mp3`, `.wav`, `.ogg`, `.opus`, `.m4a`, `.aac`, `.flac`. |
| - Search queries remain text, but Gemini can compare those text queries against indexed image/audio embeddings. |
| - `memory_get` still reads Markdown only; binary files are searchable but not returned as raw file contents. |
|
|
| ### Gemini embeddings (native) |
|
|
| Set the provider to `gemini` to use the Gemini embeddings API directly: |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| provider: "gemini", |
| model: "gemini-embedding-001", |
| remote: { |
| apiKey: "YOUR_GEMINI_API_KEY" |
| } |
| } |
| } |
| } |
| ``` |
|
|
| Notes: |
|
|
| - `remote.baseUrl` is optional (defaults to the Gemini API base URL). |
| - `remote.headers` lets you add extra headers if needed. |
| - Default model: `gemini-embedding-001`. |
| - `gemini-embedding-2-preview` is also supported: 8192 token limit and configurable dimensions (768 / 1536 / 3072, default 3072). |
|
|
| #### Gemini Embedding 2 (preview) |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| provider: "gemini", |
| model: "gemini-embedding-2-preview", |
| outputDimensionality: 3072, // optional: 768, 1536, or 3072 (default) |
| remote: { |
| apiKey: "YOUR_GEMINI_API_KEY" |
| } |
| } |
| } |
| } |
| ``` |
|
|
| > **⚠️ Re-index required:** Switching from `gemini-embedding-001` (768 dimensions) |
| > to `gemini-embedding-2-preview` (3072 dimensions) changes the vector size. The same is true if you |
| > change `outputDimensionality` between 768, 1536, and 3072. |
| > OpenClaw will automatically reindex when it detects a model or dimension change. |
|
|
| If you want to use a **custom OpenAI-compatible endpoint** (OpenRouter, vLLM, or a proxy), |
| you can use the `remote` configuration with the OpenAI provider: |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| provider: "openai", |
| model: "text-embedding-3-small", |
| remote: { |
| baseUrl: "https://api.example.com/v1/", |
| apiKey: "YOUR_OPENAI_COMPAT_API_KEY", |
| headers: { "X-Custom-Header": "value" } |
| } |
| } |
| } |
| } |
| ``` |
|
|
| If you don't want to set an API key, use `memorySearch.provider = "local"` or set |
| `memorySearch.fallback = "none"`. |
|
|
| Fallbacks: |
|
|
| - `memorySearch.fallback` can be `openai`, `gemini`, `voyage`, `mistral`, `ollama`, `local`, or `none`. |
| - The fallback provider is only used when the primary embedding provider fails. |
|
|
| Batch indexing (OpenAI + Gemini + Voyage): |
|
|
| - Disabled by default. Set `agents.defaults.memorySearch.remote.batch.enabled = true` to enable for large-corpus indexing (OpenAI, Gemini, and Voyage). |
| - Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed. |
| - Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2). |
| - Batch mode applies when `memorySearch.provider` is `"openai"`, `"gemini"`, or `"voyage"` and uses the corresponding API key. |
| - Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability. |
|
|
| Why OpenAI batch is fast + cheap: |
|
|
| - For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously. |
| - OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously. |
| - See the OpenAI Batch API docs and pricing for details: |
| - [https://platform.openai.com/docs/api-reference/batch](https://platform.openai.com/docs/api-reference/batch) |
| - [https://platform.openai.com/pricing](https://platform.openai.com/pricing) |
|
|
| Config example: |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| provider: "openai", |
| model: "text-embedding-3-small", |
| fallback: "openai", |
| remote: { |
| batch: { enabled: true, concurrency: 2 } |
| }, |
| sync: { watch: true } |
| } |
| } |
| } |
| ``` |
|
|
| Tools: |
|
|
| - `memory_search` — returns snippets with file + line ranges. |
| - `memory_get` — read memory file content by path. |
|
|
| Local mode: |
|
|
| - Set `agents.defaults.memorySearch.provider = "local"`. |
| - Provide `agents.defaults.memorySearch.local.modelPath` (GGUF or `hf:` URI). |
| - Optional: set `agents.defaults.memorySearch.fallback = "none"` to avoid remote fallback. |
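|
| Putting that together (a sketch; the model URI is the default listed in the |
| local auto-download section below): |
|
| ```json5 |
| agents: { |
|   defaults: { |
|     memorySearch: { |
|       provider: "local", |
|       local: { |
|         modelPath: "hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf" |
|       }, |
|       fallback: "none" // optional: avoid remote fallback |
|     } |
|   } |
| } |
| ``` |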
|
|
| ### How the memory tools work |
|
|
| - `memory_search` semantically searches Markdown chunks (~400-token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped ~700 chars), file path, line range, score, provider/model, and whether we fell back from local → remote embeddings. No full file payload is returned. |
| - `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md` / `memory/` are rejected. |
| - Both tools are enabled only when `memorySearch.enabled` resolves true for the agent. |
|
|
| ### What gets indexed (and when) |
|
|
| - File type: Markdown only (`MEMORY.md`, `memory/**/*.md`). |
| - Index storage: per-agent SQLite at `~/.openclaw/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports `{agentId}` token; see the example after this list). |
| - Freshness: watcher on `MEMORY.md` + `memory/` marks the index dirty (debounce 1.5s). Sync is scheduled on session start, on search, or on an interval and runs asynchronously. Session transcripts use delta thresholds to trigger background sync. |
| - Reindex triggers: the index stores the embedding **provider/model + endpoint fingerprint + chunking params**. If any of those change, OpenClaw automatically resets and reindexes the entire store. |
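|
| For example, to relocate the index store (a sketch; the directory is |
| illustrative, the `{agentId}` token is documented above): |
|
| ```json5 |
| agents: { |
|   defaults: { |
|     memorySearch: { |
|       store: { path: "/data/openclaw/memory/{agentId}.sqlite" } |
|     } |
|   } |
| } |
| ``` |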
|
|
| ### Hybrid search (BM25 + vector) |
|
|
| When enabled, OpenClaw combines: |
|
|
| - **Vector similarity** (semantic match, wording can differ) |
| - **BM25 keyword relevance** (exact tokens like IDs, env vars, code symbols) |
|
|
| If full-text search is unavailable on your platform, OpenClaw falls back to vector-only search. |
|
|
| #### Why hybrid? |
|
|
| Vector search is great at “this means the same thing”: |
|
|
| - “Mac Studio gateway host” vs “the machine running the gateway” |
| - “debounce file updates” vs “avoid indexing on every write” |
|
|
| But it can be weak at exact, high-signal tokens: |
|
|
| - IDs (`a828e60`, `b3b9895a…`) |
| - code symbols (`memorySearch.query.hybrid`) |
| - error strings ("sqlite-vec unavailable") |
|
|
| BM25 (full-text) is the opposite: strong at exact tokens, weaker at paraphrases. |
| Hybrid search is the pragmatic middle ground: **use both retrieval signals** so you get |
| good results for both "natural language" queries and "needle in a haystack" queries. |
|
|
| #### How we merge results (the current design) |
|
|
| Implementation sketch: |
|
|
| 1. Retrieve a candidate pool from both sides: |
|
|
| - **Vector**: top `maxResults * candidateMultiplier` by cosine similarity. |
| - **BM25**: top `maxResults * candidateMultiplier` by FTS5 BM25 rank (lower is better). |
|
|
| 2. Convert BM25 rank into a 0..1-ish score: |
|
|
| - `textScore = 1 / (1 + max(0, bm25Rank))` |
|
|
| 3. Union candidates by chunk id and compute a weighted score: |
|
|
| - `finalScore = vectorWeight * vectorScore + textWeight * textScore` |
|
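| As a sketch, the merge can be expressed in a few lines of TypeScript |
| (illustrative types and names, not the actual OpenClaw source): |
|
| ```ts |
| // Merge vector and BM25 candidate pools into one weighted ranking. |
| type Candidate = { chunkId: string; vectorScore?: number; bm25Rank?: number }; |
|
| function mergeHybrid( |
|   vectorHits: Candidate[],  // top pool by cosine similarity (higher = better) |
|   keywordHits: Candidate[], // top pool by FTS5 BM25 rank (lower = better) |
|   vectorWeight = 0.7, |
|   textWeight = 0.3, |
| ): Array<{ chunkId: string; finalScore: number }> { |
|   const scores = new Map<string, number>(); |
|   for (const c of vectorHits) { |
|     scores.set(c.chunkId, vectorWeight * (c.vectorScore ?? 0)); |
|   } |
|   for (const c of keywordHits) { |
|     // Convert BM25 rank into a 0..1-ish score: rank 0 → 1.0, rank 9 → 0.1. |
|     const textScore = 1 / (1 + Math.max(0, c.bm25Rank ?? 0)); |
|     scores.set(c.chunkId, (scores.get(c.chunkId) ?? 0) + textWeight * textScore); |
|   } |
|   return [...scores.entries()] |
|     .map(([chunkId, finalScore]) => ({ chunkId, finalScore })) |
|     .sort((a, b) => b.finalScore - a.finalScore); |
| } |
| ``` |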
|
| Notes: |
|
|
| - `vectorWeight` + `textWeight` is normalized to 1.0 in config resolution, so weights behave as percentages. |
| - If embeddings are unavailable (or the provider returns a zero-vector), we still run BM25 and return keyword matches. |
| - If FTS5 can't be created, we keep vector-only search (no hard failure). |
|
|
| This isn't "IR-theory perfect", but it's simple, fast, and tends to improve recall/precision on real notes. |
| If we want to get fancier later, common next steps are Reciprocal Rank Fusion (RRF) or score normalization |
| (min/max or z-score) before mixing. |
|
|
| #### Post-processing pipeline |
|
|
| After merging vector and keyword scores, two optional post-processing stages |
| refine the result list before it reaches the agent: |
|
|
| ``` |
| Vector + Keyword → Weighted Merge → Temporal Decay → Sort → MMR → Top-K Results |
| ``` |
|
|
| Both stages are **off by default** and can be enabled independently. |
|
|
| #### MMR re-ranking (diversity) |
|
|
| When hybrid search returns results, multiple chunks may contain similar or overlapping content. |
| For example, searching for "home network setup" might return five nearly identical snippets |
| from different daily notes that all mention the same router configuration. |
|
|
| **MMR (Maximal Marginal Relevance)** re-ranks the results to balance relevance with diversity, |
| ensuring the top results cover different aspects of the query instead of repeating the same information. |
|
|
| How it works: |
|
|
| 1. Results are scored by their original relevance (vector + BM25 weighted score). |
| 2. MMR iteratively selects results that maximize: `λ × relevance − (1−λ) × max_similarity_to_selected`. |
| 3. Similarity between results is measured using Jaccard text similarity on tokenized content. |
|
|
| The `lambda` parameter controls the trade-off: |
|
|
| - `lambda = 1.0` → pure relevance (no diversity penalty) |
| - `lambda = 0.0` → maximum diversity (ignores relevance) |
| - Default: `0.7` (balanced, slight relevance bias) |
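|
| A minimal sketch of the selection loop (illustrative TypeScript; the shipped |
| tokenizer and scoring details may differ): |
|
| ```ts |
| // MMR: trade off relevance against similarity to already-picked results. |
| type Hit = { id: string; score: number; text: string }; |
|
| const tokens = (s: string): Set<string> => |
|   new Set(s.toLowerCase().split(/\W+/).filter(Boolean)); |
|
| function jaccard(a: Set<string>, b: Set<string>): number { |
|   let inter = 0; |
|   for (const t of a) if (b.has(t)) inter++; |
|   const union = a.size + b.size - inter; |
|   return union === 0 ? 0 : inter / union; |
| } |
|
| function mmrRerank(hits: Hit[], k: number, lambda = 0.7): Hit[] { |
|   const pool = [...hits]; |
|   const selected: Hit[] = []; |
|   while (selected.length < k && pool.length > 0) { |
|     let best = 0; |
|     let bestVal = -Infinity; |
|     for (let i = 0; i < pool.length; i++) { |
|       // Highest Jaccard overlap with anything already selected. |
|       const sim = selected.reduce( |
|         (m, s) => Math.max(m, jaccard(tokens(pool[i].text), tokens(s.text))), |
|         0, |
|       ); |
|       const val = lambda * pool[i].score - (1 - lambda) * sim; |
|       if (val > bestVal) { bestVal = val; best = i; } |
|     } |
|     selected.push(pool.splice(best, 1)[0]); |
|   } |
|   return selected; |
| } |
| ``` |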
|
|
| **Example — query: "home network setup"** |
|
|
| Given these memory files: |
|
|
| ``` |
| memory/2026-02-10.md → "Configured Omada router, set VLAN 10 for IoT devices" |
| memory/2026-02-08.md → "Configured Omada router, moved IoT to VLAN 10" |
| memory/2026-02-05.md → "Set up AdGuard DNS on 192.168.10.2" |
| memory/network.md → "Router: Omada ER605, AdGuard: 192.168.10.2, VLAN 10: IoT" |
| ``` |
|
|
| Without MMR — top 3 results: |
|
|
| ``` |
| 1. memory/2026-02-10.md (score: 0.92) ← router + VLAN |
| 2. memory/2026-02-08.md (score: 0.89) ← router + VLAN (near-duplicate!) |
| 3. memory/network.md (score: 0.85) ← reference doc |
| ``` |
|
|
| With MMR (λ=0.7) — top 3 results: |
|
|
| ``` |
| 1. memory/2026-02-10.md (score: 0.92) ← router + VLAN |
| 2. memory/network.md (score: 0.85) ← reference doc (diverse!) |
| 3. memory/2026-02-05.md (score: 0.78) ← AdGuard DNS (diverse!) |
| ``` |
|
|
| The near-duplicate from Feb 8 drops out, and the agent gets three distinct pieces of information. |
|
|
| **When to enable:** If you notice `memory_search` returning redundant or near-duplicate snippets, |
| especially with daily notes that often repeat similar information across days. |
|
|
| #### Temporal decay (recency boost) |
|
|
| Agents with daily notes accumulate hundreds of dated files over time. Without decay, |
| a well-worded note from six months ago can outrank yesterday's update on the same topic. |
|
|
| **Temporal decay** applies an exponential multiplier to scores based on the age of each result, |
| so recent memories naturally rank higher while old ones fade: |
|
|
| ``` |
| decayedScore = score × e^(-λ × ageInDays) |
| ``` |
|
|
| where `λ = ln(2) / halfLifeDays`. |
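|
| In TypeScript the decay math is a one-liner (a sketch; the helper name is |
| illustrative): |
|
| ```ts |
| // Exponential recency decay with a configurable half-life. |
| function decayedScore(score: number, ageInDays: number, halfLifeDays = 30): number { |
|   const lambda = Math.LN2 / halfLifeDays; |
|   return score * Math.exp(-lambda * ageInDays); |
| } |
|
| decayedScore(0.80, 7);   // ≈ 0.68 (mild decay) |
| decayedScore(0.91, 148); // ≈ 0.03 (matches the worked example below) |
| ``` |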
|
|
| With the default half-life of 30 days: |
|
|
| - Today's notes: **100%** of original score |
| - 7 days ago: **~85%** |
| - 30 days ago: **50%** |
| - 90 days ago: **12.5%** |
| - 180 days ago: **~1.6%** |
|
|
| **Evergreen files are never decayed:** |
|
|
| - `MEMORY.md` (root memory file) |
| - Non-dated files in `memory/` (e.g., `memory/projects.md`, `memory/network.md`) |
| - These contain durable reference information that should always rank normally. |
|
|
| **Dated daily files** (`memory/YYYY-MM-DD.md`) use the date extracted from the filename. |
| Other sources (e.g., session transcripts) fall back to file modification time (`mtime`). |
|
|
| **Example — query: "what's Rod's work schedule?"** |
|
|
| Given these memory files (today is Feb 10): |
|
|
| ``` |
| memory/2025-09-15.md → "Rod works Mon-Fri, standup at 10am, pairing at 2pm" (148 days old) |
| memory/2026-02-10.md → "Rod has standup at 14:15, 1:1 with Zeb at 14:45" (today) |
| memory/2026-02-03.md → "Rod started new team, standup moved to 14:15" (7 days old) |
| ``` |
|
|
| Without decay: |
|
|
| ``` |
| 1. memory/2025-09-15.md (score: 0.91) ← best semantic match, but stale! |
| 2. memory/2026-02-10.md (score: 0.82) |
| 3. memory/2026-02-03.md (score: 0.80) |
| ``` |
|
|
| With decay (halfLife=30): |
|
|
| ``` |
| 1. memory/2026-02-10.md (score: 0.82 × 1.00 = 0.82) ← today, no decay |
| 2. memory/2026-02-03.md (score: 0.80 × 0.85 = 0.68) ← 7 days, mild decay |
| 3. memory/2025-09-15.md (score: 0.91 × 0.03 = 0.03) ← 148 days, nearly gone |
| ``` |
|
|
| The stale September note drops to the bottom despite having the best raw semantic match. |
|
|
| **When to enable:** If your agent has months of daily notes and you find that old, |
| stale information outranks recent context. A half-life of 30 days works well for |
| daily-note-heavy workflows; increase it (e.g., 90 days) if you reference older notes frequently. |
|
|
| #### Configuration |
|
|
| Both features are configured under `memorySearch.query.hybrid`: |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| query: { |
| hybrid: { |
| enabled: true, |
| vectorWeight: 0.7, |
| textWeight: 0.3, |
| candidateMultiplier: 4, |
| // Diversity: reduce redundant results |
| mmr: { |
| enabled: true, // default: false |
| lambda: 0.7 // 0 = max diversity, 1 = max relevance |
| }, |
| // Recency: boost newer memories |
| temporalDecay: { |
| enabled: true, // default: false |
| halfLifeDays: 30 // score halves every 30 days |
| } |
| } |
| } |
| } |
| } |
| } |
| ``` |
|
|
| You can enable either feature independently: |
|
|
| - **MMR only** — useful when you have many similar notes but age doesn't matter. |
| - **Temporal decay only** — useful when recency matters but your results are already diverse. |
| - **Both** — recommended for agents with large, long-running daily note histories. |
|
|
| ### Embedding cache |
|
|
| OpenClaw can cache **chunk embeddings** in SQLite so reindexing and frequent updates (especially session transcripts) don't re-embed unchanged text. |
|
|
| Config: |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| cache: { |
| enabled: true, |
| maxEntries: 50000 |
| } |
| } |
| } |
| } |
| ``` |
|
|
| ### Session memory search (experimental) |
|
|
| You can optionally index **session transcripts** and surface them via `memory_search`. |
| This is gated behind an experimental flag. |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| experimental: { sessionMemory: true }, |
| sources: ["memory", "sessions"] |
| } |
| } |
| } |
| ``` |
|
|
| Notes: |
|
|
| - Session indexing is **opt-in** (off by default). |
| - Session updates are debounced and **indexed asynchronously** once they cross delta thresholds (best-effort). |
| - `memory_search` never blocks on indexing; results can be slightly stale until background sync finishes. |
| - Results still include snippets only; `memory_get` remains limited to memory files. |
| - Session indexing is isolated per agent (only that agent’s session logs are indexed). |
| - Session logs live on disk (`~/.openclaw/agents/<agentId>/sessions/*.jsonl`). Any process/user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts. |
|
|
| Delta thresholds (defaults shown): |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| sync: { |
| sessions: { |
| deltaBytes: 100000, // ~100 KB |
| deltaMessages: 50 // JSONL lines |
| } |
| } |
| } |
| } |
| } |
| ``` |
|
|
| ### SQLite vector acceleration (sqlite-vec) |
|
|
| When the sqlite-vec extension is available, OpenClaw stores embeddings in a |
| SQLite virtual table (`vec0`) and performs vector distance queries in the |
| database. This keeps search fast without loading every embedding into JS. |
|
|
| Configuration (optional): |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| store: { |
| vector: { |
| enabled: true, |
| extensionPath: "/path/to/sqlite-vec" |
| } |
| } |
| } |
| } |
| } |
| ``` |
|
|
| Notes: |
|
|
| - `enabled` defaults to true; when disabled, search falls back to in-process |
| cosine similarity over stored embeddings. |
| - If the sqlite-vec extension is missing or fails to load, OpenClaw logs the |
| error and continues with the JS fallback (no vector table). |
| - `extensionPath` overrides the bundled sqlite-vec path (useful for custom builds |
| or non-standard install locations). |
|
|
| ### Local embedding auto-download |
|
|
| - Default local embedding model: `hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB). |
| - When `memorySearch.provider = "local"`, `node-llama-cpp` resolves `modelPath`; if the GGUF is missing it **auto-downloads** to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry. |
| - Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`. |
| - Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason. |
|
|
| ### Custom OpenAI-compatible endpoint example |
|
|
| ```json5 |
| agents: { |
| defaults: { |
| memorySearch: { |
| provider: "openai", |
| model: "text-embedding-3-small", |
| remote: { |
| baseUrl: "https://api.example.com/v1/", |
| apiKey: "YOUR_REMOTE_API_KEY", |
| headers: { |
| "X-Organization": "org-id", |
| "X-Project": "project-id" |
| } |
| } |
| } |
| } |
| } |
| ``` |
|
|
| Notes: |
|
|
| - `remote.*` takes precedence over `models.providers.openai.*`. |
| - `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults. |
|
|