# Conversation Memory System
## Purpose
Long-running chats can easily exceed the model's context window. MiniSearch addresses this by keeping a rolling, extractive summary of prior turns and feeding only the most recent messages into the model alongside that summary. All context handling happens locally in the browser to preserve privacy. @client/modules/textGeneration.ts#262-370
## Components
1. **Token Budgeting** – `generateChatResponse` measures the system prompt and a stub "Ok!" assistant reply, then caps the rest of the user/assistant turns at 75% of the default 4096-token window (3072 tokens) to leave headroom for the response. A GPT tokenizer counts each message's tokens before the message is included. @client/modules/textGeneration.ts#262-303 @client/modules/textGenerationUtilities.ts#13-74
2. **Rolling Summary Storage** – The latest summary plus a conversation identifier live in a lightweight pub/sub store so any component can read/write without prop drilling. @client/modules/pubSub.ts#249-268
3. **Summarization Engine** – When older turns must be dropped, `createLlmSummary` asks the configured inference backend (OpenAI, AI Horde, internal API, WebLLM, or Wllama) to condense the removed messages under an 800-token limit. If the LLM call fails, the system falls back to an extractive tokenizer-based summarizer to guarantee progress. @client/modules/textGeneration.ts#66-177
4. **Persistence Hooks** – After a search run completes, `saveLlmResponseForQuery` stores the assistant reply in IndexedDB so history restores can reload it. The conversation summary itself stays in memory and resets whenever a new search run begins. @client/modules/history.ts#288-333 @client/modules/textGeneration.ts#179-247
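The budgeting arithmetic can be sketched as follows. This is a minimal illustration rather than the project's code: `computeTurnBudget`, the two constants, and the word-based `countTokens` stand-in are hypothetical names, and the sketch assumes the fixed overhead (system prompt plus stub reply) is subtracted from the 75% cap.

```typescript
// Hypothetical constants mirroring the figures quoted above.
const DEFAULT_CONTEXT_SIZE = 4096;
const TURN_BUDGET_RATIO = 0.75;

// Stand-in token counter: one token per whitespace-separated word.
// The real module uses a GPT tokenizer; this keeps the sketch self-contained.
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Tokens left for user/assistant turns once the fixed overhead
// (system prompt + stub assistant reply) is taken out of the cap.
// Assumption: the overhead counts against the 75% cap.
function computeTurnBudget(
  systemPrompt: string,
  stubReply: string = "Ok!",
  contextSize: number = DEFAULT_CONTEXT_SIZE,
): number {
  const cap = Math.floor(contextSize * TURN_BUDGET_RATIO);
  const overhead = countTokens(systemPrompt) + countTokens(stubReply);
  return Math.max(0, cap - overhead);
}
```

With an empty system prompt and an empty stub reply, the budget is the full cap of 3072 tokens; any overhead reduces it, and it never goes below zero.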
## Flow
1. User sends a chat message.
2. System prompt is regenerated by `getSystemPrompt` and augmented with any stored summary (`Conversation context: ...`). @client/modules/textGeneration.ts#270-329
3. Recent turns are appended until the budget is exhausted; older ones become "dropped messages".
4. Dropped messages are summarized and the digest is saved back to the pub/sub store with the current conversation ID. @client/modules/textGeneration.ts#313-330
5. The final prompt sent to the model always starts with the refreshed system prompt, followed by the stub assistant reply and the kept turn list to encourage immediate streaming.
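Steps 2, 3, and 5 can be sketched as below. The names `splitTurnsByBudget` and `buildPrompt`, the `ChatMessage` shape, and the word-based token counter are assumptions made for illustration; in particular, the role under which the system prompt is sent is a guess, and the real code counts tokens with a GPT tokenizer.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Stand-in token counter (one token per word); the real code uses a GPT tokenizer.
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Walk the turns from newest to oldest, keeping each one while it still fits.
// Everything older than the first turn that does not fit becomes "dropped".
function splitTurnsByBudget(
  turns: ChatMessage[],
  budget: number,
): { kept: ChatMessage[]; dropped: ChatMessage[] } {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (const turn of [...turns].reverse()) {
    const cost = countTokens(turn.content);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(turn); // restore chronological order
  }
  const dropped = turns.slice(0, turns.length - kept.length);
  return { kept, dropped };
}

// Assemble the final prompt: system prompt (plus any stored summary),
// the stub assistant reply, then the kept turns.
function buildPrompt(
  systemPrompt: string,
  summary: string,
  kept: ChatMessage[],
): ChatMessage[] {
  const contextualPrompt = summary
    ? `${systemPrompt}\n\nConversation context: ${summary}`
    : systemPrompt;
  return [
    { role: "system", content: contextualPrompt },
    { role: "assistant", content: "Ok!" },
    ...kept,
  ];
}
```

Walking newest-first means the most recent exchange is always considered before older ones, so the dropped set is always a contiguous prefix of the conversation that can be summarized as one block.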
## Settings & Extensibility
- All inference types share the same summarization contract; the only provider-specific logic is selecting the backend module at runtime. @client/modules/textGeneration.ts#95-135
- Changing the global context window (e.g., via OpenAI settings) automatically affects the available budget because the logic derives from the default context size exported by `textGenerationUtilities`. @client/modules/textGenerationUtilities.ts#13-74
- Future settings (e.g., toggling memory or adjusting the 75% ratio) should hook into the same budgeting helpers to keep behavior predictable.
## Failure Modes & Logging
- Every summarization attempt is wrapped in try/catch; failures emit `addLogEntry` notifications and fall back to extractive summaries so the chat loop never stalls. @client/modules/textGeneration.ts#97-138
- If generation is interrupted (user stop), a custom `ChatGenerationError` ensures the loop exits gracefully without corrupting the stored summary. @client/modules/textGeneration.ts#360-369 @client/modules/textGenerationUtilities.ts#19-26
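The try/catch-with-fallback pattern might look like the sketch below. `summarizeWithFallback`, `extractiveSummary`, and the injected `log` callback are hypothetical names; the fallback here simply truncates to the first `maxTokens` words, a deliberately crude stand-in for the tokenizer-based extractive summarizer described above.

```typescript
type Summarizer = (text: string) => Promise<string>;

// Crude stand-in for the extractive fallback: keep the leading words
// up to the limit. The real fallback is tokenizer-based.
function extractiveSummary(text: string, maxTokens: number): string {
  return text.trim().split(/\s+/).filter(Boolean).slice(0, maxTokens).join(" ");
}

// Try the LLM first; on any failure, log it and fall back so the caller
// always receives a summary and the chat loop keeps moving.
async function summarizeWithFallback(
  droppedText: string,
  llmSummarize: Summarizer,
  log: (message: string) => void = console.warn,
  maxTokens: number = 800,
): Promise<string> {
  try {
    const summary = await llmSummarize(droppedText);
    // Assumption: an empty reply is treated the same as a failed call.
    if (summary.trim() === "") throw new Error("empty summary");
    return summary;
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    log(`LLM summarization failed (${reason}); using extractive fallback`);
    return extractiveSummary(droppedText, maxTokens);
  }
}
```

Passing the logger and the summarizer as parameters keeps the sketch independent of any particular backend; in the project, the corresponding roles are played by `addLogEntry` and the configured inference module.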
## Reset Rules
- Starting a new top-level search clears the summary, chat history, and cached results to avoid context leakage across unrelated conversations. @client/modules/textGeneration.ts#179-207
- Restoring a run from history repopulates chat state from IndexedDB; the memory system will rebuild summaries on demand once the user resumes chatting. @client/modules/history.ts#335-365 @client/hooks/useHistoryRestore.ts#32-105
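The reset behaviour can be illustrated with a tiny in-memory store. Everything here is a hypothetical sketch: `createStore`, `memoryStore`, `startNewSearch`, and `saveSummary` are not the project's names, and the guard that rejects a summary carrying a stale conversation ID is an assumption about how leakage across runs could be prevented.

```typescript
type Listener<T> = (value: T) => void;

// Minimal pub/sub store: holds one value and notifies subscribers on change.
function createStore<T>(initial: T) {
  let value = initial;
  const listeners = new Set<Listener<T>>();
  return {
    get: (): T => value,
    set: (next: T): void => {
      value = next;
      listeners.forEach((listener) => listener(next));
    },
    subscribe: (listener: Listener<T>): (() => void) => {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

type ConversationMemory = { conversationId: string; summary: string };

const memoryStore = createStore<ConversationMemory>({
  conversationId: "",
  summary: "",
});

// A new top-level search starts a fresh conversation with an empty summary.
function startNewSearch(conversationId: string): void {
  memoryStore.set({ conversationId, summary: "" });
}

// Save a digest only if it belongs to the conversation that is still active.
// Assumption: late results from a previous run are discarded.
function saveSummary(conversationId: string, summary: string): boolean {
  if (memoryStore.get().conversationId !== conversationId) return false;
  memoryStore.set({ conversationId, summary });
  return true;
}
```

Because the summary lives only in this in-memory store, nothing needs to be cleaned up in IndexedDB when a run is reset; a restored run simply starts with an empty summary and accumulates a new one as turns are dropped.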
## Related Topics
- **AI Integration**: `docs/ai-integration.md` - Detailed inference options
- **Search History**: `docs/search-history.md` - History and persistence
- **Overview**: `docs/overview.md` - System architecture
- **Configuration**: `docs/configuration.md` - Settings for context window