# agentcache-python: Architecture
## Overview
A Python REST + WebSocket + MCP memory server backed by SQLite.
Agents store observations scoped by `(folderPath, agentId)` pairs and retrieve
context at session start. No Node.js, no external database, no build step.
---
## Module Responsibilities
```
src/
├── app.py                 Flask application factory (create_app).
│                          Initialises DB, embeddings, blueprints, WebSocket,
│                          CORS hook, and background workers.
│
├── routes/                Flask blueprints, one per domain area.
│   ├── __init__.py        register_blueprints(app) helper.
│   ├── observations.py    /observe, /agent/observe, /folders, /folder/observations
│   ├── memories.py        /remember, /agent/remember, /memories, /forget
│   ├── search.py          /search, /timeline
│   ├── graph.py           /graph, /graph/stats, /graph/query, /graph/build
│   ├── health.py          /livez, /health, /audit, /config/flags
│   ├── mcp.py             GET+POST /mcp/tools
│   └── migration.py       /migrate
│
├── functions.py           Core business logic.
│                          observe(), folder_observe(), remember(), forget(),
│                          folder_search(), folder_timeline(), health_check(),
│                          export_data(), rebuild_index(), auto_forget(),
│                          folder_graph_build(), KV scope registry.
│
├── db.py                  StateKV: SQLite wrapper (WAL mode, kv_store + audit_log).
│
├── search.py              SearchIndex (BM25 + Porter stemmer + synonyms),
│                          VectorIndex (cosine similarity, base64 float32),
│                          GeminiEmbeddingProvider, HybridSearch (RRF).
│
├── workers.py             Daemon threads: index rebuild, auto-forget sweep,
│                          graceful shutdown (SIGTERM/SIGINT).
│
├── viewer_helpers.py      make_viewer_response(): reads viewer/index.html,
│                          injects nonce + version, sets CSP headers.
│
├── mcp_stdio.py           stdio MCP bridge: reads AGENTCACHE_URL and
│                          AGENTCACHE_SECRET, proxies tool calls to the HTTP API.
│
└── viewer/
    └── index.html         Single-file HTML dashboard (no bundler).
```
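
The `VectorIndex` entry above mentions cosine similarity over base64-encoded float32 vectors. The following is a minimal sketch of that idea, not the project's code: the function names are hypothetical, and little-endian byte order is an assumption.

```python
import base64
import math
import struct


def encode_vector(values):
    """Pack floats as little-endian float32 (assumed order) and base64-encode them."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_vector(blob):
    """Inverse of encode_vector: base64 text back to a list of floats."""
    raw = base64.b64decode(blob)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Storing vectors as base64 text lets them sit in a JSON value column at roughly 4/3 the size of the raw float32 bytes.
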
---
## KV Scope Layout
All data lives in a single SQLite file (`~/.agentcache/agentcache.db`) in two tables:
- `kv_store(scope TEXT, key TEXT, value TEXT, PRIMARY KEY(scope, key))`: JSON values
- `audit_log(id, ts, agent_id, message)`: write audit trail
| Scope | Content |
|-------|---------|
| `mem:folders` | Global index of all `(folderPath, agentId)` pairs |
| `mem:folder:{path}:{agent}` | Observations for one `(folder, agent)` pair |
| `mem:foldermeta:{path}:{agent}` | Metadata for one pair (obsCount, lastUpdated, summary) |
| `mem:memories` | Long-term global memories |
| `mem:index:bm25` | BM25 index shards (manifest + data chunks) |
| `mem:audit` | Audit log entries (via `record_audit()`) |
| `mem:relations` | Knowledge graph edges |
| `mem:sessions` | Legacy session objects (read-only for migration) |
| `mem:obs:{session_id}` | Legacy session observations (read-only for migration) |
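
As a rough illustration of the layout above, here is a small sketch of a scoped key-value wrapper over the `kv_store` schema. `MiniKV` and its method names are hypothetical stand-ins, not the real `StateKV` API, and the `audit_log` table is omitted.

```python
import json
import sqlite3


class MiniKV:
    """Hypothetical stand-in for StateKV; only the kv_store table is modelled."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        # WAL lets readers proceed while a writer is active.
        # For an in-memory database the pragma is accepted but has no effect.
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv_store ("
            "scope TEXT, key TEXT, value TEXT, PRIMARY KEY (scope, key))"
        )

    def set(self, scope, key, value):
        # The (scope, key) primary key makes REPLACE overwrite an existing row.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv_store (scope, key, value) VALUES (?, ?, ?)",
                (scope, key, json.dumps(value)),
            )

    def get(self, scope, key, default=None):
        row = self.conn.execute(
            "SELECT value FROM kv_store WHERE scope = ? AND key = ?", (scope, key)
        ).fetchone()
        return json.loads(row[0]) if row else default

    def values(self, scope):
        rows = self.conn.execute(
            "SELECT value FROM kv_store WHERE scope = ? ORDER BY key", (scope,)
        )
        return [json.loads(r[0]) for r in rows]
```

With this shape, one `(folder, agent)` pair maps to one scope string such as `mem:folder:/repo/demo:agent-1`, and each observation is one row keyed by its id.
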
---
## Data Flow
### Observation Ingestion
```
POST /agent/observe
  └─► folder_observe(kv, payload)
         1. Validate folderPath, agentId, text, timestamp
         2. normalize_folder_path() + validate_agent_id()
         3. strip_private_data()
         4. Cap text at 4000 chars
         5. Enforce MAX_OBS_PER_FOLDER cap
         6. Generate obs_id (fobs_...)
         7. Write FolderObservation to KV.folder_obs(fp, aid)
         8. Upsert folder metadata (KV.folder_meta)
         9. Upsert global folders index (KV.folders)
        10. Add to BM25 index (_bm25_index.add)
        11. Add to vector index if embedding provider is set
        12. Debounce persistence save (IndexPersistence.schedule_save)
        13. Write audit log entry (kv.commit_version)
        14. Broadcast via WebSocket (/stream/mem-live/viewer)
  └─► return {"observationId": obs_id}
```
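
A condensed sketch of steps 1 and 4 to 7 follows, using a plain dict in place of the KV store. Several details are assumptions rather than facts from the source: the value of `MAX_OBS_PER_FOLDER`, the eviction behaviour when the cap is hit, the id format after the `fobs_` prefix, and the function name `folder_observe_sketch`.

```python
import time
import uuid

MAX_TEXT_CHARS = 4000        # from step 4 above
MAX_OBS_PER_FOLDER = 1000    # assumed value; the real limit is not stated here


def folder_observe_sketch(store, payload):
    """Simplified ingestion: validate, enforce the per-folder cap, cap text, store."""
    for field in ("folderPath", "agentId", "text"):
        if not payload.get(field):
            raise ValueError(f"missing field: {field}")

    scope = f"mem:folder:{payload['folderPath']}:{payload['agentId']}"
    observations = store.setdefault(scope, [])

    # Assumption: the cap evicts the oldest entry; the real server may reject instead.
    while len(observations) >= MAX_OBS_PER_FOLDER:
        observations.pop(0)

    # Assumption: a random hex suffix; only the "fobs_" prefix comes from the source.
    obs_id = f"fobs_{uuid.uuid4().hex[:12]}"
    observations.append({
        "id": obs_id,
        "text": payload["text"][:MAX_TEXT_CHARS],
        "timestamp": payload.get("timestamp") or time.time(),
    })
    return {"observationId": obs_id}
```
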
### Search
```
POST /search or POST /mcp/tools {name:"memory_recall"}
  └─► folder_search(kv, query, limit, folderPath?, agentId?)
        1. HybridSearch.search(): BM25 + vector RRF fusion
        2. Load all (folder, agent) pairs from KV.folders
        3. Hydrate obs_ids from KV.folder_obs scopes
        4. Apply folderPath/agentId post-filters
        5. Also include matching global memories
        6. Sort by score descending, cap at limit
```
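
Step 1 fuses the BM25 and vector rankings with reciprocal rank fusion (RRF). The sketch below shows the standard RRF formula, where each list contributes `1 / (k + rank)` per document. The function name and the constant `k=60` (a commonly used default) are assumptions, not values taken from `HybridSearch`.

```python
def rrf_fuse(rankings, k=60, limit=10):
    """Fuse ranked lists of doc ids; each list contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:limit]


bm25_ranking = ["a", "b", "c"]
vector_ranking = ["b", "d", "a"]
fused = rrf_fuse([bm25_ranking, vector_ranking])
```

Because RRF uses only ranks, it needs no score normalisation between BM25 and cosine similarity, which live on different scales.
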
### Memory Versioning
`remember()` scans existing memories for one whose Jaccard similarity to the new text exceeds 0.7.
If one is found, the old memory is marked `isLatest=False` and the new memory sets `parentId` to link back to it.
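
A minimal sketch of this versioning rule is shown below. The tokenisation (lowercase word sets), the `mem_N` id scheme, and the names `jaccard` and `remember_sketch` are assumptions made for illustration; only the 0.7 threshold and the `isLatest` and `parentId` fields come from the description above.

```python
import re


def jaccard(a, b):
    """Jaccard similarity of the lowercase word sets of two strings."""
    set_a = set(re.findall(r"\w+", a.lower()))
    set_b = set(re.findall(r"\w+", b.lower()))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def remember_sketch(memories, text, threshold=0.7):
    """Append a memory; supersede the first latest memory above the threshold."""
    new = {"id": f"mem_{len(memories) + 1}", "text": text,
           "isLatest": True, "parentId": None}
    for old in memories:
        if old["isLatest"] and jaccard(old["text"], text) > threshold:
            old["isLatest"] = False
            new["parentId"] = old["id"]
            break
    memories.append(new)
    return new
```
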
---
## Authentication
All endpoints except `/livez` check the request's Bearer token against `AGENTCACHE_SECRET`
using the timing-safe `hmac.compare_digest`. If no secret is set, no auth check is performed.
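
The check can be sketched as follows. `is_authorized` is a hypothetical helper, and how the real server reads the header from the Flask request is not shown.

```python
import hmac


def is_authorized(auth_header, secret):
    """Return True if the request may proceed under the rules described above."""
    if not secret:
        # No secret configured: authentication is disabled.
        return True
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return False
    token = auth_header[len(prefix):]
    # compare_digest avoids leaking, through timing, where the inputs first differ.
    return hmac.compare_digest(token.encode("utf-8"), secret.encode("utf-8"))
```

Encoding both sides to bytes keeps `compare_digest` usable when the token contains non-ASCII characters, which the `str` form of the function rejects.
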
---
## WebSocket
`/stream/mem-live/viewer` broadcasts raw JSON payloads to connected viewers.
The viewer's "Replay" tab subscribes to this stream for live observation updates.
---
## Embedding Providers
Priority order (auto-selected at startup):
1. `GeminiEmbeddingProvider`, if `GEMINI_API_KEY` or `GOOGLE_API_KEY` is set (768 dims)
2. `OpenAIEmbeddingProvider`, if `OPENAI_API_KEY` is set (1536 dims)
3. `SentenceTransformerProvider`, if `AGENTCACHE_LOCAL_EMBEDDING_MODEL` is set
4. BM25-only fallback
Without an embedding provider, `HybridSearch` falls back to pure BM25.
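
The priority order can be expressed as a short selection function. This is a sketch only: `select_provider` is a hypothetical name, and it returns the provider's class name as a string instead of constructing the provider.

```python
def select_provider(env):
    """Return the name of the provider the priority list above would pick."""
    if env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY"):
        return "GeminiEmbeddingProvider"
    if env.get("OPENAI_API_KEY"):
        return "OpenAIEmbeddingProvider"
    if env.get("AGENTCACHE_LOCAL_EMBEDDING_MODEL"):
        return "SentenceTransformerProvider"
    return None  # BM25-only fallback
```

Passing `os.environ` as `env` would mirror startup selection, while a plain dict makes the ordering easy to test.
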