# agentcache-python: Architecture

## Overview

A Python REST + WebSocket + MCP memory server backed by SQLite.
Agents store observations scoped by `(folderPath, agentId)` pairs and retrieve
context at session start. No Node.js, no external database, no build step.

---
## Module Responsibilities

```
src/
├── app.py               Flask application factory (create_app).
│                        Initialises DB, embeddings, blueprints, WebSocket,
│                        CORS hook, and background workers.
│
├── routes/              Flask blueprints, one per domain area.
│   ├── __init__.py      register_blueprints(app) helper.
│   ├── observations.py  /observe, /agent/observe, /folders, /folder/observations
│   ├── memories.py      /remember, /agent/remember, /memories, /forget
│   ├── search.py        /search, /timeline
│   ├── graph.py         /graph, /graph/stats, /graph/query, /graph/build
│   ├── health.py        /livez, /health, /audit, /config/flags
│   ├── mcp.py           GET+POST /mcp/tools
│   └── migration.py     /migrate
│
├── functions.py         Core business logic.
│                        observe(), folder_observe(), remember(), forget(),
│                        folder_search(), folder_timeline(), health_check(),
│                        export_data(), rebuild_index(), auto_forget(),
│                        folder_graph_build(), KV scope registry.
│
├── db.py                StateKV: SQLite wrapper (WAL mode, kv_store + audit_log).
│
├── search.py            SearchIndex (BM25 + Porter stemmer + synonyms),
│                        VectorIndex (cosine similarity, base64 float32),
│                        GeminiEmbeddingProvider, HybridSearch (RRF).
│
├── workers.py           Daemon threads: index rebuild, auto-forget sweep,
│                        graceful shutdown (SIGTERM/SIGINT).
│
├── viewer_helpers.py    make_viewer_response(): reads viewer/index.html,
│                        injects nonce + version, sets CSP headers.
│
├── mcp_stdio.py         stdio MCP bridge: reads AGENTCACHE_URL and
│                        AGENTCACHE_SECRET, proxies tool calls to the HTTP API.
│
└── viewer/
    └── index.html       Single-file HTML dashboard (no bundler).
```
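The `VectorIndex` entry above mentions cosine similarity over base64-encoded float32 vectors. The following is a minimal sketch of what such a storage format could look like, using only the standard library. The function names `encode_vector`, `decode_vector`, and `cosine_similarity` are hypothetical, and the little-endian byte order is an assumption; the project's actual encoding may differ.

```python
import base64
import math
import struct


def encode_vector(vec):
    """Pack floats as little-endian float32 (assumed layout) and base64-encode them."""
    raw = struct.pack(f"<{len(vec)}f", *vec)
    return base64.b64encode(raw).decode("ascii")


def decode_vector(text):
    """Reverse encode_vector: base64-decode, then unpack 4-byte floats."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```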
---

## KV Scope Layout

All data lives in a single SQLite file (`~/.agentcache/agentcache.db`) in two tables:

- `kv_store(scope TEXT, key TEXT, value TEXT, PRIMARY KEY(scope, key))`: JSON values
- `audit_log(id, ts, agent_id, message)`: write audit trail

| Scope | Content |
|-------|---------|
| `mem:folders` | Global index of all `(folderPath, agentId)` pairs |
| `mem:folder:{path}:{agent}` | Observations for one `(folder, agent)` pair |
| `mem:foldermeta:{path}:{agent}` | Metadata for one pair (obsCount, lastUpdated, summary) |
| `mem:memories` | Long-term global memories |
| `mem:index:bm25` | BM25 index shards (manifest + data chunks) |
| `mem:audit` | Audit log entries (via `record_audit()`) |
| `mem:relations` | Knowledge graph edges |
| `mem:sessions` | Legacy session objects (read-only for migration) |
| `mem:obs:{session_id}` | Legacy session observations (read-only for migration) |
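
To make the scope layout concrete, here is a minimal sketch of a scoped key-value store over the `kv_store` schema shown above. It is not the project's `StateKV` class: the helper names `open_store`, `folder_obs_scope`, `kv_set`, and `kv_get` are hypothetical, and an in-memory database stands in for `~/.agentcache/agentcache.db`. Values are stored as JSON text, matching the table description.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the database, request WAL mode, and create kv_store if missing."""
    conn = sqlite3.connect(path)
    # WAL only takes effect for file-backed databases; in-memory ones ignore it.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kv_store ("
        "scope TEXT, key TEXT, value TEXT, PRIMARY KEY(scope, key))"
    )
    return conn


def folder_obs_scope(folder_path, agent_id):
    """Build the scope name for one (folder, agent) pair."""
    return f"mem:folder:{folder_path}:{agent_id}"


def kv_set(conn, scope, key, value):
    """Insert or overwrite one JSON value under (scope, key)."""
    conn.execute(
        "INSERT OR REPLACE INTO kv_store(scope, key, value) VALUES (?, ?, ?)",
        (scope, key, json.dumps(value)),
    )
    conn.commit()


def kv_get(conn, scope, key):
    """Return the decoded value for (scope, key), or None if absent."""
    row = conn.execute(
        "SELECT value FROM kv_store WHERE scope = ? AND key = ?", (scope, key)
    ).fetchone()
    return json.loads(row[0]) if row else None
```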

---

## Data Flow

### Observation Ingestion

```
POST /agent/observe
  └─► folder_observe(kv, payload)
        1. Validate folderPath, agentId, text, timestamp
        2. normalize_folder_path() + validate_agent_id()
        3. strip_private_data()
        4. Cap text at 4000 chars
        5. Enforce MAX_OBS_PER_FOLDER cap
        6. Generate obs_id (fobs_...)
        7. Write FolderObservation to KV.folder_obs(fp, aid)
        8. Upsert folder metadata (KV.folder_meta)
        9. Upsert global folders index (KV.folders)
       10. Add to BM25 index (_bm25_index.add)
       11. Add to vector index if embedding provider is set
       12. Debounce persistence save (IndexPersistence.schedule_save)
       13. Write audit log entry (kv.commit_version)
       14. Broadcast via WebSocket (/stream/mem-live/viewer)
  └─► return {"observationId": obs_id}
```
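A few of the ingestion steps (1, 4, 5, 6, and 7) can be sketched with a plain dictionary standing in for the KV store. This is an illustration under stated assumptions, not the real `folder_observe()`: the name `folder_observe_sketch` is hypothetical, the cap value of 3 is a toy number, dropping the oldest entry is only one possible way to enforce the cap, the ID suffix format is invented, and defaulting a missing timestamp to the current time is a guess.

```python
import time
import uuid

MAX_TEXT_CHARS = 4000    # text cap from step 4
MAX_OBS_PER_FOLDER = 3   # toy value for illustration; the real limit is not stated


def folder_observe_sketch(store, payload):
    # Step 1: required fields must be present and non-empty.
    for field in ("folderPath", "agentId", "text"):
        if not payload.get(field):
            raise ValueError(f"missing field: {field}")

    # Step 4: truncate over-long text.
    text = payload["text"][:MAX_TEXT_CHARS]

    # Step 5: enforce the per-folder cap (assumed policy: evict the oldest entry).
    scope = f"mem:folder:{payload['folderPath']}:{payload['agentId']}"
    observations = store.setdefault(scope, {})
    while len(observations) >= MAX_OBS_PER_FOLDER:
        oldest = next(iter(observations))  # dicts preserve insertion order
        del observations[oldest]

    # Step 6: generate an ID with the fobs_ prefix (suffix format is an assumption).
    obs_id = f"fobs_{uuid.uuid4().hex[:12]}"

    # Step 7: write the observation under its (folder, agent) scope.
    observations[obs_id] = {
        "text": text,
        "timestamp": payload.get("timestamp", time.time()),
    }
    return {"observationId": obs_id}
```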
### Search

```
POST /search or POST /mcp/tools {name:"memory_recall"}
  └─► folder_search(kv, query, limit, folderPath?, agentId?)
        1. HybridSearch.search(): BM25 + vector RRF fusion
        2. Load all (folder, agent) pairs from KV.folders
        3. Hydrate obs_ids from KV.folder_obs scopes
        4. Apply folderPath/agentId post-filters
        5. Also include matching global memories
        6. Sort by score descending, cap at limit
```
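Step 1 fuses the BM25 and vector rankings with reciprocal rank fusion (RRF). In its standard form, each document scores the sum of `1 / (k + rank)` over every ranking it appears in. The sketch below shows that general technique; the function name `rrf_fuse` is hypothetical, and `k = 60` is the value commonly used in the RRF literature, not a constant confirmed for `HybridSearch`.

```python
def rrf_fuse(rankings, k=60, limit=10):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Returns (doc_id, score) pairs sorted by score descending, ties broken
    by doc_id, truncated to `limit` entries.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:limit]


# Example: "b" ranks high in both lists, so it ends up first.
bm25_ranking = ["a", "b", "c"]
vector_ranking = ["b", "d", "a"]
fused = rrf_fuse([bm25_ranking, vector_ranking])
```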
### Memory Versioning

`remember()` scans existing memories for one whose Jaccard similarity to the new text exceeds 0.7.
If a match is found, the old memory is marked `isLatest=False` and the new memory records it via `parentId`.
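
The versioning rule can be sketched as follows. This assumes Jaccard similarity is computed over lowercased, whitespace-split word sets, which the text does not specify; the names `jaccard` and `remember_sketch` and the dictionary shape are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two texts over their lowercase word sets (assumed tokenisation)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


def remember_sketch(memories, new_id, text, threshold=0.7):
    """Append a memory; if a similar latest memory exists, supersede it."""
    new_memory = {"id": new_id, "text": text, "isLatest": True, "parentId": None}
    for memory in memories:
        if memory["isLatest"] and jaccard(memory["text"], text) > threshold:
            memory["isLatest"] = False             # old version is no longer current
            new_memory["parentId"] = memory["id"]  # link the new version to the old one
            break
    memories.append(new_memory)
    return new_memory
```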

---

## Authentication

All endpoints except `/livez` compare the request's Bearer token against `AGENTCACHE_SECRET`
using the timing-safe `hmac.compare_digest`. If no secret is set, no auth check is performed.
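
A minimal sketch of such a check, independent of any web framework, is shown below. The function name `is_authorized` and its signature are hypothetical; only `hmac.compare_digest` and the "no secret means no check" rule come from the description above.

```python
import hmac


def is_authorized(auth_header, secret):
    """Return True if the Authorization header carries the expected Bearer token.

    `auth_header` is the raw header value (or None); `secret` is the configured
    AGENTCACHE_SECRET (empty or None disables the check).
    """
    if not secret:
        return True  # no secret configured: auth is skipped entirely
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return False
    token = auth_header[len(prefix):]
    # Compare as bytes so non-ASCII input cannot raise a TypeError.
    return hmac.compare_digest(token.encode("utf-8"), secret.encode("utf-8"))
```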

---

## WebSocket

`/stream/mem-live/viewer` broadcasts raw JSON payloads to connected viewers.
The viewer's "Replay" tab subscribes to this stream for live observation updates.

---

## Embedding Providers

Priority order (auto-selected at startup):

1. `GeminiEmbeddingProvider`: if `GEMINI_API_KEY` or `GOOGLE_API_KEY` is set (768 dims)
2. `OpenAIEmbeddingProvider`: if `OPENAI_API_KEY` is set (1536 dims)
3. `SentenceTransformerProvider`: if `AGENTCACHE_LOCAL_EMBEDDING_MODEL` is set
4. BM25-only fallback

Without an embedding provider, `HybridSearch` falls back to pure BM25.
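
The priority order can be expressed as a simple chain of environment checks. The sketch below returns the provider's class name as a string rather than constructing it, since the constructors are not described here; `select_provider` is a hypothetical helper, and `None` stands for the BM25-only fallback.

```python
import os


def select_provider(env=None):
    """Return the name of the first provider whose environment variable is set."""
    env = os.environ if env is None else env
    if env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY"):
        return "GeminiEmbeddingProvider"
    if env.get("OPENAI_API_KEY"):
        return "OpenAIEmbeddingProvider"
    if env.get("AGENTCACHE_LOCAL_EMBEDDING_MODEL"):
        return "SentenceTransformerProvider"
    return None  # no provider available: search stays BM25-only
```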