---
title: RFT Memory Receipt Engine
emoji: 🧾
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 6.2.0
python_version: '3.10'
app_file: app.py
pinned: false
license: mit
tags:
- gradio
- agents
- rag
- retrieval
- memory
- sqlite
- observability
- reproducibility
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/685edcb04796127b024b4805/ve66mRYJDgMQ0QKhB0YZG.png
---
# RFT Memory Receipt Engine (Local Persistence + Verifiable Retrieval)
I built this Space to solve a problem most “agent memory” systems avoid: **you can’t trust what you can’t verify**. Persisting chat history is easy. Proving what actually influenced an output is the hard part.
This Space is a local persistence engine for agents and chat systems that:
- stores every turn as an **append-only event log**
- indexes it for **fast retrieval** (SQLite FTS)
- generates a **cryptographic receipt** for every assistant turn that lists the exact memory slices used
- verifies receipts by checking event hashes and chain integrity
The result is durable memory with **audit-grade lineage**.
---
## What this Space demonstrates
### 1) Durable session memory (outside the model context)
- Every message is written as an event to an append-only JSONL log.
- Sessions persist across restarts when you store them on persistent disk.
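
As a minimal sketch of the append-only write path, assuming one JSON object per line: the field names (`event_id`, `ts`, `role`, `text`) and the helpers `append_event` and `read_events` are hypothetical and not taken from `app.py`.

```python
import json
import os
import tempfile
import time
import uuid
from pathlib import Path


def append_event(log_path: Path, role: str, text: str) -> dict:
    """Append one event as a single JSON line; earlier lines are never rewritten."""
    event = {
        "event_id": uuid.uuid4().hex,  # hypothetical ID scheme
        "ts": time.time(),
        "role": role,
        "text": text,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds bytes at the end of the file.
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event, ensure_ascii=False) + "\n")
        f.flush()
        os.fsync(f.fileno())  # push the line to disk before returning
    return event


def read_events(log_path: Path) -> list:
    """Replay the log in write order, skipping blank lines."""
    with log_path.open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        log = Path(base) / "sessions" / "demo" / "events.jsonl"
        append_event(log, "user", "Where is the index stored?")
        append_event(log, "assistant", "In index.sqlite under the base directory.")
        for event in read_events(log):
            print(event["role"], event["text"])
```

Reopening the same file after a restart replays the session in order, which is all that "persisting across restarts" requires as long as the base directory sits on persistent storage.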
### 2) Targeted retrieval instead of full history replay
- Rather than replaying an ever-growing transcript, you retrieve a fixed number of relevant memory slices per turn.
- Retrieval in this version is lexical (SQLite FTS5), which keeps results reproducible and avoids any embedding dependency.
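
A minimal sketch of top-K lexical retrieval with SQLite FTS5, assuming the Python build of SQLite has FTS5 compiled in (most do). The table columns and the helpers `make_index`, `index_event`, and `retrieve` are illustrative assumptions, not the Space's actual schema or API.

```python
import re
import sqlite3


def make_index() -> sqlite3.Connection:
    """Create an in-memory FTS5 table; only the text column is searchable."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE events_fts USING fts5(event_id UNINDEXED, text)"
    )
    return conn


def index_event(conn: sqlite3.Connection, event_id: str, text: str) -> None:
    conn.execute(
        "INSERT INTO events_fts (event_id, text) VALUES (?, ?)", (event_id, text)
    )


def retrieve(conn: sqlite3.Connection, query: str, k: int) -> list:
    """Return at most k (event_id, text) rows, best match first."""
    # Keep only word characters and quote each term so user punctuation
    # is never parsed as FTS5 query syntax.
    terms = re.findall(r"\w+", query)
    if not terms or k <= 0:
        return []
    match_expr = " OR ".join(f'"{t}"' for t in terms)
    return conn.execute(
        "SELECT event_id, text FROM events_fts "
        "WHERE events_fts MATCH ? ORDER BY rank LIMIT ?",
        (match_expr, k),
    ).fetchall()
```

`LIMIT ?` is what makes the number of slices per turn fixed: the session can grow without changing how many rows come back.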
### 3) Memory receipts (provable continuity)
Each assistant turn produces a receipt that contains:
- the user query
- the retrieved events (IDs + text snippets)
- the cryptographic digests of those events
- the chain hash that proves their position in the append-only ledger
- prompt hash + response hash for end-to-end traceability
You can upload a receipt back into the Space and verify it.
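
To make the receipt contents concrete, here is a sketch of how such a record could be assembled. The key names (`query`, `retrieved`, `prompt_hash`, `response_hash`) and the `RetrievedEvent` and `build_receipt` helpers are assumptions for illustration; the receipt files written by the Space may use a different layout.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class RetrievedEvent:
    event_id: str    # ID of the event in the ledger
    snippet: str     # short text excerpt shown to the reader
    digest: str      # hash of the event payload
    chain_hash: str  # hash that ties the event to its ledger position


def build_receipt(query: str, retrieved: list, prompt: str, response: str) -> dict:
    """Bundle the query, the memory slices used, and the prompt/response hashes."""
    return {
        "query": query,
        "retrieved": [asdict(r) for r in retrieved],
        "prompt_hash": sha256_hex(prompt),
        "response_hash": sha256_hex(response),
    }


if __name__ == "__main__":
    used = [RetrievedEvent("e1", "The index uses SQLite FTS5", "ab" * 32, "cd" * 32)]
    receipt = build_receipt("Where is the index?", used, "prompt text", "answer text")
    print(json.dumps(receipt, indent=2))
```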
---
## Core design (RFT-aligned)
### Append-only ledger + hash chain
Each event is hashed, then chained to the prior event:
- `digest = sha256(canonical_event_payload)`
- `chain_hash = sha256(prev_chain_hash + digest)`
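Because each chain hash depends on the one before it, altering, reordering, or removing an earlier event changes every later chain hash. That is what makes the entire session history tamper-evident.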
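
The two formulas can be sketched as follows. This sketch assumes that `+` means string concatenation of hex digests, that the canonical payload is JSON with sorted keys, and that the first event chains from an all-zero value; none of these details are confirmed by the text above, and the function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first event


def canonical(payload: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal payloads hash equally."""
    return json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def event_digest(payload: dict) -> str:
    return hashlib.sha256(canonical(payload)).hexdigest()


def next_chain_hash(prev_chain_hash: str, digest: str) -> str:
    return hashlib.sha256((prev_chain_hash + digest).encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    """Return one {digest, chain_hash} link per event, in ledger order."""
    links, prev = [], GENESIS
    for payload in payloads:
        digest = event_digest(payload)
        prev = next_chain_hash(prev, digest)
        links.append({"digest": digest, "chain_hash": prev})
    return links


def verify_chain(payloads: list, links: list) -> bool:
    """True only if recomputing the chain reproduces every stored link."""
    return build_chain(payloads) == links
```

Changing a single character in the first event makes `verify_chain` fail for the whole list, because every later `chain_hash` is derived from the altered one.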
### Collapse scoring (memory promotion signal)
Events are assigned a lightweight “collapse score” that estimates long-term value using novelty + role weighting. This is designed to help separate noise from signal as sessions grow.
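
The text does not give the scoring formula, so the following is only one plausible shape for it: novelty is taken as the share of an event's tokens not yet seen in the session, multiplied by a per-role weight. The weights, the tokenizer, and the name `collapse_score` are all invented for illustration.

```python
import re

# Illustrative weights only; the real values are not documented here.
ROLE_WEIGHTS = {"user": 1.0, "assistant": 0.6, "system": 0.3}


def tokenize(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


def collapse_score(text: str, role: str, seen: set) -> float:
    """Score in [0, 1]: fraction of unseen tokens, scaled by the role weight."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    novelty = len(tokens - seen) / len(tokens)
    return round(novelty * ROLE_WEIGHTS.get(role, 0.5), 4)


if __name__ == "__main__":
    seen = set()
    for role, text in [
        ("user", "sqlite stores events"),
        ("user", "sqlite stores events"),    # exact repeat, nothing new
        ("user", "sqlite stores receipts"),  # one new token out of three
    ]:
        print(role, collapse_score(text, role, seen))
        seen |= tokenize(text)
```

Under these assumptions a repeated message scores zero while a message that introduces new terms scores higher, which is the kind of noise-versus-signal separation the score is meant to provide.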
### Fixed retrieval budget
The retrieval count `K` is a hard cap on how many memory slices enter each prompt. This is the practical mechanism that keeps prompts stable as sessions age and prevents context bloat.
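
A small sketch of what the budget means at prompt-assembly time, assuming the slices arrive already ranked. The prompt template and the `build_prompt` helper are hypothetical.

```python
def build_prompt(query: str, ranked_slices: list, k: int) -> str:
    """Assemble a prompt from at most k memory slices, best match first."""
    selected = ranked_slices[: max(k, 0)]
    memory = "\n".join(f"- {s}" for s in selected)
    return f"Relevant memory:\n{memory}\n\nUser: {query}"


if __name__ == "__main__":
    short_history = [f"note {i}" for i in range(10)]
    long_history = [f"note {i}" for i in range(1000)]
    # Both prompts carry the same three slices, whatever the session length.
    print(build_prompt("status?", short_history, 3))
    print(build_prompt("status?", long_history, 3) == build_prompt("status?", short_history, 3))
```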
---
## User interface
### Chat
- Write events (user + assistant)
- Retrieve top-K relevant memories
- Save a receipt for the turn
### Manual Search
- Query the session memory directly
- Inspect matching events and their hashes
### Verify a Receipt
- Upload a receipt JSON file
- Verify that every referenced event exists in the session and that all digests and chain hashes match
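
The verification step can be sketched as a replay of the ledger followed by a comparison against the receipt. The receipt layout (`retrieved` entries with `event_id`, `digest`, `chain_hash`), the all-zero genesis value, and the function names are assumptions for illustration, not the Space's actual code.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed seed for the first chain link


def digest_of(payload: dict) -> str:
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def replay_ledger(payloads: list) -> dict:
    """Recompute digest and chain hash for every event, keyed by event_id."""
    prev, index = GENESIS, {}
    for payload in payloads:
        digest = digest_of(payload)
        prev = hashlib.sha256((prev + digest).encode("utf-8")).hexdigest()
        index[payload["event_id"]] = {"digest": digest, "chain_hash": prev}
    return index


def verify_receipt(receipt: dict, payloads: list) -> list:
    """Return a list of problems; an empty list means the receipt verifies."""
    ledger = replay_ledger(payloads)
    problems = []
    for ref in receipt["retrieved"]:
        entry = ledger.get(ref["event_id"])
        if entry is None:
            problems.append(f"{ref['event_id']}: not found in session")
            continue
        if entry["digest"] != ref["digest"]:
            problems.append(f"{ref['event_id']}: digest mismatch")
        if entry["chain_hash"] != ref["chain_hash"]:
            problems.append(f"{ref['event_id']}: chain hash mismatch")
    return problems
```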
---
## On-disk layout
All data lives under a single base directory:
- `index.sqlite` holds:
  - `events` table
  - `events_fts` FTS5 index
  - `receipts` metadata
- `sessions/<session_id>/events.jsonl` is the append-only source of truth
- `sessions/<session_id>/receipts/<receipt_id>.json` stores receipts
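
A sketch of helpers that resolve this layout and create the index file. The paths and table names follow the list above; the column definitions and the helper names (`events_log`, `receipt_path`, `open_index`) are assumptions, since the real schema is not shown here. FTS5 support in the local SQLite build is also assumed.

```python
import sqlite3
import tempfile
from pathlib import Path


def events_log(base: Path, session_id: str) -> Path:
    return base / "sessions" / session_id / "events.jsonl"


def receipt_path(base: Path, session_id: str, receipt_id: str) -> Path:
    return base / "sessions" / session_id / "receipts" / f"{receipt_id}.json"


def open_index(base: Path) -> sqlite3.Connection:
    """Open (or create) index.sqlite with the three objects named in the layout."""
    base.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(base / "index.sqlite"))
    # Column lists are illustrative guesses, not the Space's actual schema.
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS events (
            event_id   TEXT PRIMARY KEY,
            session_id TEXT NOT NULL,
            role       TEXT NOT NULL,
            text       TEXT NOT NULL,
            digest     TEXT NOT NULL,
            chain_hash TEXT NOT NULL
        );
        CREATE VIRTUAL TABLE IF NOT EXISTS events_fts
            USING fts5(event_id UNINDEXED, text);
        CREATE TABLE IF NOT EXISTS receipts (
            receipt_id TEXT PRIMARY KEY,
            session_id TEXT NOT NULL,
            path       TEXT NOT NULL
        );
        """
    )
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_index(Path(tmp))
        print(events_log(Path(tmp), "demo"))
        conn.close()
```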
---
## Running locally
```bash
pip install -r requirements.txt
python app.py