RFTSystems commited on
Commit
694eee4
·
verified ·
1 Parent(s): 4d7d528

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +104 -7
README.md CHANGED
@@ -1,14 +1,111 @@
1
  ---
2
  title: RFT Memory Receipt Engine
3
- emoji: 🌍
4
- colorFrom: red
5
- colorTo: purple
6
  sdk: gradio
7
- sdk_version: 6.2.0
 
8
  app_file: app.py
9
  pinned: false
10
- license: other
11
- short_description: 'demo: JSONL ledger + SQLite FTS + receipts + verification.'
 
 
 
 
 
 
 
 
12
  ---
13
 
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
  title: RFT Memory Receipt Engine
3
+ emoji: 🧾
4
+ colorFrom: indigo
5
+ colorTo: gray
6
  sdk: gradio
7
+ sdk_version: 4.44.0
8
+ python_version: 3.10
9
  app_file: app.py
10
  pinned: false
11
+ license: mit
12
+ tags:
13
+ - gradio
14
+ - agents
15
+ - rag
16
+ - retrieval
17
+ - memory
18
+ - sqlite
19
+ - observability
20
+ - reproducibility
21
  ---
22
 
23
+ # RFT Memory Receipt Engine (Local Persistence + Verifiable Retrieval)
24
+
25
+ I built this Space to solve a problem most “agent memory” systems avoid: **you can’t trust what you can’t verify**. Persisting chat history is easy. Proving what actually influenced an output is the hard part.
26
+
27
+ This Space is a local persistence engine for agents and chat systems that:
28
+ - stores every turn as an **append-only event log**
29
+ - indexes it for **fast retrieval** (SQLite FTS)
30
+ - generates a **cryptographic receipt** for every assistant turn that lists the exact memory slices used
31
+ - verifies receipts by checking event hashes and chain integrity
32
+
33
+ The result is durable memory with **audit-grade lineage**.
34
+
35
+ ---
36
+
37
+ ## What this Space demonstrates
38
+
39
+ ### 1) Durable session memory (outside the model context)
40
+ - Every message is written as an event to an append-only JSONL log.
41
+ - Sessions persist across restarts when you store them on persistent disk.
42
+
43
+ ### 2) Targeted retrieval instead of full history replay
44
+ - Rather than replaying an ever-growing transcript, you retrieve a fixed number of relevant memory slices per turn.
45
+ - Retrieval is lexical (FTS) in this version for maximum reliability and zero embedding dependencies.
46
+
47
+ ### 3) Memory receipts (provable continuity)
48
+ Each assistant turn produces a receipt that contains:
49
+ - the user query
50
+ - the retrieved events (IDs + text snippets)
51
+ - the cryptographic digests of those events
52
+ - the chain hash that proves their position in the append-only ledger
53
+ - prompt hash + response hash for end-to-end traceability
54
+
55
+ You can upload a receipt back into the Space and verify it.
56
+
57
+ ---
58
+
59
+ ## Core design (RFT-aligned)
60
+
61
+ ### Append-only ledger + hash chain
62
+ Each event is hashed, then chained to the prior event:
63
+
64
+ - `digest = sha256(canonical_event_payload)`
65
+ - `chain_hash = sha256(prev_chain_hash + digest)`
66
+
67
+ This gives you tamper-evidence across the entire session history.
68
+
69
+ ### Collapse scoring (memory promotion signal)
70
+ Events are assigned a lightweight “collapse score” that estimates long-term value using novelty + role weighting. This is designed to help separate noise from signal as sessions grow.
71
+
72
+ ### Fixed retrieval budget
73
+ Retrieval count `K` is a hard control knob. This is the practical mechanism that keeps prompts stable as sessions age and prevents context bloat.
74
+
75
+ ---
76
+
77
+ ## User interface
78
+
79
+ ### Chat
80
+ - Write events (user + assistant)
81
+ - Retrieve top-K relevant memories
82
+ - Save a receipt for the turn
83
+
84
+ ### Manual Search
85
+ - Query the session memory directly
86
+ - Inspect matching events and their hashes
87
+
88
+ ### Verify a Receipt
89
+ - Upload a receipt JSON file
90
+ - Verify that every referenced event exists in the session and that all digests and chain hashes match
91
+
92
+ ---
93
+
94
+ ## On-disk layout
95
+
96
+ All data lives under a single base directory:
97
+
98
+ - `index.sqlite` holds:
99
+ - `events` table
100
+ - `events_fts` FTS5 index
101
+ - `receipts` metadata
102
+ - `sessions/<session_id>/events.jsonl` is the append-only source of truth
103
+ - `sessions/<session_id>/receipts/<receipt_id>.json` stores receipts
104
+
105
+ ---
106
+
107
+ ## Running locally
108
+
109
+ ```bash
110
+ pip install -r requirements.txt
111
+ python app.py