AsadIsmail committed on
Commit 7f94cde · verified · 1 Parent(s): 53abcbd

Publish PRISM-Memory adapter bundle

README.md CHANGED
@@ -137,6 +137,7 @@ More held-out examples live in
  - [docs/release/datasets.md](docs/release/datasets.md)
  - [docs/release/extraction-examples.md](docs/release/extraction-examples.md)
  - [docs/release/extraction-skill.md](docs/release/extraction-skill.md)
+ - [docs/release/memory-scenarios.md](docs/release/memory-scenarios.md)
  - [docs/release/release-results.md](docs/release/release-results.md)
  - [docs/release/technical-blog.md](docs/release/technical-blog.md)
  - [results/confirmed_exp15_summary.json](results/confirmed_exp15_summary.json)
docs/release/memory-scenarios.md ADDED
@@ -0,0 +1,105 @@
+ # PRISM-Memory End-To-End Scenarios
+
+ These are compact product-style scenarios built from the public release
+ artifacts.
+
+ - The first two use the released held-out extraction examples.
+ - The last two use confirmed held-out benchmark cases from
+   [../../results/scenario_comparisons.json](../../results/scenario_comparisons.json).
+
+ The point is not just that the extractor matches GPT-4.1-style labels. The
+ point is that a later system can ask a concrete question and get back a useful,
+ inspectable answer from stored memory.
+
+ ## 1. Keep hard limits and notification preferences
+
+ **Conversation turn**
+
+ > yeah, I think starting with incremental scans and parallel matrix jobs makes sense. We have 20 concurrent jobs max on GitHub Actions currently. Also want to keep Slack notifications from Snyk consistent with other pipeline alerts, aggregated and concise.
+
+ **Stored memory**
+
+ - GitHub Actions concurrency limit: 20 concurrent jobs
+ - Snyk Slack notifications should be aggregated and concise
+
+ **Later question**
+
+ What is our GitHub Actions concurrency limit, and how should Snyk alerts look?
+
+ **Answer from memory**
+
+ 20 concurrent jobs. Snyk alerts should be aggregated and concise.
+
+ **Why it matters**
+
+ This is the kind of operational detail that gets buried in chat but needs to
+ survive into later workflow drafts and agent actions.
+
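The scenario above can be sketched in a few lines. This is only an illustration: `MemoryRecord`, `tokens`, and `retrieve` are hypothetical names, and the naive token-overlap lookup stands in for the release's hybrid retrieval stack, which is not described in this file.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryRecord:
    """Hypothetical record shape; the released schema may differ."""
    text: str


def tokens(s: str) -> set[str]:
    # Lowercase alphanumeric tokens; punctuation is discarded.
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def retrieve(question: str, store: list[MemoryRecord],
             min_overlap: int = 2) -> list[MemoryRecord]:
    # Keep records that share at least `min_overlap` tokens with the question.
    # This is a stand-in heuristic, not the project's retriever.
    q = tokens(question)
    return [r for r in store if len(q & tokens(r.text)) >= min_overlap]


# The two stored memories from the scenario above.
store = [
    MemoryRecord("GitHub Actions concurrency limit: 20 concurrent jobs"),
    MemoryRecord("Snyk Slack notifications should be aggregated and concise"),
]

hits = retrieve(
    "What is our GitHub Actions concurrency limit, and how should Snyk alerts look?",
    store,
)
for record in hits:
    print(record.text)
```

Both stored records come back for the scenario's question, while a question with no shared vocabulary returns an empty list. A real retriever would need something better than raw token overlap, since common words such as "and" count toward the threshold here.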
+ ## 2. Keep current state separate from the roadmap
+
+ **Conversation turn**
+
+ > yeah good point about resource overhead, we set CPU limits for all sidecars and monitor with Prometheus now. no mTLS yet, but it’s on the roadmap for phase two. as for routing, we want to start with canary deployments and traffic splitting, maybe some basic fault injection for testing.
+
+ **Stored memory**
+
+ - Sidecar CPU limits set and monitored via Prometheus
+ - Istio mTLS planned for phase two
+ - Routing strategy: canary deployments and traffic splitting; basic fault injection planned
+
+ **Later question**
+
+ Did we already enable mTLS, and what rollout strategy are we planning?
+
+ **Answer from memory**
+
+ mTLS is not enabled yet; it is planned for phase two. The rollout plan is
+ canary deployments and traffic splitting, with basic fault injection planned.
+
+ **Why it matters**
+
+ Memory systems often blur the current state with the planned state. This is the
+ kind of distinction that matters in deployment and incident work.
+
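One way to picture the current-versus-planned distinction is an explicit status on each stored fact. The sketch below is an assumption, not the released format: the `Status` enum, the `Fact` type, the `topic` keys, and the status assigned to each record are all hypothetical and hand-labeled from the conversation turn above.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    CURRENT = "current"   # already true today
    PLANNED = "planned"   # on the roadmap, not yet true


@dataclass(frozen=True)
class Fact:
    topic: str     # hypothetical lookup key, not part of the release
    text: str      # memory text as stored in the scenario
    status: Status


# Statuses are hand-assigned for illustration only.
facts = [
    Fact("sidecar-cpu", "Sidecar CPU limits set and monitored via Prometheus",
         Status.CURRENT),
    Fact("mtls", "Istio mTLS planned for phase two", Status.PLANNED),
    Fact("routing",
         "Routing strategy: canary deployments and traffic splitting; "
         "basic fault injection planned",
         Status.PLANNED),
]


def is_enabled(topic: str, known: list[Fact]) -> bool:
    # True only when some fact on the topic describes the current state.
    return any(f.topic == topic and f.status is Status.CURRENT for f in known)


def planned(known: list[Fact]) -> list[str]:
    # Everything that is still roadmap rather than reality.
    return [f.text for f in known if f.status is Status.PLANNED]


print(is_enabled("mtls", facts))
print(planned(facts))
```

With the status kept as data, "Did we already enable mTLS?" becomes a check against current facts only, so a planned item cannot be reported as live.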
+ ## 3. Answer dated questions instead of only remembering themes
+
+ **Question**
+
+ Which hobby did Sam take up in May 2023?
+
+ **Retrieved memory**
+
+ - Sam: [18 May 2023] Sam is considering trying painting as a new hobby.
+ - Sam: [24 May 2023] Sam has been considering trying painting as a new hobby.
+
+ **Answer from memory**
+
+ painting
+
+ **Why it matters**
+
+ A useful memory system should not just remember that someone talked about
+ hobbies. It should recover the dated fact that actually answers the later
+ question.
+
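The retrieved records above carry their dates inline as `[18 May 2023]`. A minimal sketch of using that tag to narrow records to a month follows; the bracketed format is taken from the example, while `record_date` and `in_month` are hypothetical helpers, and month-name parsing assumes an English locale.

```python
import re
from datetime import date, datetime
from typing import Optional

# Matches tags such as "[18 May 2023]" as they appear in the retrieved memory.
DATE_TAG = re.compile(r"\[(\d{1,2} [A-Za-z]+ \d{4})\]")


def record_date(record: str) -> Optional[date]:
    # Return the date embedded in a record, or None if it has no date tag.
    match = DATE_TAG.search(record)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%d %B %Y").date()


def in_month(records: list[str], year: int, month: int) -> list[str]:
    # Keep only records whose embedded date falls in the given month.
    kept = []
    for record in records:
        d = record_date(record)
        if d is not None and (d.year, d.month) == (year, month):
            kept.append(record)
    return kept


records = [
    "Sam: [18 May 2023] Sam is considering trying painting as a new hobby.",
    "Sam: [24 May 2023] Sam has been considering trying painting as a new hobby.",
]

may_2023 = in_month(records, 2023, 5)
```

Both records survive the May 2023 filter, and a query for any other month returns nothing. How the released system actually uses dates during retrieval is not specified here.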
+ ## 4. Refuse unsupported claims instead of inventing a reason
+
+ **Question**
+
+ Why did Dave get his guitar customized with a shiny finish?
+
+ **Retrieved memory**
+
+ - Dave: That guitar has a gorgeous purple hue. Why did you make it so shiny?
+ - Good pick! The customized purple glow gives it a unique look that really stands out.
+ - Dave: The guitar was in bad condition when Dave found it.
+
+ **Answer from memory**
+
+ None / unsupported
+
+ **Why it matters**
+
+ Memory systems are more useful when they can refuse cleanly. Here the retrieved
+ context talks about the guitar and the finish, but it never actually supports
+ the premise that Dave customized it for a specific reason.
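
The refusal contract can be sketched as a function that returns `None` when nothing in the retrieved context states a reason. This is a deliberately crude stand-in: in the release the refusal comes from the model itself, whereas `answer_why` and the `REASON_MARKERS` list below are hypothetical and only check for explicit causal phrasing.

```python
from typing import Optional

# Hypothetical markers of an explicitly stated reason; a keyword check is an
# illustration of the contract, not how the released model decides.
REASON_MARKERS = ("because", "so that", "in order to")


def answer_why(retrieved: list[str]) -> Optional[str]:
    # Return the first record that states a reason, or None to refuse.
    for text in retrieved:
        lowered = text.lower()
        if any(marker in lowered for marker in REASON_MARKERS):
            return text
    return None


retrieved = [
    "Dave: That guitar has a gorgeous purple hue. Why did you make it so shiny?",
    "Good pick! The customized purple glow gives it a unique look that really stands out.",
    "Dave: The guitar was in bad condition when Dave found it.",
]

result = answer_why(retrieved)
print("None / unsupported" if result is None else result)
```

None of the three records states a reason, so the sketch refuses. The design point is that `None` is a first-class return value that callers must handle, not an error path.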
docs/release/technical-blog.md CHANGED
@@ -6,6 +6,9 @@
  dialogue into proposition-level memory and retrieves it with an inspectable
  hybrid stack.
  
+ The point is not that a 7B model chats well. The point is that a 7B open model
+ can write memory records that another system can actually use later.
+
  This package now ships one public extraction skill and one public checkpoint:
  
  - **Checkpoint:** `exp15_sft_qwen7b_4ep`
@@ -17,6 +20,70 @@ The public hook is simple:
  
  **PRISM-Memory turns conversations into durable, searchable memory.**
  
+ ## Why This Is Useful In Practice
+
+ A memory writer is only interesting if a later system can ask a pointed
+ question and get back a useful answer without rereading the original chat. The
+ public release artifacts already show that pattern.
+
+ ### 1. Keep hard limits and preferences available for later work
+
+ The extractor can turn a single conversational turn into stable memory like:
+
+ - GitHub Actions concurrency limit: `20` concurrent jobs
+ - Snyk Slack notifications should be aggregated and concise
+
+ That means a later system can answer:
+
+ > What is our GitHub Actions concurrency limit, and how should Snyk alerts look?
+
+ with:
+
+ > `20` concurrent jobs. Alerts should be aggregated and concise.
+
+ That is a real product use case. Teams mention constraints and preferences once,
+ then expect downstream tools and agents to remember them.
+
+ ### 2. Keep current state separate from the roadmap
+
+ The released extractor can also preserve the difference between what is true
+ now and what is only planned:
+
+ - sidecar CPU limits are already set and monitored
+ - mTLS is planned for phase two
+ - rollout strategy is canary deployments plus traffic splitting
+
+ So a later question like:
+
+ > Did we already enable mTLS, and what rollout strategy are we planning?
+
+ can be answered without confusing the present state with the future plan.
+
+ This is a core memory problem, not a style problem. Chat history tends to blur
+ these states together.
+
+ ### 3. Answer dated questions with dated evidence
+
+ One confirmed held-out benchmark case asks:
+
+ > Which hobby did Sam take up in May 2023?
+
+ The retrieved memory contains explicit dated propositions about Sam trying
+ painting in May 2023, and the released system answers:
+
+ > painting
+
+ That matters because the useful behavior is not “remember that hobbies were
+ discussed.” The useful behavior is “recover the dated fact that actually
+ answers the later question.”
+
+ There is a fourth practical behavior that matters too: refusal. On the held-out
+ adversarial guitar case, the released model returns `None` instead of inventing
+ a reason for an unsupported premise. That is also part of being useful.
+
+ For the compact scenario version of this story, see
+ [memory-scenarios.md](memory-scenarios.md).
+
  ## What The Repo Actually Contributed
  
  The core contribution is not another opaque memory model. The repo showed that a