AsadIsmail committed on
Commit 53abcbd · verified · 1 Parent(s): 047d480

Publish PRISM-Memory adapter bundle

Files changed (2)
  1. README.md +17 -2
  2. docs/release/datasets.md +46 -0
README.md CHANGED
@@ -50,8 +50,9 @@ This repo contains the adapter weights only. You still need the base model.
50
 
51
  ## Training data
52
 
53
- PRISM-Memory was trained on GPT-4.1-derived proposition labels over synthetic
54
- multi-session memory conversations.
55
 
56
  | File | Examples | Role |
57
  |---|---:|---|
@@ -64,6 +65,20 @@ The released checkpoint uses a `20k` sample from `train_sft.jsonl`. See
64
  [docs/release/datasets.md](docs/release/datasets.md) for the full inventory,
65
  the evaluation surfaces, and the ablations that regressed.
66
 
67
  ## Confirmed results
68
 
69
  | Benchmark | PRISM-Memory | GPT-4.1-based PropMem reference |
 
50
 
51
  ## Training data
52
 
53
+ PRISM-Memory was trained on **synthetic** multi-session memory conversations
54
+ with **GPT-4.1-derived proposition labels**. The public release does not use
55
+ real user chat logs.
56
 
57
  | File | Examples | Role |
58
  |---|---:|---|
 
65
  [docs/release/datasets.md](docs/release/datasets.md) for the full inventory,
66
  the evaluation surfaces, and the ablations that regressed.
67
 
68
+ ### Example data item
69
+
70
+ **Synthetic turn**
71
+
72
+ > yeah, I think starting with incremental scans and parallel matrix jobs makes sense. We have 20 concurrent jobs max on GitHub Actions currently. Also want to keep Slack notifications from Snyk consistent with other pipeline alerts, aggregated and concise.
73
+
74
+ **Target propositions**
75
+
76
+ - GitHub Actions concurrency limit: 20 concurrent jobs
77
+ - Wants Snyk Slack notifications aggregated and concise, consistent with other pipeline alerts
78
+
79
+ The current release makes the data recipe and examples public. The full raw
80
+ training JSONLs are not bundled in this model repo.
81
+
82
  ## Confirmed results
83
 
84
  | Benchmark | PRISM-Memory | GPT-4.1-based PropMem reference |
docs/release/datasets.md CHANGED
@@ -3,6 +3,17 @@
3
  This file separates the data used by the public `PRISM-Memory` release from the
4
  auxiliary datasets that were only useful for ablations.
5
 
6
  ## Released Training Recipe
7
 
8
  The released checkpoint is `exp15_sft_qwen7b_4ep`.
@@ -27,6 +38,24 @@ The underlying synthetic conversation source lives in the upstream
27
  The source generator was built to create long-horizon memory stress cases with
28
  inserts, updates, deletes, and multi-session recall.
29
 
30
  ## Derived SFT Data
31
 
32
  These are GPT-4.1-derived proposition labels built on top of the raw
@@ -51,6 +80,23 @@ The released model was evaluated on two held-out surfaces:
51
  Both the GPT-4.1 extraction baseline and the released 7B extractor were scored
52
  with the same GPT-4.1 QA evaluator and the same cache-backed answer surface.
53
 
54
  ## Auxiliary LoCoMo Datasets
55
 
56
  These files were used in ablations and targeted probes. They matter for the
 
3
  This file separates the data used by the public `PRISM-Memory` release from the
4
  auxiliary datasets that were only useful for ablations.
5
 
6
+ ## Data Provenance
7
+
8
+ The release training data is **synthetic**.
9
+
10
+ - The conversation source was programmatically generated to stress long-horizon
11
+ memory behavior such as inserts, updates, deletes, contradiction handling,
12
+ and multi-session recall.
13
+ - The SFT labels were then derived from those synthetic conversations with a
14
+ GPT-4.1 proposition extractor.
15
+ - No real end-user chat logs are part of this public release story.
16
+
17
  ## Released Training Recipe
18
 
19
  The released checkpoint is `exp15_sft_qwen7b_4ep`.
 
38
  The source generator was built to create long-horizon memory stress cases with
39
  inserts, updates, deletes, and multi-session recall.
40
 
41
+ ## Example Training Item
42
+
43
+ This is the shape of the data the model learned from: a synthetic dialogue turn
44
+ paired with proposition-style extraction targets.
45
+
46
+ **Synthetic turn**
47
+
48
+ > yeah, I think starting with incremental scans and parallel matrix jobs makes sense. We have 20 concurrent jobs max on GitHub Actions currently. Also want to keep Slack notifications from Snyk consistent with other pipeline alerts, aggregated and concise.
49
+
50
+ **Target propositions**
51
+
52
+ - GitHub Actions concurrency limit: 20 concurrent jobs
53
+ - Wants Snyk Slack notifications aggregated and concise, consistent with other pipeline alerts
54
+
55
+ This example is illustrative of the release data format. The exact public
56
+ release checkpoint was trained on the larger `train_sft.jsonl` corpus, not on
57
+ just this slice.
58
+
59
  ## Derived SFT Data
60
 
61
  These are GPT-4.1-derived proposition labels built on top of the raw
 
80
  Both the GPT-4.1 extraction baseline and the released 7B extractor were scored
81
  with the same GPT-4.1 QA evaluator and the same cache-backed answer surface.
82
 
83
+ ## What Is Public Right Now
84
+
85
+ Public now:
86
+
87
+ - dataset description and counts
88
+ - held-out extraction examples
89
+ - release metrics and benchmark breakdowns
90
+
91
+ Not public yet:
92
+
93
+ - the raw `train.jsonl` and `eval.jsonl` conversation files
94
+ - the full `train_sft.jsonl` and `train_sft_clean_merged.jsonl` label files
95
+ - the auxiliary LoCoMo ablation JSONLs
96
+
97
+ So the current release makes the **data recipe** public, but not the full raw
98
+ training corpora.
99
+
100
  ## Auxiliary LoCoMo Datasets
101
 
102
  These files were used in ablations and targeted probes. They matter for the
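
The example item added in this diff pairs a synthetic turn with its target propositions. As a rough sketch of how one such item might be stored as a single JSONL line, the Python below round-trips it through the standard `json` module. The field names `turn` and `propositions` and both helper functions are assumptions made for illustration; the diff does not show the actual `train_sft.jsonl` schema.

```python
import json

# Hypothetical record layout: the keys "turn" and "propositions" are
# illustrative assumptions, not the documented train_sft.jsonl schema.
example_item = {
    "turn": (
        "yeah, I think starting with incremental scans and parallel matrix jobs "
        "makes sense. We have 20 concurrent jobs max on GitHub Actions currently. "
        "Also want to keep Slack notifications from Snyk consistent with other "
        "pipeline alerts, aggregated and concise."
    ),
    "propositions": [
        "GitHub Actions concurrency limit: 20 concurrent jobs",
        "Wants Snyk Slack notifications aggregated and concise, "
        "consistent with other pipeline alerts",
    ],
}


def to_jsonl_line(item: dict) -> str:
    """Serialize one item as a single line; json.dumps escapes any newlines."""
    return json.dumps(item, ensure_ascii=False)


def from_jsonl_line(line: str) -> dict:
    """Parse one JSONL line back into a dictionary."""
    return json.loads(line)


line = to_jsonl_line(example_item)
restored = from_jsonl_line(line)
```

Keeping one item per line is what makes sampling a subset (such as the `20k` sample mentioned in the README) a matter of picking lines, without parsing the whole file as one document.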