Text Generation
PEFT
Safetensors
conversational-memory
information-extraction
long-context
lora
qwen2.5
conversational
Instructions to use AsadIsmail/prism-memory with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use AsadIsmail/prism-memory with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct") model = PeftModel.from_pretrained(base_model, "AsadIsmail/prism-memory") - Notebooks
- Google Colab
- Kaggle
Publish PRISM-Memory adapter bundle
Browse files- README.md +17 -2
- docs/release/datasets.md +46 -0
README.md
CHANGED
|
@@ -50,8 +50,9 @@ This repo contains the adapter weights only. You still need the base model.
|
|
| 50 |
|
| 51 |
## Training data
|
| 52 |
|
| 53 |
-
PRISM-Memory was trained on
|
| 54 |
-
|
|
|
|
| 55 |
|
| 56 |
| File | Examples | Role |
|
| 57 |
|---|---:|---|
|
|
@@ -64,6 +65,20 @@ The released checkpoint uses a `20k` sample from `train_sft.jsonl`. See
|
|
| 64 |
[docs/release/datasets.md](docs/release/datasets.md) for the full inventory,
|
| 65 |
the evaluation surfaces, and the ablations that regressed.
|
| 66 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 67 |
## Confirmed results
|
| 68 |
|
| 69 |
| Benchmark | PRISM-Memory | GPT-4.1-based PropMem reference |
|
|
|
|
| 50 |
|
| 51 |
## Training data
|
| 52 |
|
| 53 |
+
PRISM-Memory was trained on **synthetic** multi-session memory conversations
|
| 54 |
+
with **GPT-4.1-derived proposition labels**. The public release does not use
|
| 55 |
+
real user chat logs.
|
| 56 |
|
| 57 |
| File | Examples | Role |
|
| 58 |
|---|---:|---|
|
|
|
|
| 65 |
[docs/release/datasets.md](docs/release/datasets.md) for the full inventory,
|
| 66 |
the evaluation surfaces, and the ablations that regressed.
|
| 67 |
|
| 68 |
+
### Example data item
|
| 69 |
+
|
| 70 |
+
**Synthetic turn**
|
| 71 |
+
|
| 72 |
+
> yeah, I think starting with incremental scans and parallel matrix jobs makes sense. We have 20 concurrent jobs max on GitHub Actions currently. Also want to keep Slack notifications from Snyk consistent with other pipeline alerts, aggregated and concise.
|
| 73 |
+
|
| 74 |
+
**Target propositions**
|
| 75 |
+
|
| 76 |
+
- GitHub Actions concurrency limit: 20 concurrent jobs
|
| 77 |
+
- Wants Snyk Slack notifications aggregated and concise, consistent with other pipeline alerts
|
| 78 |
+
|
| 79 |
+
The current release makes the data recipe and examples public. The full raw
|
| 80 |
+
training JSONLs are not bundled in this model repo.
|
| 81 |
+
|
| 82 |
## Confirmed results
|
| 83 |
|
| 84 |
| Benchmark | PRISM-Memory | GPT-4.1-based PropMem reference |
|
docs/release/datasets.md
CHANGED
|
@@ -3,6 +3,17 @@
|
|
| 3 |
This file separates the data used by the public `PRISM-Memory` release from the
|
| 4 |
auxiliary datasets that were only useful for ablations.
|
| 5 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 6 |
## Released Training Recipe
|
| 7 |
|
| 8 |
The released checkpoint is `exp15_sft_qwen7b_4ep`.
|
|
@@ -27,6 +38,24 @@ The underlying synthetic conversation source lives in the upstream
|
|
| 27 |
The source generator was built to create long-horizon memory stress cases with
|
| 28 |
inserts, updates, deletes, and multi-session recall.
|
| 29 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 30 |
## Derived SFT Data
|
| 31 |
|
| 32 |
These are GPT-4.1-derived proposition labels built on top of the raw
|
|
@@ -51,6 +80,23 @@ The released model was evaluated on two held-out surfaces:
|
|
| 51 |
Both the GPT-4.1 extraction baseline and the released 7B extractor were scored
|
| 52 |
with the same GPT-4.1 QA evaluator and the same cache-backed answer surface.
|
| 53 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 54 |
## Auxiliary LoCoMo Datasets
|
| 55 |
|
| 56 |
These files were used in ablations and targeted probes. They matter for the
|
|
|
|
| 3 |
This file separates the data used by the public `PRISM-Memory` release from the
|
| 4 |
auxiliary datasets that were only useful for ablations.
|
| 5 |
|
| 6 |
+
## Data Provenance
|
| 7 |
+
|
| 8 |
+
The release training data is **synthetic**.
|
| 9 |
+
|
| 10 |
+
- The conversation source was programmatically generated to stress long-horizon
|
| 11 |
+
memory behavior such as inserts, updates, deletes, contradiction handling,
|
| 12 |
+
and multi-session recall.
|
| 13 |
+
- The SFT labels were then derived from those synthetic conversations with a
|
| 14 |
+
GPT-4.1 proposition extractor.
|
| 15 |
+
- No real end-user chat logs are part of this public release story.
|
| 16 |
+
|
| 17 |
## Released Training Recipe
|
| 18 |
|
| 19 |
The released checkpoint is `exp15_sft_qwen7b_4ep`.
|
|
|
|
| 38 |
The source generator was built to create long-horizon memory stress cases with
|
| 39 |
inserts, updates, deletes, and multi-session recall.
|
| 40 |
|
| 41 |
+
## Example Training Item
|
| 42 |
+
|
| 43 |
+
This is the shape of the data the model learned from: a synthetic dialogue turn
|
| 44 |
+
paired with proposition-style extraction targets.
|
| 45 |
+
|
| 46 |
+
**Synthetic turn**
|
| 47 |
+
|
| 48 |
+
> yeah, I think starting with incremental scans and parallel matrix jobs makes sense. We have 20 concurrent jobs max on GitHub Actions currently. Also want to keep Slack notifications from Snyk consistent with other pipeline alerts, aggregated and concise.
|
| 49 |
+
|
| 50 |
+
**Target propositions**
|
| 51 |
+
|
| 52 |
+
- GitHub Actions concurrency limit: 20 concurrent jobs
|
| 53 |
+
- Wants Snyk Slack notifications aggregated and concise, consistent with other pipeline alerts
|
| 54 |
+
|
| 55 |
+
This example is illustrative of the release data format. The exact public
|
| 56 |
+
release checkpoint was trained on the larger `train_sft.jsonl` corpus, not on
|
| 57 |
+
just this slice.
|
| 58 |
+
|
| 59 |
## Derived SFT Data
|
| 60 |
|
| 61 |
These are GPT-4.1-derived proposition labels built on top of the raw
|
|
|
|
| 80 |
Both the GPT-4.1 extraction baseline and the released 7B extractor were scored
|
| 81 |
with the same GPT-4.1 QA evaluator and the same cache-backed answer surface.
|
| 82 |
|
| 83 |
+
## What Is Public Right Now
|
| 84 |
+
|
| 85 |
+
Public now:
|
| 86 |
+
|
| 87 |
+
- dataset description and counts
|
| 88 |
+
- held-out extraction examples
|
| 89 |
+
- release metrics and benchmark breakdowns
|
| 90 |
+
|
| 91 |
+
Not public yet:
|
| 92 |
+
|
| 93 |
+
- the raw `train.jsonl` and `eval.jsonl` conversation files
|
| 94 |
+
- the full `train_sft.jsonl` and `train_sft_clean_merged.jsonl` label files
|
| 95 |
+
- the auxiliary LoCoMo ablation JSONLs
|
| 96 |
+
|
| 97 |
+
So the current release makes the **data recipe** public, but not the full raw
|
| 98 |
+
training corpora.
|
| 99 |
+
|
| 100 |
## Auxiliary LoCoMo Datasets
|
| 101 |
|
| 102 |
These files were used in ablations and targeted probes. They matter for the
|