Text Generation
PEFT
Safetensors
conversational-memory
information-extraction
long-context
lora
qwen2.5
conversational
Instructions to use AsadIsmail/prism-memory with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use AsadIsmail/prism-memory with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct") model = PeftModel.from_pretrained(base_model, "AsadIsmail/prism-memory") - Notebooks
- Google Colab
- Kaggle
File size: 4,465 Bytes
7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 7f94cde 9088f51 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 | # PRISM-Memory End-To-End Scenarios
These are compact product-style scenarios built from the public release
artifacts.
- The first three come from the bundled interactive session artifact
[../../results/try_it_sessions.json](../../results/try_it_sessions.json).
- The last one comes from the held-out extraction examples in
[../../results/extraction_examples.json](../../results/extraction_examples.json).
The point is not just that the extractor matches GPT-4.1-style labels. The
point is that a later system can ask a concrete question and get back a useful,
inspectable answer from stored memory.
## 1. Keep hard limits and notification preferences
**Conversation turns**
> [2025-03-01] Dana: We have 20 concurrent jobs max on GitHub Actions right now. Keep Snyk Slack notifications aggregated and concise, not one alert per repo.
> [2025-03-07] Dana: No mTLS yet. Put it in phase two after the canary rollout.
**Stored memory**
- GitHub Actions concurrency limit: 20 concurrent jobs
- Snyk Slack notifications aggregated and concise; no separate alerts per repo
- No mutual TLS enabled; mutual TLS will be implemented in phase two after the canary rollout
**Later question**
What is the current concurrency limit, how should Slack alerts behave, and when
is mTLS planned?
**Answer from memory**
The current GitHub Actions concurrency limit is 20 jobs. Slack notifications
should be aggregated and concise. mTLS is planned for phase two after the
canary rollout.
**Why it matters**
This is the kind of operational detail that gets buried in chat but needs to
survive into later workflow drafts and agent actions.
## 2. Replace stale travel plans with the current one
**Conversation turns**
> [2025-06-12] Maya: We booked the Lisbon trip for September 14. I want a quiet hotel near Alfama, and no red-eye flights.
> [2025-07-02] Maya: Update the plan: Lisbon is off. We are going to Porto on September 21 instead, still no red-eye flights.
**Stored memory**
- Lisbon trip booked for September 14
- Prefers a quiet hotel near Alfama
- Prefers no red-eye flights
- Lisbon trip canceled; Porto trip scheduled for September 21
- No red-eye flights for Porto trip
**Later question**
Where is the trip now, on what date, and what flight constraint still applies?
**Answer from memory**
The current trip is to Porto on September 21, 2025, and red-eye flights are
still off limits.
**Why it matters**
Memory systems often keep both the old and new plan, but the downstream system
still has to recover the latest valid state.
## 3. Keep routines and health constraints available later
**Conversation turns**
> [2025-08-03] Sam: My doctor wants me to keep sodium under 2 grams a day. I started painting on weekends because it helps me decompress.
> [2025-08-17] Sam: I bought watercolors and signed up for a Saturday class downtown.
> [2025-09-01] Sam: Skip late-night coffee from now on; it wrecks my sleep.
**Stored memory**
- Doctor recommended Sam limit sodium intake to 2 grams per day
- Sam paints on weekends as a decompression activity
- Sam bought watercolors and signed up for a Saturday painting class downtown
- Sam decided to skip late-night coffee starting now
**Later question**
What hobby did Sam start, what class did he sign up for, and what health
constraints matter now?
**Answer from memory**
Sam started painting, signed up for a Saturday watercolor class downtown,
should keep sodium under 2 grams per day, and wants to avoid late-night coffee.
**Why it matters**
This is the difference between transcript search and usable memory. The durable
facts can drive reminders, planning, or coaching later.
## 4. Keep infrastructure bottlenecks structured
**Conversation turn**
> yeah, no real caching beyond basic Docker layer caching. Jenkins nodes have limited capacity, and we sometimes hit queue delays during peak commits.
**Stored memory**
- No Docker caching beyond basic layer caching
- Jenkins nodes have limited capacity; peak commits cause queue delays
**Later question**
What is our current caching setup, and why do builds sometimes queue up?
**Answer from memory**
There is no special caching beyond basic Docker layer caching. Builds queue up
because Jenkins nodes have limited capacity during peak commits.
**Why it matters**
The stored memory stays compact and directly useful. A later system does not
need to reread the full conversation turn to answer the operational question.
|