How to use AsadIsmail/prism-memory with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base_model, "AsadIsmail/prism-memory")
```
# PRISM-Memory Release Results

This page summarizes the confirmed public release metrics and the internal
comparison evidence that informed the release choice.
## Released Model

- Model: `PRISM-Memory 7B Adapter`
- Base model: `Qwen/Qwen2.5-7B-Instruct`
- Adapter type: LoRA
- Confirmed LoCoMo mean: `0.4981204463`
- Confirmed LongMemEval mean: `0.4767574431`
- QA cache hits during confirmation: `460`
- QA cache misses during confirmation: `0`
## Public Comparison

PRISM-Memory fine-tunes `Qwen/Qwen2.5-7B-Instruct` to perform the memory
extraction step that the PropMem reference delegates to GPT-4.1.

| Benchmark | PRISM-Memory | GPT-4.1-based PropMem reference | Takeaway |
|---|---:|---:|---|
| LongMemEval | `0.4768` | `0.4650` | PRISM-Memory leads |
| LoCoMo | `0.4981` | `0.5360` | PRISM-Memory trails but stays competitive |

The QA layer is held constant across both systems, so this is an
extraction-step comparison, not a claim that PRISM-Memory replaces GPT-4.1
end to end.
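To put the gaps in the comparison table in concrete terms, the short Python sketch below computes the absolute and relative difference for each benchmark. The scores are copied from the table; the dictionary layout and the `score_gap` helper are illustrative names, not part of the PRISM-Memory repository.

```python
# Scores copied from the comparison table (rounded to four decimals there).
SCORES = {
    "LongMemEval": {"prism": 0.4768, "reference": 0.4650},
    "LoCoMo": {"prism": 0.4981, "reference": 0.5360},
}


def score_gap(prism: float, reference: float) -> tuple[float, float]:
    """Return (absolute gap, gap relative to the reference score)."""
    absolute = prism - reference
    return round(absolute, 4), round(absolute / reference, 4)


# Hypothetical helper structure: benchmark name -> (absolute, relative) gap.
GAPS = {name: score_gap(s["prism"], s["reference"]) for name, s in SCORES.items()}

for name, (absolute, relative) in GAPS.items():
    print(f"{name}: {absolute:+.4f} absolute, {relative:+.2%} relative")
```

On these numbers PRISM-Memory is ahead by about 0.0118 on LongMemEval (roughly 2.5% of the reference score) and behind by about 0.0379 on LoCoMo (roughly 7.1% of the reference score).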
## LoCoMo Breakdown

| Category | Score |
|---|---:|
| factual | `0.3339551926` |
| temporal | `0.4978785870` |
| inferential | `0.2605997475` |
| multi-hop | `0.5144477744` |
| adversarial | `0.8837209302` |
## LongMemEval Breakdown

| Category | Score |
|---|---:|
| knowledge-update | `0.5588405797` |
| multi-session | `0.1390977444` |
| single-session-assistant | `0.7656395892` |
| single-session-preference | `0.0519667456` |
| single-session-user | `0.9133333333` |
| temporal-reasoning | `0.4316666667` |
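The confirmed means listed under Released Model appear to be unweighted (macro) averages of the category scores: averaging each breakdown table reproduces them to within rounding. The page does not state the aggregation method explicitly, so the sketch below is a consistency check rather than a documented definition. The variable names and the `macro_mean` helper are illustrative.

```python
from statistics import fmean

# Category scores copied from the two breakdown tables.
LOCOMO = {
    "factual": 0.3339551926,
    "temporal": 0.4978785870,
    "inferential": 0.2605997475,
    "multi-hop": 0.5144477744,
    "adversarial": 0.8837209302,
}
LONGMEMEVAL = {
    "knowledge-update": 0.5588405797,
    "multi-session": 0.1390977444,
    "single-session-assistant": 0.7656395892,
    "single-session-preference": 0.0519667456,
    "single-session-user": 0.9133333333,
    "temporal-reasoning": 0.4316666667,
}
# Confirmed means as reported in the Released Model section.
REPORTED = {"LoCoMo": 0.4981204463, "LongMemEval": 0.4767574431}


def macro_mean(category_scores: dict[str, float]) -> float:
    """Unweighted mean: every category counts equally, whatever its size."""
    return fmean(category_scores.values())


MACRO = {"LoCoMo": macro_mean(LOCOMO), "LongMemEval": macro_mean(LONGMEMEVAL)}

for name, value in MACRO.items():
    print(f"{name}: macro mean {value:.10f}, reported {REPORTED[name]:.10f}")
```

Because each category carries equal weight under this reading, the low `single-session-preference` and `multi-session` scores pull the LongMemEval mean down as strongly as the high `single-session-user` score pulls it up.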
## Why This Model Was Released

The closest internal runner-up nearly tied the released model on overall
LoCoMo, but it lost on the broader release profile:

- lower LongMemEval score: `0.4689` (released model: `0.4768`)
- weaker adversarial precision
- less balanced behavior across the full evaluation surface

Question-level comparison on held-out LoCoMo:

- disagreements: `152 / 400`
- questions favoring PRISM-Memory: `56`
- questions favoring the runner-up: `52`

This four-question margin is close enough to make this a real internal
comparison, but not close enough to justify releasing two public models.
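One way to see why a 56 to 52 split is hard to call is an exact two-sided sign test over the 108 questions attributed to one model or the other. This is an illustrative check, not an analysis the release reports. It ignores the remaining 44 disagreements, whose scoring this page does not describe, and the function name `sign_test_p_value` is hypothetical.

```python
from math import comb


def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Exact two-sided sign test.

    Returns the probability of a split at least this uneven if each
    decisive question were equally likely to favor either model.
    """
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Upper tail of Binomial(n, 0.5), computed with exact integer arithmetic.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    # Double the tail for a two-sided test and cap the result at 1.
    return min(1.0, 2 * upper_tail)


# Counts copied from the question-level comparison.
PRISM_WINS, RUNNER_UP_WINS = 56, 52
P_VALUE = sign_test_p_value(PRISM_WINS, RUNNER_UP_WINS)
print(f"two-sided sign test p-value: {P_VALUE:.3f}")
```

The resulting p-value is well above conventional thresholds such as 0.05, which is consistent with held-out LoCoMo alone not separating the two models and with the broader release profile deciding the choice.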
## Artifact Files

- [../../results/release_summary.json](../../results/release_summary.json)
- [../../results/release_model.json](../../results/release_model.json)
- [../../results/try_it_sessions.json](../../results/try_it_sessions.json)
- [../../results/internal_locomo_pairwise_diffs.json](../../results/internal_locomo_pairwise_diffs.json)

Related docs:

- [extraction-skill.md](extraction-skill.md)
- [extraction-examples.md](extraction-examples.md)
- [datasets.md](datasets.md)
- [model-card.md](model-card.md)