# PRISM-Memory Release Results

This page summarizes the confirmed public release metrics and the internal comparison evidence that informed the release choice.

## Released Model

- Model: `PRISM-Memory 7B Adapter`
- Base model: `Qwen/Qwen2.5-7B-Instruct`
- Adapter type: LoRA
- Confirmed LoCoMo mean: `0.4981204463`
- Confirmed LongMemEval mean: `0.4767574431`
- QA cache hits during confirmation: `460`
- QA cache misses during confirmation: `0`

## Public Comparison

PRISM-Memory fine-tunes `Qwen/Qwen2.5-7B-Instruct` for the memory extraction step that the PropMem reference gets from GPT-4.1.

| Benchmark | PRISM-Memory | GPT-4.1-based PropMem reference | Read |
|---|---:|---:|---|
| LongMemEval | `0.4768` | `0.4650` | PRISM wins |
| LoCoMo | `0.4981` | `0.5360` | PRISM trails, but stays competitive |

The QA layer is held constant. This is an extraction-step comparison, not an end-to-end GPT-4.1 replacement claim.

## LoCoMo Breakdown

| Category | Score |
|---|---:|
| factual | `0.3339551926` |
| temporal | `0.4978785870` |
| inferential | `0.2605997475` |
| multi-hop | `0.5144477744` |
| adversarial | `0.8837209302` |

## LongMemEval Breakdown

| Category | Score |
|---|---:|
| knowledge-update | `0.5588405797` |
| multi-session | `0.1390977444` |
| single-session-assistant | `0.7656395892` |
| single-session-preference | `0.0519667456` |
| single-session-user | `0.9133333333` |
| temporal-reasoning | `0.4316666667` |

## Why This Model Was Released

The closest internal runner-up nearly tied the released model on overall LoCoMo, but it lost on the broader release profile:

- lower LongMemEval score: `0.4689`
- weaker adversarial precision
- less balanced behavior across the full evaluation surface

Question-level comparison on held-out LoCoMo:

- disagreements: `152 / 400`
- questions favoring PRISM-Memory: `56`
- questions favoring the runner-up: `52`

That is close enough to be a real internal comparison, but not close enough to justify two public models.

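One way to read the 56 versus 52 split is an exact two-sided sign test, sketched below. This is an illustrative check rather than part of the release evaluation: the function name `sign_test_p_value` is hypothetical, and the test assumes each decisive question is an independent fair coin flip under the null hypothesis. It uses only the two counts reported above; how the remaining disagreements are classified is not stated here, so they are left out.

```python
from math import comb


def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Exact two-sided sign test p-value.

    Null hypothesis: each decisive question favors either model
    with probability 0.5 (an assumption made for this sketch).
    """
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Upper tail P(X >= k) for X ~ Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# Counts taken from the question-level comparison above.
p_value = sign_test_p_value(56, 52)
print(f"decisive questions: {56 + 52}, two-sided p = {p_value:.3f}")
```

Under these assumptions the p-value comes out far above 0.05, which is consistent with treating the two candidates as a near tie at the question level on LoCoMo and letting the broader release profile decide.
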
## Artifact Files

- [../../results/release_summary.json](../../results/release_summary.json)
- [../../results/release_model.json](../../results/release_model.json)
- [../../results/try_it_sessions.json](../../results/try_it_sessions.json)
- [../../results/internal_locomo_pairwise_diffs.json](../../results/internal_locomo_pairwise_diffs.json)

Related docs:

- [extraction-skill.md](extraction-skill.md)
- [extraction-examples.md](extraction-examples.md)
- [datasets.md](datasets.md)
- [model-card.md](model-card.md)