license: mit
task_categories:
  - question-answering
  - text-retrieval
tags:
  - memory
  - benchmark
  - agent
  - retrieval
  - local-ai
  - sovereign-ai
pretty_name: CEM888 MemoryAgentBench Results
size_categories:
  - n<1K

CEM888.AI MemoryAgentBench Results

99.9% AR · 77.2% BEAM — Filesystem-native memory agent on MemoryAgentBench (ICLR 2026).

Scores

Benchmark	CEM888 (Vetta)	Best Published
AR Retrieval	99.9%	71.5% (Hindsight)
BEAM Memory	77.2%	64.1% (Hindsight honest)

AR: 2,000 retrieval questions — 2 misses out of 2,000
BEAM: 200 multi-category memory questions
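The AR figure can be reproduced from the miss count stated above. A minimal check, where the helper name `accuracy_pct` is only illustrative:

```python
def accuracy_pct(total: int, misses: int) -> float:
    """Return the percentage of questions answered correctly."""
    if total <= 0 or not 0 <= misses <= total:
        raise ValueError("total must be positive and 0 <= misses <= total")
    return 100.0 * (total - misses) / total


if __name__ == "__main__":
    # Figures taken from the card: 2,000 AR questions, 2 misses.
    ar = accuracy_pct(total=2000, misses=2)
    print(f"AR accuracy: {ar:.1f}%")  # AR accuracy: 99.9%
```

By the same arithmetic, 77.2% of 200 BEAM questions corresponds to 154.4 points, which is not a whole number. That suggests BEAM is scored with partial credit or averaged per category, but the card does not say which.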

Architecture

Model: DeepSeek V4 Pro
Retrieval: Filesystem-first, deterministic search — no RAG, no embeddings, no vector DB
Memory: Agent-native sovereign vault — the filesystem is ground truth
Deployment: Fully local. No cloud. No data leakage.
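To make "filesystem-first, deterministic search" concrete, here is a minimal sketch of what such a lookup could look like. It is not the CEM888 implementation, which the card does not describe in detail. The function name `search_vault`, the `*.md` file pattern, and the ranking rule (number of query terms matched per line) are all assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def search_vault(root: Path, query: str, max_hits: int = 5) -> list[tuple[str, int, str]]:
    """Keyword search over text files under `root` (hypothetical helper).

    Returns (relative path, line number, line text) tuples. No embeddings
    or index are used, and the same inputs always give the same output.
    """
    terms = query.lower().split()
    hits = []
    # Sorting the paths fixes the traversal order across runs and machines.
    for path in sorted(root.rglob("*.md")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            # Assumed ranking rule: count how many query terms the line contains.
            score = sum(term in lowered for term in terms)
            if score:
                hits.append((score, str(path.relative_to(root)), lineno, line.strip()))
    # Highest score first; ties broken by path, then line number.
    hits.sort(key=lambda h: (-h[0], h[1], h[2]))
    return [(p, n, text) for _, p, n, text in hits[:max_hits]]


if __name__ == "__main__":
    # Tiny throwaway vault, just to show the call shape.
    with tempfile.TemporaryDirectory() as tmp:
        vault = Path(tmp)
        (vault / "notes.md").write_text("Alice moved to Paris in 2021\n", encoding="utf-8")
        for hit in search_vault(vault, "alice paris"):
            print(hit)
```

Because the result depends only on the files on disk and the query string, a run can be repeated and audited without any model or vector store in the loop, which is the property the architecture list emphasizes.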

Contents

AR-Results-99.9pct.md: Full AR breakdown with all categories
Vetta-BEAM-Honest-77.2pct.md: BEAM methodology and per-category scores
vetta_beam_v9_results.jsonl: All 200 BEAM questions with scores
vetta_live_results.jsonl: All 2,000 AR questions with scores
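The two `.jsonl` files can be re-aggregated to check the headline numbers. The sketch below assumes each line is a JSON object carrying a numeric score between 0 and 1 under a key named `score`. That key name and scale are assumptions, not taken from the files, so adjust `score_field` after inspecting a record.

```python
import json


def mean_score(jsonl_path: str, score_field: str = "score") -> tuple[float, int]:
    """Return (mean score as a percentage, number of records).

    `score_field` is a hypothetical key name; scores are assumed to lie in [0, 1].
    """
    total = 0.0
    count = 0
    with open(jsonl_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            total += float(record[score_field])
            count += 1
    if count == 0:
        raise ValueError(f"no records found in {jsonl_path}")
    return 100.0 * total / count, count


if __name__ == "__main__":
    # File name as listed in the card; it must be present locally.
    pct, n = mean_score("vetta_beam_v9_results.jsonl")
    print(f"{n} questions, mean score {pct:.1f}%")
```

If the files match the card, the BEAM file should yield 200 records and the AR file 2,000.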
