experience-extractor-350m-v1 (MLX 4-bit)

A small, on-device structured fact extractor for memory engines, fine-tuned from LiquidAI/LFM2.5-350M (full fine-tune (mlx-lm)). It reads a chat transcript and emits every storable fact as JSON in a fixed 8-field schema:

{"facts": [
  {"what": "...", "when": null, "where": null, "why": null,
   "who": ["..."], "fact_type": "world|experience",
   "entities": ["..."], "message_refs": ["id:m07"]}
]}

It powers the experience memory engine (EXPERIENCE_EXTRACTOR=lfm25). This repo holds the MLX 4-bit build for fast inference on Apple Silicon via Ollama's MLX engine or mlx-lm.

Evaluation (LongMemEval-cleaned "KU", content-recall)

Run it windowed. Whole-transcript extraction caps a small model near 0.62; sliding a 5-message window and unioning the per-window facts is the recall mechanism and the recommended deploy mode. Pairing the 350M + 1.2B as an ensemble reaches ~0.986 on KU.

mode recall mean facts/row repeat
5-msg windowed (recommended) 0.958 ~45 high (use dedup)
5-msg windowed + semantic dedup@0.6 0.903 ~15 ~0.16 (clean)
whole-transcript (single pass) 0.611 — free decoding understates; use windowing low low

This MLX 4-bit build, measured: windowed 0.958 (= the GGUF), dedup@0.6 0.903. The low whole-transcript 0.611 is a free-decoding artifact, not quant loss — windowing recovers it.

Files

  • MLX 4-bit model (config.json, model.safetensors, tokenizer, chat template); 211 MB.

Usage

Ollama (MLX engine, Apple Silicon):

ollama run hf.co/mindi-dev/experience-extractor-350m-v1-mlx-4bit

mlx-lm: mlx_lm.generate --model mindi-dev/experience-extractor-350m-v1-mlx-4bit --prompt "<transcript>"

For the recall numbers above, drive it windowed (5-msg sliding window + union + dedup) — e.g. via the experience crate's EXPERIENCE_EXTRACTION_WINDOW=5. A single whole-transcript pass under free decoding scores lower.

Other formats

Training

Full pipeline at mindi-dev/experience (training/). Fine-tuned on real-distribution LongMemEval transcripts (leakage-safe; held-out KU never trained on) with grounded teacher-generated labels.

License

Fine-tune of LiquidAI/LFM2.5-350M under the LFM Open License v1.0. Redistribution permitted with attribution + change notice; commercial use by entities with ≥ US$10M revenue requires a Liquid AI commercial license (Sec. 5). The crate code is MIT and separate. See NOTICE.md and the full LICENSE in this repo.

Downloads last month
23
Safetensors
Model size
55.4M params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for mindi-dev/experience-extractor-350m-v1-mlx-4bit

Quantized
(38)
this model