all-MiniLLM-VividTuned

A purpose-built fine-tuned embedding model for emotion-aware memory retrieval in AI companions. Based on all-MiniLM-L6-v2, fine-tuned with 10 specialised objectives to natively encode emotion, importance, narrative arcs, and emotional transitions into the embedding space.

GitHub: [Kronic90/VividEmbed](https://github.com/Kronic90/VividEmbed) · PyPI: `pip install vividembed`

What This Model Does

Standard sentence embedding models produce vectors that capture what was said but nothing about how it felt. This model learns to differentiate memories by emotional context, not just semantic content.

It does this through 58 special tokens added during fine-tuning:

```
[EMO:happy] [IMP:8] [ARC:climax] [FROM:anxious] I finally got the promotion!
```

| Token Type | Examples | Purpose |
|---|---|---|
| `[EMO:x]` | `[EMO:happy]`, `[EMO:anxious]` | Encode the emotion of the memory |
| `[IMP:x]` | `[IMP:1]` through `[IMP:10]` | Encode importance (1-10 scale) |
| `[ARC:x]` | `[ARC:setup]`, `[ARC:climax]` | Encode narrative arc position |
| `[FROM:x]` | `[FROM:calm]`, `[FROM:sad]` | Encode emotional transition (previous state) |
| `[MOOD:x]` | `[MOOD:happy]` | Encode query-time mood for retrieval |
| `[QUERY]` | `[QUERY]` | Query-mode flag |

This means a memory about "I got the promotion" encoded with `[EMO:happy]` produces a different vector than the same text encoded with `[EMO:anxious]`, and the model has learned what that emotional difference means geometrically.
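As a minimal illustration of the prefix convention, a tagged memory string can be assembled like this (`tag_memory` is a hypothetical helper written for this example, not part of the VividEmbed API):

```python
def tag_memory(text, emotion=None, importance=None, arc=None, from_state=None):
    """Prepend VividTuned special tokens to a memory string.

    Hypothetical helper illustrating the token convention above;
    the real library may format memories differently.
    """
    tokens = []
    if emotion:
        tokens.append(f"[EMO:{emotion}]")
    if importance:
        tokens.append(f"[IMP:{importance}]")
    if arc:
        tokens.append(f"[ARC:{arc}]")
    if from_state:
        tokens.append(f"[FROM:{from_state}]")
    return " ".join(tokens + [text])
```

Calling `tag_memory("I finally got the promotion!", emotion="happy", importance=8, arc="climax", from_state="anxious")` reproduces the example string above.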

How It Compares to Standard Embeddings

On the MemGPT/Letta EmbedBench benchmark (500 evaluations, 5 seeds):

| Metric | Standard all-MiniLM-L6-v2 | all-MiniLLM-VividTuned | Leading Memory System |
|---|---|---|---|
| Tool Accuracy | 0.4320 | 0.4400 (+1.9%) | 0.4300 |
| F1 Score | 0.5148 | 0.5151 (+0.1%) | 0.4945 |
| BLEU-1 | 0.6338 | 0.6660 (+5.1%) | 0.6310 |

The fine-tuned model outperforms both the base model it was built on and leading memory systems, with the same 22M parameters and no cloud APIs.

What the Numbers Mean

  • Tool Accuracy: How often the system uses the right memory tool for the task
  • F1 Score: Precision-recall balance on memory retrieval relevance
  • BLEU-1: How well retrieved memories match expected content (unigram overlap)

Model Details

| Property | Value |
|---|---|
| Base model | all-MiniLM-L6-v2 |
| Parameters | 22M |
| Output dimension | 384 |
| Max sequence length | 256 tokens |
| Special tokens added | 58 |
| Training objectives | 10 |
| Training examples | ~35,000 |
| Final training loss | 0.0208 |
| Format | SafeTensors |

Training Objectives

The model was trained on 10 simultaneous objectives:

  1. Emotion clustering – same-emotion memories cluster together
  2. Cross-emotion separation – different-emotion memories separate
  3. Semantic similarity – semantically similar content stays close
  4. Importance ordering – higher importance → larger vector magnitude
  5. Mood-congruent retrieval – queries match mood-aligned memories
  6. Emotional transition encoding – `[FROM:x]` tokens differentiate transition context
  7. Narrative arc structure – arc positions produce distinct vectors
  8. Pattern separation – near-duplicate memories are de-correlated
  9. Contradiction detection – contradicting memories have low similarity
  10. Entity grounding – entity-specific facts remain retrievable
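Objective 4 (importance ordering), for example, could be trained with a hinge loss on embedding norms. The following is an assumed formulation for illustration only, not the released training code:

```python
import numpy as np

def importance_margin_loss(vec_hi, vec_lo, margin=0.5):
    """Hinge loss pushing the higher-importance embedding to have a vector
    norm at least `margin` larger than the lower-importance embedding.

    Assumed formulation for illustration; the actual training objective
    used for VividTuned may differ.
    """
    gap = np.linalg.norm(vec_hi) - np.linalg.norm(vec_lo)
    return max(0.0, margin - gap)
```

The loss is zero once the norm gap exceeds the margin, so gradients only flow while the ordering constraint is violated.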

Key Design Choice: Vector Magnitude Encodes Importance

Unlike the base model which L2-normalises everything, this model preserves vector magnitude. High-importance memories ([IMP:9]) produce longer vectors than low-importance ones ([IMP:2]). The VividEmbed scoring function uses this:

```python
viv_signal = vector_norm / 5.0  # normalised to baseline MiniLM magnitude
score = cos_sim * (0.7 + 0.3 * viv_signal * decay) + 0.1 * recency
```
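The scoring formula above can be expressed as a small self-contained function. `vivid_score` is a hypothetical name for this sketch, with `decay` and `recency` taken as precomputed inputs:

```python
import numpy as np

def vivid_score(query_vec, mem_vec, decay=1.0, recency=0.0):
    """Blend cosine similarity with the importance signal carried by the
    memory vector's magnitude, per the scoring formula above.

    Sketch only: assumes a baseline norm of ~5.0 for the base MiniLM model,
    and that `decay` and `recency` are computed elsewhere.
    """
    cos_sim = float(np.dot(query_vec, mem_vec)
                    / (np.linalg.norm(query_vec) * np.linalg.norm(mem_vec)))
    viv_signal = np.linalg.norm(mem_vec) / 5.0
    return cos_sim * (0.7 + 0.3 * viv_signal * decay) + 0.1 * recency
```

With identical direction, a memory vector of norm 5.0 scores higher than one of norm 2.5, which is exactly how magnitude-encoded importance influences ranking.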

Usage

With VividEmbed (Recommended)

```shell
pip install vividembed
```

```python
from vividembed import VividEmbed

# Automatically uses VividTuned if found, falls back to base model
ve = VividEmbed(model_name="Kronic90/all-MiniLLM-VividTuned")

ve.add("I finally got the job offer!", emotion="excited", importance=9)
ve.add("The rejection letter was devastating", emotion="sad", importance=8)

# Mood-congruent retrieval
results = ve.query("career milestones", mood="happy", top_k=3)
```

With sentence-transformers Directly

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Kronic90/all-MiniLLM-VividTuned")

# Encode with emotion tokens
vec1 = model.encode("[EMO:happy] [IMP:8] I got the promotion!")
vec2 = model.encode("[EMO:sad] [IMP:8] I got the promotion!")

# These produce DIFFERENT vectors: the model understands emotional context
cos_sim = np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))
print(f"Same text, different emotion: similarity = {cos_sim:.3f}")
# Expect < 1.0, since emotion shifts the embedding
```

Checking If Tokens Are Active

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Kronic90/all-MiniLLM-VividTuned")
tokenizer = model.tokenizer

# Verify emotion tokens are in vocabulary
has_vivid = "[EMO:happy]" in tokenizer.get_vocab()
print(f"VividTuned tokens active: {has_vivid}")  # True
```

Neuroscience Foundations

This model implements concepts from cognitive neuroscience:

  • PAD Emotion Model (Mehrabian, 1996) β€” 76 emotions mapped to Pleasure-Arousal-Dominance space
  • Memory Reconsolidation (Nader et al., 2000) β€” recalled memories subtly shift toward retrieval context
  • Hippocampal Pattern Separation β€” near-duplicate memories are actively de-correlated
  • Narrative Arc Theory β€” episodic memories organised along 5-act story structures
  • Affect-as-Information β€” emotional state during encoding influences retrieval

See the VividEmbed GitHub repo for full architectural documentation, the 190/190 passing test suite, and 17 visual reports demonstrating each mechanism.

Checkpoints

This repository contains two checkpoints:

| Checkpoint | Description |
|---|---|
| `best/` | Best validation loss during training (recommended) |
| `final/` | End-of-training checkpoint |

Use best/ for production. Use final/ if you want to continue fine-tuning.

License

PolyForm Noncommercial 1.0.0: free for personal, research, educational, and non-profit use. Commercial use requires a separate agreement.

Contact: @Kronic90

Citation

```bibtex
@software{vividembed2026,
  title   = {VividEmbed: Neuroscience-Inspired Memory Embeddings for AI Companions},
  author  = {Kronic90},
  year    = {2026},
  url     = {https://github.com/Kronic90/VividEmbed}
}
```