# AgentRank-Small: Embedding Model for AI Agent Memory Retrieval


AgentRank is the first embedding model family designed specifically for AI agent memory retrieval. Unlike general-purpose embedders, AgentRank understands temporal context, memory types, and importance, all of which are critical for agents that need to remember past interactions.

## πŸš€ Key Results

| Model | MRR | Recall@1 | Recall@5 | NDCG@10 |
|---|---|---|---|---|
| **AgentRank-Small** | **0.6375** | **0.4460** | **0.9740** | **0.6797** |
| all-MiniLM-L6-v2 | 0.5297 | 0.3720 | 0.7520 | 0.6370 |
| all-mpnet-base-v2 | 0.5351 | 0.3660 | 0.7960 | 0.6335 |

**+20.4% MRR improvement** over the base MiniLM model!

## 🎯 Why AgentRank?

AI agents need memory that understands:

| Challenge | General Embedders | AgentRank |
|---|---|---|
| "What did I say yesterday?" | ❌ No temporal awareness | βœ… Temporal embeddings |
| "What's my preference?" | ❌ Mixes with events | βœ… Memory type awareness |
| "What's most important?" | ❌ No priority | βœ… Importance prediction |

## πŸ“¦ Installation

```bash
pip install transformers torch
```

## πŸ’» Usage

### Basic Usage

```python
from transformers import AutoModel, AutoTokenizer
import torch

# Load model
model = AutoModel.from_pretrained("vrushket/agentrank-small")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-small")

def encode(texts):
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
        # Mean-pool over real tokens only, excluding padding
        mask = inputs["attention_mask"].unsqueeze(-1).float()
        embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
        embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
    return embeddings

# Encode memories and query
memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
query = "What programming language does the user like?"

memory_embeddings = encode(memories)
query_embedding = encode([query])

# Compute cosine similarities (embeddings are L2-normalized)
similarities = torch.mm(query_embedding, memory_embeddings.T)
print(f"Most relevant: {memories[similarities.argmax().item()]}")
# Output: "User prefers Python over JavaScript"
```
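To retrieve a ranked shortlist rather than a single best match, `torch.topk` over the similarity row works the same way. A standalone sketch with random stand-in embeddings (swap in the `encode` outputs from the snippet above):

```python
import torch
import torch.nn.functional as F

memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
# Random stand-ins for encode(memories) / encode([query]) from above
memory_embeddings = F.normalize(torch.randn(3, 384), p=2, dim=1)
query_embedding = F.normalize(torch.randn(1, 384), p=2, dim=1)

similarities = (query_embedding @ memory_embeddings.T).squeeze(0)
top = torch.topk(similarities, k=2)  # best 2 of 3 candidates
for score, idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{score:+.3f}  {memories[idx]}")
```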

### With Temporal & Memory Type Metadata (Full Power)

```python
# For full AgentRank features including temporal awareness:
# pip install agentrank  (coming soon!)

from agentrank import AgentRankEmbedder

model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-small")

# Encode with metadata
embedding = model.encode(
    "User mentioned they prefer morning meetings",
    days_ago=3,            # Memory is 3 days old
    memory_type="semantic" # It's a preference, not an event
)
```

πŸ—οΈ Architecture

AgentRank-Small (22.7M parameters) is based on all-MiniLM-L6-v2 with novel additions:

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  MiniLM Transformer Encoder (6 layers)  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    ↓               ↓               ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Temporal β”‚   β”‚ Memory   β”‚   β”‚ Importance β”‚
β”‚ Position β”‚   β”‚ Type     β”‚   β”‚ Prediction β”‚
β”‚ Embed    β”‚   β”‚ Embed    β”‚   β”‚ Head       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚               β”‚               β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    ↓
         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
         β”‚ L2-Normalized     β”‚
         β”‚ 384-dim Embedding β”‚
         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

**Novel Features:**

  • Temporal Position Embeddings: 10 learnable buckets (today, 1-3 days, week, month, etc.)
  • Memory Type Embeddings: Episodic, Semantic, Procedural
  • Importance Prediction Head: Auxiliary task during training
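The exact temporal bucket boundaries are not published; a minimal sketch of how an age-to-bucket mapping could look, with hypothetical day cutoffs chosen to match the labels above (today, 1-3 days, week, month, etc.):

```python
# Hypothetical cutoffs (in days) for 10 temporal buckets; the boundaries
# actually used by AgentRank are not published.
BUCKET_CUTOFFS = [0, 1, 3, 7, 14, 30, 90, 180, 365]

def temporal_bucket(days_ago: int) -> int:
    """Return the index of the first cutoff that days_ago does not exceed."""
    for i, cutoff in enumerate(BUCKET_CUTOFFS):
        if days_ago <= cutoff:
            return i
    return len(BUCKET_CUTOFFS)  # bucket 9: older than a year

print(temporal_bucket(0))   # today -> bucket 0
print(temporal_bucket(2))   # 1-3 days -> bucket 2
print(temporal_bucket(45))  # 1-3 months -> bucket 6
```

The bucket index would then select one of the 10 learnable temporal embeddings added to the text representation.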

## πŸŽ“ Training

  • Dataset: 500K synthetic agent memory samples
  • Memory Types: Episodic (40%), Semantic (35%), Procedural (25%)
  • Loss: Multiple Negatives Ranking Loss + Importance MSE
  • Hard Negatives: 5 types (temporal, type confusion, topic drift, etc.)
  • Hardware: NVIDIA RTX 6000 Ada (48GB) with FP16
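Multiple Negatives Ranking Loss treats each query's paired memory as the positive and every other memory in the batch as a negative. A minimal PyTorch sketch of the ranking term only (not the training code; the `scale=20.0` temperature is an assumption borrowed from the sentence-transformers default):

```python
import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(query_emb, memory_emb, scale=20.0):
    """In-batch negatives: the memory at position i is the positive for
    query i; every other memory in the batch acts as a negative."""
    # Scaled cosine similarity matrix (embeddings assumed L2-normalized)
    scores = query_emb @ memory_emb.T * scale
    # The correct "class" for row i is column i
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

# Toy batch of 4 random, normalized 384-dim embedding pairs
q = F.normalize(torch.randn(4, 384), dim=1)
m = F.normalize(torch.randn(4, 384), dim=1)
print(multiple_negatives_ranking_loss(q, m).item())
```

In the actual setup, hard negatives (temporally shifted or type-confused memories) would be appended as extra columns of `memory_emb`, and the importance MSE term added on top.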

## πŸ“Š Benchmarks

Evaluated on AgentMemBench (500 test samples, 8 candidates each):

| Metric | AgentRank-Small | MiniLM | Improvement |
|---|---|---|---|
| MRR | 0.6375 | 0.5297 | +20.4% |
| Recall@1 | 0.4460 | 0.3720 | +19.9% |
| Recall@5 | 0.9740 | 0.7520 | +29.5% |
| NDCG@10 | 0.6797 | 0.6370 | +6.7% |
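For reference, MRR and Recall@k reduce to simple formulas when each query has exactly one relevant candidate, as in the 8-candidate setup above. A minimal sketch:

```python
def mrr(ranks):
    """Mean reciprocal rank; ranks are 1-based positions of the relevant item."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def recall_at_k(ranks, k):
    """Fraction of queries whose relevant item appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Toy example: relevant memory ranked 1st, 3rd, and 2nd for three queries
ranks = [1, 3, 2]
print(mrr(ranks))             # (1 + 1/3 + 1/2) / 3 = 0.6111...
print(recall_at_k(ranks, 1))  # 1 of 3 queries hit at rank 1
```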

## πŸ”œ Coming Soon

  • AgentRank-Base: 110M params, even better performance
  • AgentRank-Reranker: Cross-encoder for top-k refinement
  • Python Package: pip install agentrank

## πŸ“š Citation

```bibtex
@misc{agentrank2024,
  author = {Vrushket More},
  title = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/vrushket/agentrank-small}
}
```

## πŸ“„ License

Apache 2.0 - Free for commercial use!

## 🀝 Acknowledgments

Built on top of sentence-transformers and MiniLM.
