---
license: apache-2.0
language:
- en
tags:
- sentence-transformers
- embeddings
- retrieval
- agents
- memory
- rag
- semantic-search
library_name: transformers
pipeline_tag: sentence-similarity
datasets:
- custom
metrics:
- mrr
- recall
- ndcg
model-index:
- name: agentrank-small
  results:
  - task:
      type: retrieval
      name: Agent Memory Retrieval
    metrics:
    - type: mrr
      value: 0.6375
      name: MRR
    - type: recall
      value: 0.4460
      name: Recall@1
    - type: recall
      value: 0.9740
      name: Recall@5
    - type: ndcg
      value: 0.6797
      name: NDCG@10
---

# AgentRank-Small: Embedding Model for AI Agent Memory Retrieval

<p align="center">
  <img src="https://img.shields.io/badge/MRR-0.6375-brightgreen" alt="MRR">
  <img src="https://img.shields.io/badge/Recall%405-97.4%25-blue" alt="Recall@5">
  <img src="https://img.shields.io/badge/Parameters-33M-orange" alt="Parameters">
  <img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
</p>

**AgentRank** is the first embedding model family designed specifically for AI agent memory retrieval. Unlike general-purpose embedders, AgentRank understands temporal context, memory types, and importance - properties that are critical for agents that need to remember past interactions.

## Key Results

| Model | MRR | Recall@1 | Recall@5 | NDCG@10 |
|-------|-----|----------|----------|---------|
| **AgentRank-Small** | **0.6375** | **0.4460** | **0.9740** | **0.6797** |
| all-MiniLM-L6-v2 | 0.5297 | 0.3720 | 0.7520 | 0.6370 |
| all-mpnet-base-v2 | 0.5351 | 0.3660 | 0.7960 | 0.6335 |

**+20% MRR improvement over the base MiniLM model.**

## Why AgentRank?

AI agents need memory retrieval that understands:

| Challenge | General Embedders | AgentRank |
|-----------|-------------------|-----------|
| "What did I say **yesterday**?" | ❌ No temporal awareness | ✅ Temporal embeddings |
| "What's my **preference**?" | ❌ Mixes with events | ✅ Memory type awareness |
| "What's **most important**?" | ❌ No priority | ✅ Importance prediction |

## Installation

```bash
pip install transformers torch
```

## Usage

### Basic Usage

```python
from transformers import AutoModel, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModel.from_pretrained("vrushket/agentrank-small")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-small")

def encode(texts):
    """Masked mean-pool the last hidden state, then L2-normalize."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Exclude padding tokens from the mean via the attention mask
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
    return torch.nn.functional.normalize(embeddings, p=2, dim=1)

# Encode memories and query
memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
query = "What programming language does the user like?"

memory_embeddings = encode(memories)
query_embedding = encode([query])

# Cosine similarities (embeddings are already normalized)
similarities = torch.mm(query_embedding, memory_embeddings.T)
print(f"Most relevant: {memories[similarities.argmax().item()]}")
# Output: "User prefers Python over JavaScript"
```

### With Temporal & Memory Type Metadata (Full Power)

```python
# For full AgentRank features, including temporal awareness:
# pip install agentrank (coming soon!)

from agentrank import AgentRankEmbedder

model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-small")

# Encode with metadata
embedding = model.encode(
    "User mentioned they prefer morning meetings",
    days_ago=3,              # memory is 3 days old
    memory_type="semantic",  # a preference, not an event
)
```
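
Until the `agentrank` package ships, one rough stand-in for temporal awareness is to decay the raw similarity scores from the base model by memory age. This is a workaround sketch, not an AgentRank feature, and the half-life value below is an illustrative assumption:

```python
def recency_weighted(score: float, days_ago: float, half_life_days: float = 7.0) -> float:
    """Decay a similarity score exponentially by memory age.

    A memory that is `half_life_days` old keeps exactly half of its
    original score; `half_life_days` is a tunable assumption, not a
    model parameter.
    """
    return score * 0.5 ** (days_ago / half_life_days)

print(recency_weighted(0.8, days_ago=0))  # 0.8 (today's memory untouched)
print(recency_weighted(0.8, days_ago=7))  # 0.4 (week-old memory halved)
```

Applying this after the similarity computation in the basic example lets fresher memories win ties against stale ones.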

## Architecture

AgentRank-Small is based on `all-MiniLM-L6-v2` with novel additions:

```
┌──────────────────────────────────────────┐
│  MiniLM Transformer Encoder (6 layers)   │
└──────────────────────────────────────────┘
                    │
    ┌───────────────┼───────────────┐
    │               │               │
┌──────────┐  ┌──────────┐  ┌────────────┐
│ Temporal │  │  Memory  │  │ Importance │
│ Position │  │   Type   │  │ Prediction │
│  Embed   │  │  Embed   │  │    Head    │
└──────────┘  └──────────┘  └────────────┘
    │               │               │
    └───────────────┼───────────────┘
                    │
         ┌───────────────────┐
         │   L2 Normalized   │
         │ 384-dim Embedding │
         └───────────────────┘
```

**Novel Features:**

- **Temporal Position Embeddings**: 10 learnable buckets (today, 1-3 days, week, month, etc.)
- **Memory Type Embeddings**: episodic, semantic, procedural
- **Importance Prediction Head**: auxiliary task during training
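
The exact bucket boundaries are not published here, so the sketch below uses one plausible 10-bucket scheme (an assumption, not the trained configuration) to show how a learned per-bucket embedding would be looked up and combined with the pooled text embedding:

```python
import torch

# Hypothetical bucket edges in days; anything older lands in the last bucket.
BUCKET_EDGES = [0, 1, 3, 7, 14, 30, 90, 180, 365]  # 9 edges -> 10 buckets

def temporal_bucket(days_ago: int) -> int:
    """Map a memory's age in days to one of 10 discrete buckets."""
    for i, edge in enumerate(BUCKET_EDGES):
        if days_ago <= edge:
            return i
    return len(BUCKET_EDGES)  # bucket 9: older than a year

# One learnable 384-dim vector per bucket, added to the pooled text embedding
temporal_embedding = torch.nn.Embedding(10, 384)
bucket = torch.tensor([temporal_bucket(3)])
offset = temporal_embedding(bucket)  # shape: (1, 384)
```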

## Training

- **Dataset**: 500K synthetic agent memory samples
- **Memory Types**: episodic (40%), semantic (35%), procedural (25%)
- **Loss**: Multiple Negatives Ranking Loss + importance MSE
- **Hard Negatives**: 5 types (temporal, type confusion, topic drift, etc.)
- **Hardware**: NVIDIA RTX 6000 Ada (48 GB) with FP16
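
Multiple Negatives Ranking Loss treats each (query, positive) pair in a batch as a classification problem in which every other positive in the batch serves as a negative. A minimal sketch follows; the scale factor of 20 is a common default for this loss, assumed here rather than taken from the actual training config:

```python
import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(query_emb, pos_emb, scale=20.0):
    """In-batch negatives: row i's positive is column i; other columns are negatives."""
    # (batch, batch) similarity matrix of L2-normalized embeddings
    scores = query_emb @ pos_emb.T * scale
    labels = torch.arange(scores.size(0))
    return F.cross_entropy(scores, labels)

# Toy batch of 4 normalized embedding pairs
torch.manual_seed(0)
q = F.normalize(torch.randn(4, 384), dim=1)
p = F.normalize(q + 0.1 * torch.randn(4, 384), dim=1)  # positives near queries
loss = multiple_negatives_ranking_loss(q, p)
print(float(loss))
```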

## Benchmarks

Evaluated on AgentMemBench (500 test samples, 8 candidates each):

| Metric | AgentRank-Small | MiniLM | Improvement |
|--------|-----------------|--------|-------------|
| MRR | 0.6375 | 0.5297 | **+20.4%** |
| Recall@1 | 0.4460 | 0.3720 | **+19.9%** |
| Recall@5 | 0.9740 | 0.7520 | **+29.5%** |
| NDCG@10 | 0.6797 | 0.6370 | **+6.7%** |
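
For reference, the reported metrics can be computed from the 1-based rank of the gold memory in each query's candidate list. This is a generic sketch of the standard definitions, not the AgentMemBench evaluation code:

```python
import math

def mrr(ranks):
    """Mean reciprocal rank over the gold memory's rank per query."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def recall_at_k(ranks, k):
    """Fraction of queries whose gold memory appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def ndcg_at_k(ranks, k):
    """Binary-relevance NDCG@k: ideal DCG is 1, so DCG reduces to 1/log2(rank+1)."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)

# Three queries whose gold memories ranked 1st, 2nd, and 4th among 8 candidates
ranks = [1, 2, 4]
print(round(mrr(ranks), 4))   # 0.5833
print(recall_at_k(ranks, 5))  # 1.0
print(round(ndcg_at_k(ranks, 10), 4))
```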

## Coming Soon

- **AgentRank-Base**: 110M parameters, even better performance
- **AgentRank-Reranker**: cross-encoder for top-k refinement
- **Python Package**: `pip install agentrank`

## Citation

```bibtex
@misc{agentrank2024,
  author    = {Vrushket More},
  title     = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/vrushket/agentrank-small}
}
```

## License

Apache 2.0 - free for commercial use.

## Acknowledgments

Built on top of [sentence-transformers](https://www.sbert.net/) and [MiniLM](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).