---
license: apache-2.0
language:
- en
tags:
- sentence-transformers
- embeddings
- retrieval
- agents
- memory
- rag
- semantic-search
- ai-agents
- llm-memory
- vector-search
library_name: transformers
pipeline_tag: sentence-similarity
datasets:
- custom
metrics:
- mrr
- recall
- ndcg
model-index:
- name: agentrank-base
  results:
  - task:
      type: retrieval
      name: Agent Memory Retrieval
    metrics:
    - type: mrr
      value: 0.6496
      name: MRR
    - type: recall
      value: 0.4440
      name: Recall@1
    - type: recall
      value: 0.9960
      name: Recall@5
    - type: ndcg
      value: 0.6786
      name: NDCG@10
---


<div align="center">

# 🧠 AgentRank-Base

### The First Embedding Model Built Specifically for AI Agent Memory Retrieval

<p>
  <img src="https://img.shields.io/badge/MRR-0.65-brightgreen?style=for-the-badge" alt="MRR">
  <img src="https://img.shields.io/badge/Recall%405-99.6%25-blue?style=for-the-badge" alt="Recall@5">
  <img src="https://img.shields.io/badge/Parameters-149M-orange?style=for-the-badge" alt="Parameters">
  <img src="https://img.shields.io/badge/License-Apache%202.0-green?style=for-the-badge" alt="License">
</p>

**+23% MRR improvement over general-purpose embedders** | **Temporal awareness** | **Memory type understanding**

[πŸš€ Quick Start](#-quick-start) β€’ [πŸ“Š Benchmarks](#-benchmarks) β€’ [πŸ”§ Architecture](#-architecture) β€’ [πŸ’‘ Why AgentRank?](#-why-agentrank)

</div>

---

## 🎯 TL;DR

> **AgentRank-Base** is an embedding model designed for AI agents that need to remember. Unlike generic embedders (OpenAI, Cohere, MiniLM), AgentRank understands:
> - ⏰ **When** something happened (temporal awareness)
> - πŸ“ **What type** of memory it is (episodic vs semantic vs procedural)
> - ⭐ **How important** the memory is

---

## πŸ’‘ Why AgentRank?

### The Problem with Current Embedders

AI agents need memory. But when you ask an agent:

> *"What did we discuss about Python **yesterday**?"*

Current embedders fail because they:
- ❌ Don't understand "yesterday" means recent time
- ❌ Can't distinguish between a preference and an event
- ❌ Treat all memories as equally important

### The AgentRank Solution

| Challenge | OpenAI/Cohere/MiniLM | AgentRank |
|-----------|---------------------|-----------|
| "What did I say **yesterday**?" | Random old results πŸ˜• | Recent memories first βœ… |
| "What's my **preference**?" | Mixed with events πŸ˜• | Only preferences βœ… |
| "What's **most important**?" | No priority πŸ˜• | Importance-aware retrieval βœ… |

---

## πŸ“Š Benchmarks

Evaluated on **AgentMemBench** (500 test samples, 8 candidates each):

| Model | Parameters | MRR ↑ | Recall@1 ↑ | Recall@5 ↑ | NDCG@10 ↑ |
|-------|------------|-------|------------|------------|-----------|
| **AgentRank-Base** | 149M | **0.6496** | **0.4440** | **0.9960** | **0.6786** |
| AgentRank-Small | 33M | 0.6375 | 0.4460 | 0.9740 | 0.6797 |
| all-mpnet-base-v2 | 109M | 0.5351 | 0.3660 | 0.7960 | 0.6335 |
| all-MiniLM-L6-v2 | 22M | 0.5297 | 0.3720 | 0.7520 | 0.6370 |

### Improvement Over Baselines

| vs Baseline | MRR | Recall@1 | Recall@5 |
|-------------|-----|----------|----------|
| vs MiniLM | **+22.6%** | **+19.4%** | **+32.4%** |
| vs MPNet | **+21.4%** | **+21.3%** | **+25.1%** |
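
The ranking metrics above are standard and easy to recompute from ranked candidate lists. Below is a minimal, self-contained sketch (the helpers `mrr` and `recall_at_k` are our own illustration, not part of any released evaluation code):

```python
def mrr(ranked, gold):
    """Mean Reciprocal Rank: average of 1/rank of the gold candidate."""
    total = 0.0
    for cands, g in zip(ranked, gold):
        rank = cands.index(g) + 1  # 1-based rank of the correct candidate
        total += 1.0 / rank
    return total / len(gold)

def recall_at_k(ranked, gold, k):
    """Fraction of queries whose gold candidate appears in the top k."""
    hits = sum(1 for cands, g in zip(ranked, gold) if g in cands[:k])
    return hits / len(gold)

# Toy example: 2 queries, 4 candidate memories each, ranked best-first
ranked = [["m2", "m0", "m1", "m3"], ["m1", "m3", "m0", "m2"]]
gold = ["m0", "m1"]
print(mrr(ranked, gold))             # (1/2 + 1/1) / 2 = 0.75
print(recall_at_k(ranked, gold, 1))  # 0.5
```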

---

## πŸš€ Quick Start

### Installation

```bash
pip install transformers torch
```

### Basic Usage

```python
from transformers import AutoModel, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModel.from_pretrained("vrushket/agentrank-base")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-base")

def encode(texts, model, tokenizer):
    """Encode texts to L2-normalized embeddings via mean pooling."""
    inputs = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=512,
        return_tensors="pt",
    )
    with torch.no_grad():
        outputs = model(**inputs)
        # Mean pooling over real tokens only (mask out padding)
        mask = inputs["attention_mask"].unsqueeze(-1).float()
        summed = (outputs.last_hidden_state * mask).sum(dim=1)
        embeddings = summed / mask.sum(dim=1).clamp(min=1e-9)
        # L2 normalize so dot products equal cosine similarities
        embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
    return embeddings

# Your agent's memories
memories = [
    "User prefers Python over JavaScript for backend development",
    "User asked about React frameworks yesterday",
    "User mentioned they have 3 years of coding experience",
    "User is working on an e-commerce project",
]

# A query from the user
query = "What programming language does the user prefer?"

# Encode everything
memory_embeddings = encode(memories, model, tokenizer)
query_embedding = encode([query], model, tokenizer)

# Find the most similar memory
similarities = torch.mm(query_embedding, memory_embeddings.T)[0]
best_match_idx = similarities.argmax().item()

print(f"Query: {query}")
print(f"Best match: {memories[best_match_idx]}")
print(f"Similarity: {similarities[best_match_idx].item():.4f}")

# Output:
# Query: What programming language does the user prefer?
# Best match: User prefers Python over JavaScript for backend development
# Similarity: 0.8234
```
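
In practice an agent usually wants the top-k memories rather than a single best match. A small sketch, assuming you already have L2-normalized query and memory embeddings like those produced above (the helper name `top_k_memories` is ours; the toy 2-d vectors stand in for real 768-d embeddings):

```python
import torch

def top_k_memories(query_emb, memory_embs, memories, k=3):
    """Return the k most similar memories with their cosine scores."""
    sims = torch.mm(query_emb, memory_embs.T)[0]
    scores, idxs = torch.topk(sims, k=min(k, len(memories)))
    return [(memories[i], score.item()) for i, score in zip(idxs.tolist(), scores)]

# Toy example: hand-made unit vectors standing in for real embeddings
memories = ["likes Python", "asked about React", "3 years experience"]
memory_embs = torch.tensor([[1.0, 0.0], [0.0, 1.0], [0.7071, 0.7071]])
query_emb = torch.tensor([[1.0, 0.0]])

for text, score in top_k_memories(query_emb, memory_embs, memories, k=2):
    print(f"{score:.4f}  {text}")
# 1.0000  likes Python
# 0.7071  3 years experience
```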

### Advanced Usage with Metadata

For full temporal and memory type awareness, use the AgentRank package:

```python
# Coming soon: pip install agentrank
import torch
from agentrank import AgentRankEmbedder

model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-base")

# Encode with temporal context
memory_embedding = model.encode(
    text="User mentioned they prefer morning meetings",
    days_ago=7,             # Memory is 1 week old
    memory_type="semantic",  # It's a preference (not an event)
)

# Encode query (no metadata needed for queries)
query_embedding = model.encode("When does the user like to have meetings?")

# The model now knows this is a week-old preference!
similarity = torch.cosine_similarity(query_embedding, memory_embedding, dim=0)
```

---

## πŸ”§ Architecture

AgentRank-Base is built on **ModernBERT-base** (149M params) with novel additions:

```
┌─────────────────────────────────────────────────┐
│     ModernBERT Encoder (22 Transformer Layers)  │
│     - RoPE Positional Encoding                  │
│     - Flash Attention                           │
│     - 768 Hidden Dimension                      │
└─────────────────────────────────────────────────┘
                       │
       ┌───────────────┼───────────────┐
       ↓               ↓               ↓
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│  Temporal   │ │  Memory     │ │ Importance  │
│  Position   │ │  Type       │ │ Prediction  │
│  Embeddings │ │  Embeddings │ │ Head        │
│  (10 × 768) │ │  (4 × 768)  │ │ (768→1)     │
└─────────────┘ └─────────────┘ └─────────────┘
       │               │               │
       └───────────────┼───────────────┘
                       ↓
          ┌─────────────────────┐
          │  Projection Layer   │
          │  (768 → 768)        │
          └─────────────────────┘
                       ↓
          ┌─────────────────────┐
          │  L2 Normalization   │
          │  768-dim Embedding  │
          └─────────────────────┘
```

### Novel Components

| Component | Purpose | How It Helps |
|-----------|---------|--------------|
| **Temporal Embeddings** | Encodes memory age (today, this week, last month, etc.) | "Yesterday" queries match recent memories |
| **Memory Type Embeddings** | Distinguishes episodic/semantic/procedural | "What do I like?" matches preferences, not events |
| **Importance Head** | Auxiliary task predicting memory priority | Helps learn better representations |
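
To make the diagram concrete, here is a conceptual PyTorch sketch of how these components could combine with the encoder's pooled output. All names (`AgentRankHead`, `temporal_emb`, etc.) are our own illustration, not the released implementation:

```python
import torch
import torch.nn as nn

class AgentRankHead(nn.Module):
    """Sketch: metadata embeddings + importance head on top of a pooled encoder output."""
    def __init__(self, hidden: int = 768):
        super().__init__()
        self.temporal_emb = nn.Embedding(10, hidden)   # 10 temporal buckets
        self.type_emb = nn.Embedding(4, hidden)        # 4 memory types
        self.importance_head = nn.Linear(hidden, 1)    # auxiliary importance score
        self.proj = nn.Linear(hidden, hidden)          # final projection

    def forward(self, pooled, bucket_ids, type_ids):
        # Add metadata embeddings to the mean-pooled encoder output
        h = pooled + self.temporal_emb(bucket_ids) + self.type_emb(type_ids)
        importance = self.importance_head(h).squeeze(-1)
        # Project and L2-normalize to get the final 768-dim embedding
        emb = torch.nn.functional.normalize(self.proj(h), p=2, dim=-1)
        return emb, importance

head = AgentRankHead()
pooled = torch.randn(2, 768)  # stand-in for a batch of 2 encoder outputs
emb, imp = head(pooled, torch.tensor([0, 3]), torch.tensor([1, 0]))
print(emb.shape, imp.shape)  # torch.Size([2, 768]) torch.Size([2])
```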

### Temporal Buckets

```
Bucket 0: Today (0-1 days)
Bucket 1: Recent (1-3 days)
Bucket 2: This week (3-7 days)
Bucket 3: Last week (7-14 days)
Bucket 4: This month (14-30 days)
Bucket 5: Last month (30-60 days)
Bucket 6: Few months (60-90 days)
Bucket 7: Half year (90-180 days)
Bucket 8: This year (180-365 days)
Bucket 9: Long ago (365+ days)
```
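
The bucketing above is a simple threshold lookup. A minimal sketch, assuming the bucket boundaries match the table (the helper name `temporal_bucket` is ours):

```python
# Upper bounds (in days) for buckets 0-8; anything beyond falls into bucket 9
BUCKET_EDGES = [1, 3, 7, 14, 30, 60, 90, 180, 365]

def temporal_bucket(days_ago: float) -> int:
    """Map a memory's age in days to a temporal bucket ID (0-9)."""
    for bucket, edge in enumerate(BUCKET_EDGES):
        if days_ago < edge:
            return bucket
    return 9  # 365+ days: "long ago"

print(temporal_bucket(0))    # 0 (today)
print(temporal_bucket(7))    # 3 (last week)
print(temporal_bucket(400))  # 9 (long ago)
```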

### Memory Types

```
Type 0: Episodic   → Events, conversations ("We discussed X yesterday")
Type 1: Semantic   → Facts, preferences ("User likes Python")
Type 2: Procedural → Instructions ("To deploy, run npm build")
Type 3: Unknown    → Fallback
```

---

## πŸŽ“ Training Details

| Aspect | Details |
|--------|---------|
| **Base Model** | answerdotai/ModernBERT-base (149M params) |
| **Training Data** | 500K synthetic agent memory samples |
| **Memory Distribution** | Episodic (40%), Semantic (35%), Procedural (25%) |
| **Loss Function** | Multiple Negatives Ranking Loss + Importance MSE |
| **Hard Negatives** | 7 per sample (5 types: temporal, type confusion, topic drift, etc.) |
| **Batch Size** | 16-32 per GPU |
| **Hardware** | 2Γ— NVIDIA RTX 6000 Ada (48GB each) |
| **Training Time** | ~12 hours |
| **Precision** | FP16 Mixed Precision |
| **Final Val Loss** | 0.877 |

---

## πŸ—οΈ Use Cases

### 1. AI Agents with Long-Term Memory

```python
# Store memories with metadata
agent.remember(
    text="User is allergic to peanuts",
    memory_type="semantic",
    importance=10,  # Critical medical info!
)

# Later, when discussing food...
relevant_memories = agent.recall("What should I know about the user's diet?")
# Returns: "User is allergic to peanuts" (even if stored months ago)
```

### 2. RAG Systems for Conversational AI

```python
# Better retrieval for chatbots
query = "What did we talk about in our last meeting?"
# AgentRank returns recent, relevant conversations
# Generic embedders return random topically-similar docs
```

### 3. Personal Knowledge Bases

```python
# User's notes and preferences
memories = [
    "I prefer dark mode in all apps",
    "My morning routine starts at 6 AM",
    "Important: Tax deadline April 15",
]
# AgentRank properly handles time-sensitive queries
```

---

## πŸ†š When to Use AgentRank vs Others

| Use Case | Best Model |
|----------|------------|
| **AI agents with memory** | βœ… AgentRank |
| **Time-sensitive retrieval** | βœ… AgentRank |
| **Conversational AI** | βœ… AgentRank |
| General document search | OpenAI / Cohere |
| Code search | CodeBERT |
| Scientific papers | SciBERT |

---

## πŸ“ Model Family

| Model | Parameters | Speed | Quality | Best For |
|-------|------------|-------|---------|----------|
| [agentrank-small](https://huggingface.co/vrushket/agentrank-small) | 33M | ⚑⚑⚑ Fast | Good | Real-time agents, edge |
| **agentrank-base** | 149M | ⚑⚑ Medium | **Best** | Quality-critical apps |
| agentrank-reranker (coming) | 149M | ⚑ Slower | Superior | Two-stage retrieval |

---

## πŸ“š Citation

```bibtex
@misc{agentrank2024,
  author = {Vrushket More},
  title = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/vrushket/agentrank-base}
}
```

---

## 🀝 Community & Support

- πŸ› **Issues**: [GitHub Issues](https://github.com/vmore2/AgentRank-base/issues)
- πŸ’¬ **Discussions**: [HuggingFace Community](https://huggingface.co/vrushket/agentrank-base/discussions)
- πŸ“§ **Contact**: vrushket2604@gmail.com

---

## πŸ“„ License

Apache 2.0 - **Free for commercial use!**

---

<div align="center">

### ⭐ If AgentRank helps your project, please star the repo!

**Built with ❀️ for the AI agent community**

</div>