---
license: apache-2.0
language:
- en
tags:
- sentence-transformers
- embeddings
- retrieval
- agents
- memory
- rag
- semantic-search
library_name: transformers
pipeline_tag: sentence-similarity
datasets:
- custom
metrics:
- mrr
- recall
- ndcg
model-index:
- name: agentrank-small
  results:
  - task:
      type: retrieval
      name: Agent Memory Retrieval
    metrics:
    - type: mrr
      value: 0.6375
      name: MRR
    - type: recall
      value: 0.4460
      name: Recall@1
    - type: recall
      value: 0.9740
      name: Recall@5
    - type: ndcg
      value: 0.6797
      name: NDCG@10
---

# AgentRank-Small: Embedding Model for AI Agent Memory Retrieval

<p align="center">
  <img src="https://img.shields.io/badge/MRR-0.6375-brightgreen" alt="MRR">
  <img src="https://img.shields.io/badge/Recall%405-97.4%25-blue" alt="Recall@5">
  <img src="https://img.shields.io/badge/Parameters-33M-orange" alt="Parameters">
  <img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
</p>

**AgentRank** is the first embedding model family designed specifically for AI agent memory retrieval. Unlike general-purpose embedders, AgentRank understands temporal context, memory types, and importance - properties that are critical for agents that need to remember past interactions.

## Key Results

| Model | MRR | Recall@1 | Recall@5 | NDCG@10 |
|-------|-----|----------|----------|---------|
| **AgentRank-Small** | **0.6375** | **0.4460** | **0.9740** | **0.6797** |
| all-MiniLM-L6-v2 | 0.5297 | 0.3720 | 0.7520 | 0.6370 |
| all-mpnet-base-v2 | 0.5351 | 0.3660 | 0.7960 | 0.6335 |

**+20% MRR improvement over the base MiniLM model.**

## Why AgentRank?

AI agents need memory retrieval that understands:

| Challenge | General Embedders | AgentRank |
|-----------|-------------------|-----------|
| "What did I say **yesterday**?" | ❌ No temporal awareness | ✅ Temporal embeddings |
| "What's my **preference**?" | ❌ Mixes with events | ✅ Memory type awareness |
| "What's **most important**?" | ❌ No priority | ✅ Importance prediction |

## Installation

```bash
pip install transformers torch
```

## Usage

### Basic Usage

```python
from transformers import AutoModel, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModel.from_pretrained("vrushket/agentrank-small")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-small")

def encode(texts):
    """Masked mean-pool the last hidden state, then L2-normalize."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Exclude padding tokens from the mean via the attention mask
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
    return torch.nn.functional.normalize(embeddings, p=2, dim=1)

# Encode memories and query
memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
query = "What programming language does the user like?"

memory_embeddings = encode(memories)
query_embedding = encode([query])

# Cosine similarities (embeddings are already normalized)
similarities = torch.mm(query_embedding, memory_embeddings.T)
print(f"Most relevant: {memories[similarities.argmax().item()]}")
# Output: "User prefers Python over JavaScript"
```

### With Temporal & Memory Type Metadata (Full Power)

```python
# For full AgentRank features, including temporal awareness:
# pip install agentrank (coming soon!)

from agentrank import AgentRankEmbedder

model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-small")

# Encode with metadata
embedding = model.encode(
    "User mentioned they prefer morning meetings",
    days_ago=3,              # memory is 3 days old
    memory_type="semantic",  # a preference, not an event
)
```
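
Until the `agentrank` package ships, one rough stand-in for temporal awareness is to decay the raw similarity scores from the base model by memory age. This is a workaround sketch, not an AgentRank feature, and the half-life value below is an illustrative assumption:

```python
def recency_weighted(score: float, days_ago: float, half_life_days: float = 7.0) -> float:
    """Decay a similarity score exponentially by memory age.

    A memory that is `half_life_days` old keeps exactly half of its
    original score; `half_life_days` is a tunable assumption, not a
    model parameter.
    """
    return score * 0.5 ** (days_ago / half_life_days)

print(recency_weighted(0.8, days_ago=0))  # 0.8 (today's memory untouched)
print(recency_weighted(0.8, days_ago=7))  # 0.4 (week-old memory halved)
```

Applying this after the similarity computation in the basic example lets fresher memories win ties against stale ones.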

## Architecture

AgentRank-Small is based on `all-MiniLM-L6-v2` with novel additions:

```
┌──────────────────────────────────────────┐
│  MiniLM Transformer Encoder (6 layers)   │
└──────────────────────────────────────────┘
                    │
    ┌───────────────┼───────────────┐
    │               │               │
┌──────────┐  ┌──────────┐  ┌────────────┐
│ Temporal │  │  Memory  │  │ Importance │
│ Position │  │   Type   │  │ Prediction │
│  Embed   │  │  Embed   │  │    Head    │
└──────────┘  └──────────┘  └────────────┘
    │               │               │
    └───────────────┼───────────────┘
                    │
         ┌───────────────────┐
         │   L2 Normalized   │
         │ 384-dim Embedding │
         └───────────────────┘
```

**Novel Features:**

- **Temporal Position Embeddings**: 10 learnable buckets (today, 1-3 days, week, month, etc.)
- **Memory Type Embeddings**: episodic, semantic, procedural
- **Importance Prediction Head**: auxiliary task during training
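
The exact bucket boundaries are not published here, so the sketch below uses one plausible 10-bucket scheme (an assumption, not the trained configuration) to show how a learned per-bucket embedding would be looked up and combined with the pooled text embedding:

```python
import torch

# Hypothetical bucket edges in days; anything older lands in the last bucket.
BUCKET_EDGES = [0, 1, 3, 7, 14, 30, 90, 180, 365]  # 9 edges -> 10 buckets

def temporal_bucket(days_ago: int) -> int:
    """Map a memory's age in days to one of 10 discrete buckets."""
    for i, edge in enumerate(BUCKET_EDGES):
        if days_ago <= edge:
            return i
    return len(BUCKET_EDGES)  # bucket 9: older than a year

# One learnable 384-dim vector per bucket, added to the pooled text embedding
temporal_embedding = torch.nn.Embedding(10, 384)
bucket = torch.tensor([temporal_bucket(3)])
offset = temporal_embedding(bucket)  # shape: (1, 384)
```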

## Training

- **Dataset**: 500K synthetic agent memory samples
- **Memory Types**: episodic (40%), semantic (35%), procedural (25%)
- **Loss**: Multiple Negatives Ranking Loss + importance MSE
- **Hard Negatives**: 5 types (temporal, type confusion, topic drift, etc.)
- **Hardware**: NVIDIA RTX 6000 Ada (48 GB) with FP16
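
Multiple Negatives Ranking Loss treats each (query, positive) pair in a batch as a classification problem in which every other positive in the batch serves as a negative. A minimal sketch follows; the scale factor of 20 is a common default for this loss, assumed here rather than taken from the actual training config:

```python
import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(query_emb, pos_emb, scale=20.0):
    """In-batch negatives: row i's positive is column i; other columns are negatives."""
    # (batch, batch) similarity matrix of L2-normalized embeddings
    scores = query_emb @ pos_emb.T * scale
    labels = torch.arange(scores.size(0))
    return F.cross_entropy(scores, labels)

# Toy batch of 4 normalized embedding pairs
torch.manual_seed(0)
q = F.normalize(torch.randn(4, 384), dim=1)
p = F.normalize(q + 0.1 * torch.randn(4, 384), dim=1)  # positives near queries
loss = multiple_negatives_ranking_loss(q, p)
print(float(loss))
```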

## Benchmarks

Evaluated on AgentMemBench (500 test samples, 8 candidates each):

| Metric | AgentRank-Small | MiniLM | Improvement |
|--------|-----------------|--------|-------------|
| MRR | 0.6375 | 0.5297 | **+20.4%** |
| Recall@1 | 0.4460 | 0.3720 | **+19.9%** |
| Recall@5 | 0.9740 | 0.7520 | **+29.5%** |
| NDCG@10 | 0.6797 | 0.6370 | **+6.7%** |
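
For reference, the reported metrics can be computed from the 1-based rank of the gold memory in each query's candidate list. This is a generic sketch of the standard definitions, not the AgentMemBench evaluation code:

```python
import math

def mrr(ranks):
    """Mean reciprocal rank over the gold memory's rank per query."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def recall_at_k(ranks, k):
    """Fraction of queries whose gold memory appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def ndcg_at_k(ranks, k):
    """Binary-relevance NDCG@k: ideal DCG is 1, so DCG reduces to 1/log2(rank+1)."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)

# Three queries whose gold memories ranked 1st, 2nd, and 4th among 8 candidates
ranks = [1, 2, 4]
print(round(mrr(ranks), 4))   # 0.5833
print(recall_at_k(ranks, 5))  # 1.0
print(round(ndcg_at_k(ranks, 10), 4))
```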

## Coming Soon

- **AgentRank-Base**: 110M parameters, even better performance
- **AgentRank-Reranker**: cross-encoder for top-k refinement
- **Python Package**: `pip install agentrank`

## Citation

```bibtex
@misc{agentrank2024,
  author    = {Vrushket More},
  title     = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/vrushket/agentrank-small}
}
```

## License

Apache 2.0 - free for commercial use.

## Acknowledgments

Built on top of [sentence-transformers](https://www.sbert.net/) and [MiniLM](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).