---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- text-classification
- intent-detection
- agent-routing
- mcp
- ai-agents
- distilbert
- tool-use
datasets:
- custom
language:
- en
metrics:
- accuracy
- f1
pipeline_tag: text-classification
library_name: transformers
---
# AgentIntentRouter
A fast, lightweight intent classifier for AI agent and MCP tool routing. Given a user message, it predicts which tool or capability the agent should invoke — in under 50ms on CPU.
Built on DistilBERT (66M parameters) and fine-tuned on about 11.2K synthetic examples (8,987 for training, the rest held out for validation and testing) across 8 intent categories.
## Why This Exists
Agent frameworks such as LangChain, LangGraph, CrewAI, and AutoGen commonly spend a full LLM call just to figure out *what the user wants*. That can cost 1-3 seconds and roughly $0.01 per request for routing alone.
AgentIntentRouter replaces that first LLM call with a 66M-parameter classifier that runs in **~10ms on CPU** and **~2ms on GPU**. Use it as the first step in your agent pipeline to route each message to the right tool.
## Intent Categories
| Label | Description | Example |
|-------|-------------|---------|
| `code_generation` | User wants code written, debugged, or refactored | "Write a Python function to parse CSV" |
| `web_search` | User wants to find information online | "What's the latest news on AI regulation" |
| `math_calculation` | User needs computation or conversion | "Calculate 15% of 4500" |
| `file_operation` | User wants to read, write, or manage files | "Read the config.json file" |
| `api_call` | User wants to interact with an external API | "Send a Slack message to the team" |
| `creative_writing` | User wants text composed or drafted | "Write a professional email to the client" |
| `data_analysis` | User wants data interpreted or compared | "Compare React vs Vue performance" |
| `general_chat` | Casual conversation, greetings, feedback | "Hey, how are you?" |
## Quick Start
```python
from transformers import pipeline

# Load the classifier from the Hugging Face Hub
router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Single prediction
result = router("Write a Python function to sort a list")
print(result)
# [{'label': 'code_generation', 'score': 0.98}]

# Batch prediction
messages = [
    "Search for the latest AI papers",
    "What's 25% of 1200?",
    "Draft an email to my boss about the deadline",
    "Hello!",
]
results = router(messages)
for msg, res in zip(messages, results):
    print(f"  {res['label']:>20} ({res['score']:.2f}) - {msg}")
```
## Use as Agent Router
```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# The handle_* functions and fallback_llm_route are placeholders for your own
# callables; each takes the user message and returns the tool's response.
TOOL_MAP = {
    "code_generation": handle_code_request,
    "web_search": handle_search,
    "math_calculation": handle_calculation,
    "file_operation": handle_file_ops,
    "api_call": handle_api_call,
    "creative_writing": handle_writing,
    "data_analysis": handle_analysis,
    "general_chat": handle_chat,
}

def route(user_message: str):
    # The pipeline returns a list; the first item is the top prediction
    intent = router(user_message)[0]
    if intent["score"] < 0.5:
        # Low confidence: fall back to an LLM for routing
        return fallback_llm_route(user_message)
    handler = TOOL_MAP[intent["label"]]
    return handler(user_message)
```
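
Because `route` indexes `TOOL_MAP` directly, a label with no registered handler would raise a `KeyError` at request time. The following is a minimal sketch of a startup check, not part of the model or of Transformers: `missing_handlers` and `partial_map` are hypothetical names, and the label list is copied from the Intent Categories table. In a real deployment the labels could probably be read from `router.model.config.id2label` instead, assuming the checkpoint stores the label names there.

```python
from typing import Callable, Dict, Iterable, List

# The eight labels from the Intent Categories table.
INTENT_LABELS = [
    "code_generation", "web_search", "math_calculation", "file_operation",
    "api_call", "creative_writing", "data_analysis", "general_chat",
]

def missing_handlers(labels: Iterable[str], tool_map: Dict[str, Callable]) -> List[str]:
    """Return the labels that have no handler registered, in sorted order."""
    return sorted(set(labels) - set(tool_map))

# Hypothetical partial map, used only to show the check reporting gaps.
partial_map = {"web_search": lambda msg: f"searching: {msg}"}
print(missing_handlers(INTENT_LABELS, partial_map))
```

Running a check like this once at startup turns a missing handler into an immediate, visible failure instead of an error on the first matching user message.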
## Performance
- **Inference speed:** ~10ms on CPU, ~2ms on GPU
- **Model size:** ~260MB (DistilBERT-base)
- **Accuracy:** 100% on the synthetic held-out test set (see the note under Evaluation Results)
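
Latency depends on hardware, sequence length, and library versions, so it is worth measuring on your own machine. The helper below is a rough sketch for doing that: `measure_latency_ms` is a hypothetical name, and the stand-in predictor exists only so the snippet runs without downloading the model. Passing the `router` pipeline from Quick Start as `predict` should time the real classifier.

```python
import statistics
import time
from typing import Callable, Dict, Sequence

def measure_latency_ms(predict: Callable[[str], object],
                       messages: Sequence[str],
                       warmup: int = 3,
                       repeats: int = 20) -> Dict[str, float]:
    """Time single-message calls to `predict` and summarize in milliseconds."""
    for _ in range(warmup):
        predict(messages[0])  # warm-up calls are not timed
    timings = []
    for i in range(repeats):
        msg = messages[i % len(messages)]  # cycle through the sample messages
        start = time.perf_counter()
        predict(msg)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    # Approximate 95th percentile by index into the sorted timings.
    p95_index = min(len(timings) - 1, int(0.95 * len(timings)))
    return {
        "median_ms": statistics.median(timings),
        "p95_ms": timings[p95_index],
        "max_ms": timings[-1],
    }

# Stand-in predictor (an assumption for illustration, not the real model).
stats = measure_latency_ms(lambda msg: len(msg), ["Hello!", "Calculate 15% of 4500"])
print(stats)
```

Reporting the median alongside a tail value gives a fairer picture than a single average, since the first few calls and occasional scheduler pauses can skew the mean.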
### Evaluation Results
*Results on held-out test set (1,124 examples):*
| Metric | Score |
|--------|-------|
| Accuracy | 1.000 |
| F1 (weighted) | 1.000 |
*Per-class performance:*
| Intent | Precision | Recall | F1 | Support |
|--------|-----------|--------|-----|---------|
| code_generation | 1.000 | 1.000 | 1.000 | 130 |
| web_search | 1.000 | 1.000 | 1.000 | 151 |
| math_calculation | 1.000 | 1.000 | 1.000 | 153 |
| file_operation | 1.000 | 1.000 | 1.000 | 154 |
| api_call | 1.000 | 1.000 | 1.000 | 133 |
| creative_writing | 1.000 | 1.000 | 1.000 | 160 |
| data_analysis | 1.000 | 1.000 | 1.000 | 168 |
| general_chat | 1.000 | 1.000 | 1.000 | 75 |
> **Note:** These results are on synthetic test data from the same distribution as training. Real-world performance will vary. Use the confidence score threshold to handle ambiguous inputs gracefully.
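
One way to pick the confidence threshold is to hand-label a small sample of real user messages and see how coverage and accuracy trade off as the threshold moves. The sketch below assumes you have already collected `(predicted_label, confidence, true_label)` triples. `sweep_thresholds` is a hypothetical helper, and the sample values are made up for illustration. They are not measurements of this model.

```python
from typing import Dict, List, Sequence, Tuple

# (predicted_label, confidence, true_label) for each hand-labeled message.
Sample = Tuple[str, float, str]

def sweep_thresholds(samples: Sequence[Sample],
                     thresholds: Sequence[float]) -> List[Dict[str, float]]:
    """For each threshold, report coverage and accuracy on accepted messages."""
    rows = []
    for t in thresholds:
        # A message is accepted (routed by the classifier) when score >= t.
        accepted = [s for s in samples if s[1] >= t]
        correct = sum(1 for pred, _, true in accepted if pred == true)
        rows.append({
            "threshold": t,
            "coverage": len(accepted) / len(samples),
            "accuracy": correct / len(accepted) if accepted else float("nan"),
        })
    return rows

# Made-up predictions for illustration only.
samples = [
    ("web_search", 0.97, "web_search"),
    ("code_generation", 0.91, "code_generation"),
    ("api_call", 0.62, "code_generation"),
    ("general_chat", 0.55, "general_chat"),
    ("data_analysis", 0.41, "web_search"),
]
for row in sweep_thresholds(samples, [0.5, 0.7, 0.9]):
    print(row)
```

Coverage is the share of messages the classifier handles itself, and the remainder goes to the fallback. A higher threshold usually buys accuracy at the cost of more fallback calls.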
## Training Details
- **Base model:** distilbert-base-uncased
- **Training data:** 8,987 examples (synthetic, template-generated with natural language variation)
- **Validation:** 1,123 examples
- **Test:** 1,124 examples
- **Epochs:** 3 (with early stopping, patience=2)
- **Learning rate:** 2e-5
- **Batch size:** 32
- **Max sequence length:** 128
- **Training time:** ~100 seconds on NVIDIA RTX 4070
- **Loss:** 0.0015 (training) / 0.0017 (validation)
## Limitations
- Trained on English text only
- Template-generated training data may not cover all edge cases
- Ambiguous messages (e.g., "help me with the API code") may get lower confidence scores — use the confidence threshold to fall back to an LLM
- Not designed for multi-intent messages (e.g., "search for X and write code for Y")
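
For ambiguous or multi-intent messages, a heuristic that may help is to look at the gap between the top two scores rather than the top score alone. In recent versions of Transformers, calling the pipeline with `top_k=None` returns a score for every label, and those scores can be fed to a function like the one below. `pick_intent` and its default thresholds (0.5 and 0.2) are illustrative assumptions, not tuned values, and the example scores are invented.

```python
from typing import List, Optional

def pick_intent(scores: List[dict],
                min_score: float = 0.5,
                min_margin: float = 0.2) -> Optional[str]:
    """Return the top label, or None when the prediction looks ambiguous."""
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    top = ranked[0]
    runner_up_score = ranked[1]["score"] if len(ranked) > 1 else 0.0
    if top["score"] < min_score or top["score"] - runner_up_score < min_margin:
        return None  # caller should fall back, e.g. to an LLM router
    return top["label"]

# Invented score lists in the shape the pipeline returns for one message.
confident = [
    {"label": "web_search", "score": 0.93},
    {"label": "code_generation", "score": 0.04},
]
mixed = [
    {"label": "web_search", "score": 0.52},
    {"label": "code_generation", "score": 0.45},
]
print(pick_intent(confident))  # web_search
print(pick_intent(mixed))      # None
```

Since the classifier predicts a single label, a message with two real intents tends to split probability between them. A small margin is therefore a hint, not proof, that the message needs more than one tool.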
## License
Apache 2.0. Commercial use is permitted, subject to the license terms.
## Citation
If you use this model, a star on the repo is appreciated!