---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
  - text-classification
  - intent-detection
  - agent-routing
  - mcp
  - ai-agents
  - distilbert
  - tool-use
datasets:
  - custom
language:
  - en
metrics:
  - accuracy
  - f1
pipeline_tag: text-classification
library_name: transformers
---

# AgentIntentRouter

A fast, lightweight intent classifier for AI agent and MCP tool routing. Given a user message, it predicts which tool or capability the agent should invoke — in under 50ms on CPU.

Built on DistilBERT (66M parameters) and fine-tuned on 8,987 synthetic examples (11,234 including the validation and test splits) across 8 intent categories.

## Why This Exists

Agent frameworks such as LangChain, LangGraph, CrewAI, and AutoGen often spend an entire LLM call just to figure out *what the user wants*. That can cost 1-3 seconds and ~$0.01 per request, just for routing.

AgentIntentRouter replaces that first LLM call with a 66M-parameter classifier that runs in **~10ms on CPU** and **~2ms on GPU**. Use it as the first step in your agent pipeline to route to the right tool right away.

## Intent Categories

| Label | Description | Example |
|-------|-------------|---------|
| `code_generation` | User wants code written, debugged, or refactored | "Write a Python function to parse CSV" |
| `web_search` | User wants to find information online | "What's the latest news on AI regulation" |
| `math_calculation` | User needs computation or conversion | "Calculate 15% of 4500" |
| `file_operation` | User wants to read, write, or manage files | "Read the config.json file" |
| `api_call` | User wants to interact with an external API | "Send a Slack message to the team" |
| `creative_writing` | User wants text composed or drafted | "Write a professional email to the client" |
| `data_analysis` | User wants data interpreted or compared | "Compare React vs Vue performance" |
| `general_chat` | Casual conversation, greetings, feedback | "Hey, how are you?" |

## Quick Start

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Single prediction
result = router("Write a Python function to sort a list")
print(result)
# [{'label': 'code_generation', 'score': 0.98}]

# Batch prediction
messages = [
    "Search for the latest AI papers",
    "What's 25% of 1200?",
    "Draft an email to my boss about the deadline",
    "Hello!",
]
results = router(messages)
for msg, res in zip(messages, results):
    print(f"  {res['label']:>20} ({res['score']:.2f}) — {msg}")
```

## Use as Agent Router

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# The handle_* functions and fallback_llm_route are your own callables,
# defined elsewhere in your agent. Each one takes the user message as a string.
TOOL_MAP = {
    "code_generation": handle_code_request,
    "web_search": handle_search,
    "math_calculation": handle_calculation,
    "file_operation": handle_file_ops,
    "api_call": handle_api_call,
    "creative_writing": handle_writing,
    "data_analysis": handle_analysis,
    "general_chat": handle_chat,
}

def route(user_message: str):
    # The pipeline returns a list with one {"label", "score"} dict per input.
    intent = router(user_message)[0]

    if intent["score"] < 0.5:
        # Low confidence: fall back to an LLM for routing
        return fallback_llm_route(user_message)

    handler = TOOL_MAP[intent["label"]]
    return handler(user_message)
```
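
To try the routing logic without downloading the model or wiring up real tools, the sketch below swaps in stand-ins. `fake_router`, `make_handler`, and `fallback` are hypothetical names used only for illustration, and the keyword rule inside `fake_router` is not how the model behaves. The 0.5 threshold is the one used above. The sketch also shows one possible way to treat a label that is missing from the tool map as a fallback case instead of a `KeyError`.

```python
from typing import Callable, Dict, List

Prediction = Dict[str, object]  # {"label": str, "score": float}

def make_handler(name: str) -> Callable[[str], str]:
    """Hypothetical stand-in for a real tool handler."""
    return lambda message: f"{name} handled: {message}"

def fallback(message: str) -> str:
    """Hypothetical stand-in for an LLM-based routing fallback."""
    return f"fallback handled: {message}"

def route(
    message: str,
    classify: Callable[[str], List[Prediction]],
    tool_map: Dict[str, Callable[[str], str]],
    threshold: float = 0.5,
) -> str:
    top = classify(message)[0]
    handler = tool_map.get(top["label"])
    # Fall back when the label is unknown or the score is below the threshold.
    if handler is None or top["score"] < threshold:
        return fallback(message)
    return handler(message)

def fake_router(message: str) -> List[Prediction]:
    """Keyword stand-in for the real pipeline; for illustration only."""
    if "python" in message.lower():
        return [{"label": "code_generation", "score": 0.98}]
    return [{"label": "general_chat", "score": 0.30}]

TOOL_MAP = {
    "code_generation": make_handler("code_generation"),
    "general_chat": make_handler("general_chat"),
}

print(route("Write a Python function", fake_router, TOOL_MAP))
print(route("hmm", fake_router, TOOL_MAP))
```

Passing the classifier in as an argument keeps `route` easy to unit test; in a real agent you would pass the `router` pipeline in place of `fake_router`.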

## Performance

- **Inference speed:** ~10ms on CPU, ~2ms on GPU
- **Model size:** ~260MB (DistilBERT-base)
- **Accuracy:** 100% on the held-out test set (synthetic, same distribution as training; see the note under Evaluation Results)
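
Latency depends on hardware, input length, and batch size, so it is worth measuring on your own machine. The helper below is a minimal sketch of one way to do that. `benchmark` and `dummy_classifier` are hypothetical names, and the warmup and run counts are arbitrary defaults, not the settings used for the figures above.

```python
import statistics
import time
from typing import Callable, Dict

def benchmark(
    fn: Callable[[str], object],
    message: str,
    warmup: int = 5,
    runs: int = 50,
) -> Dict[str, float]:
    """Time fn(message) and report latency in milliseconds."""
    for _ in range(warmup):
        fn(message)  # warm caches before timing
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(message)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
    }

def dummy_classifier(message: str):
    """Hypothetical stand-in; pass the real `router` pipeline here instead."""
    return [{"label": "general_chat", "score": 1.0}]

stats = benchmark(dummy_classifier, "Hey, how are you?")
print(stats)
```

Reporting a median and a tail percentile rather than a single run gives a steadier picture, since the first few calls usually include one-off setup cost.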

### Evaluation Results

*Results on held-out test set (1,124 examples):*

| Metric | Score |
|--------|-------|
| Accuracy | 1.000 |
| F1 (weighted) | 1.000 |

*Per-class performance:*

| Intent | Precision | Recall | F1 | Support |
|--------|-----------|--------|-----|---------|
| code_generation | 1.000 | 1.000 | 1.000 | 130 |
| web_search | 1.000 | 1.000 | 1.000 | 151 |
| math_calculation | 1.000 | 1.000 | 1.000 | 153 |
| file_operation | 1.000 | 1.000 | 1.000 | 154 |
| api_call | 1.000 | 1.000 | 1.000 | 133 |
| creative_writing | 1.000 | 1.000 | 1.000 | 160 |
| data_analysis | 1.000 | 1.000 | 1.000 | 168 |
| general_chat | 1.000 | 1.000 | 1.000 | 75 |

> **Note:** These results are on synthetic test data from the same distribution as training. Real-world performance will vary. Use the confidence score threshold to handle ambiguous inputs gracefully.
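
One way to pick that threshold is to label a small sample of your own traffic and check, for each candidate threshold, how many messages would be routed by the classifier (coverage) and how often those routed messages are correct. The sketch below assumes you have already collected `(top_score, was_correct)` pairs. The function name and the sample records are made up for illustration and are not measurements from this model.

```python
from typing import Iterable, List, Tuple

def sweep_thresholds(
    records: Iterable[Tuple[float, bool]],
    thresholds: Iterable[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, coverage, accuracy_on_routed) rows."""
    records = list(records)
    rows = []
    for t in thresholds:
        # Messages at or above the threshold are routed by the classifier.
        routed = [correct for score, correct in records if score >= t]
        coverage = len(routed) / len(records) if records else 0.0
        accuracy = sum(routed) / len(routed) if routed else float("nan")
        rows.append((t, coverage, accuracy))
    return rows

# Made-up records for illustration only.
sample = [
    (0.99, True), (0.95, True), (0.80, True),
    (0.62, False), (0.55, True), (0.41, False),
]
rows = sweep_thresholds(sample, [0.5, 0.7, 0.9])
for t, cov, acc in rows:
    print(f"threshold={t:.1f} coverage={cov:.2f} accuracy={acc:.2f}")
```

A higher threshold sends more traffic to the LLM fallback in exchange for fewer misroutes, so the right value depends on how costly a wrong route is in your application.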

## Training Details

- **Base model:** distilbert-base-uncased
- **Training data:** 8,987 examples (synthetic, template-generated with natural language variation)
- **Validation:** 1,123 examples
- **Test:** 1,124 examples
- **Epochs:** 3 (with early stopping, patience=2)
- **Learning rate:** 2e-5
- **Batch size:** 32
- **Max sequence length:** 128
- **Training time:** ~100 seconds on NVIDIA RTX 4070
- **Loss:** 0.0015 (training) / 0.0017 (validation)
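
As a quick sanity check on these numbers, the arithmetic below derives the total dataset size, the split ratios, and the number of optimizer steps. It assumes a single device, no gradient accumulation, a kept final partial batch, and that all 3 epochs ran; none of those details are stated above.

```python
import math

train, val, test = 8_987, 1_123, 1_124
batch_size, epochs = 32, 3

total = train + val + test
steps_per_epoch = math.ceil(train / batch_size)  # last partial batch counted
total_steps = steps_per_epoch * epochs

print(f"total examples: {total}")
print(f"split: {train / total:.1%} / {val / total:.1%} / {test / total:.1%}")
print(f"optimizer steps: {steps_per_epoch} per epoch, {total_steps} in total")
```

The splits work out to roughly 80/10/10, and a run of a few hundred steps is consistent with the short training time reported.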

## Limitations

- Trained on English text only
- Template-generated training data may not cover all edge cases
- Ambiguous messages (e.g., "help me with the API code") may get lower confidence scores — use the confidence threshold to fall back to an LLM
- Not designed for multi-intent messages (e.g., "search for X and write code for Y")
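
For ambiguous or multi-intent messages, the top score alone can hide a close runner-up. A possible extra check is the gap between the two highest scores, sketched below. `is_ambiguous` and the 0.2 margin are illustrative assumptions, and the score lists are made up. With the `transformers` text-classification pipeline, calling it with `top_k=None` should return a score for every label, though the exact nesting of the result can differ between library versions.

```python
from typing import Dict, List

def is_ambiguous(
    scores: List[Dict[str, object]],
    min_top: float = 0.5,
    min_margin: float = 0.2,
) -> bool:
    """Flag a message when the best label is weak or barely ahead of the runner-up.

    scores: {"label", "score"} dicts covering every label, in any order.
    """
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    top_score = ranked[0]["score"]
    runner_up_score = ranked[1]["score"] if len(ranked) > 1 else 0.0
    return top_score < min_top or (top_score - runner_up_score) < min_margin

# Made-up score lists for illustration only.
clear = [
    {"label": "code_generation", "score": 0.97},
    {"label": "api_call", "score": 0.02},
]
mixed = [
    {"label": "api_call", "score": 0.48},
    {"label": "code_generation", "score": 0.45},
]

print(is_ambiguous(clear))
print(is_ambiguous(mixed))
```

Messages flagged this way can be sent to the same LLM fallback used for low-confidence predictions.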

## License

Apache 2.0. Use it freely, including commercially, under the terms of the license.

## Citation

If you use this model, a star on the repo is appreciated!