---
license: apache-2.0
language:
- en
tags:
- text-classification
- distilbert
- query-complexity
- agent-routing
- llm-routing
- ai-agents
- tool-use
pipeline_tag: text-classification
---

# QueryComplexityRouter

A fast, lightweight 3-class classifier that decides **how much LLM power a query needs** — before you spend tokens on it.

Built on DistilBERT (66M params), fine-tuned to classify any user message into one of three complexity tiers:

| Label | Meaning | Suggested Action |
|---|---|---|
| `no_llm` | Answerable with rules, lookup, or regex | Skip the LLM entirely |
| `small_llm` | A 1–3B model (e.g., Gemma-2B, or the slightly larger Phi-3-mini at 3.8B) is sufficient | Route to a cheap local model |
| `large_llm` | Requires a 7B+ or frontier model (e.g., GPT-4, Claude) | Route to a powerful model |

## Why This Exists

Running every query through a frontier LLM is expensive and slow, but sending complex queries to a tiny model under-serves them.

**QueryComplexityRouter** sits at the top of your pipeline and makes this decision in **~10ms on CPU** — before any LLM call is made.

Pair it with [AgentIntentRouter](https://huggingface.co/tripathyShaswata/AgentIntentRouter) for a full 2-stage routing pipeline:

```
User Message
    │
    ▼
AgentIntentRouter          ← What does the user want? (code, search, chat, ...)
    │
    ▼
QueryComplexityRouter      ← How hard is it? (no_llm / small_llm / large_llm)
    │
    ▼
Route to the right tool/model
```

## Quick Start

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/QueryComplexityRouter")

# Single prediction
result = router("What is 15% of 4500?")
print(result)
# [{'label': 'no_llm', 'score': 0.98}]

# Batch
messages = [
    "What is the capital of France?",           # no_llm
    "Explain recursion in simple terms.",        # small_llm
    "Write a 1000-word blog post about AI.",     # large_llm
    "Design a distributed caching system.",      # large_llm
    "Fix this bug: def add(a,b): return a-b",   # small_llm
]
results = router(messages)
for msg, res in zip(messages, results):
    print(f"  {res['label']:>12} ({res['score']:.2f}) — {msg}")
```

## 2-Stage Routing Pipeline

```python
from transformers import pipeline

intent_router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")
complexity_router = pipeline("text-classification", model="tripathyShaswata/QueryComplexityRouter")

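# Placeholder handlers (hypothetical): replace these with your own implementations.
def handle_with_rules(message: str, intent: str) -> str:
    return f"[rules:{intent}] {message}"

def call_small_model(message: str) -> str:
    return f"[small_llm] {message}"

def call_large_model(message: str) -> str:
    return f"[large_llm] {message}"

def route(user_message: str):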
    intent = intent_router(user_message)[0]
    complexity = complexity_router(user_message)[0]

    print(f"Intent:     {intent['label']} ({intent['score']:.2f})")
    print(f"Complexity: {complexity['label']} ({complexity['score']:.2f})")

    if complexity["label"] == "no_llm":
        return handle_with_rules(user_message, intent["label"])
    elif complexity["label"] == "small_llm":
        return call_small_model(user_message)
    else:
        return call_large_model(user_message)
```

## Complexity Labels

### `no_llm` — No LLM needed
- Simple math: *"What is 42 * 7?"*
- Unit conversion: *"Convert 100km to miles"*
- Factual lookup: *"What is the capital of Japan?"*
- Date/time: *"What day is March 15 2026?"*
- Simple commands: *"Set a timer for 5 minutes"*
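
The `no_llm` tier assumes that some rule-based handler exists downstream. As a minimal sketch (the function name `answer_with_rules`, the regex patterns, and the supported query shapes are illustrative assumptions, not part of this model), simple arithmetic, percentages, and km-to-miles conversion could be handled like this:

```python
import operator
import re
from typing import Optional

# Hypothetical rule set: each pattern covers one simple query shape.
_PERCENT = re.compile(r"what is (\d+(?:\.\d+)?)\s*% of (\d+(?:\.\d+)?)", re.I)
_ARITH = re.compile(r"what is (\d+(?:\.\d+)?)\s*([-+*/])\s*(\d+(?:\.\d+)?)", re.I)
_KM_TO_MILES = re.compile(r"convert (\d+(?:\.\d+)?)\s*km to miles", re.I)

_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
KM_PER_MILE = 1.609344


def answer_with_rules(message: str) -> Optional[str]:
    """Return an answer string, or None if no rule matches (caller should escalate)."""
    if m := _PERCENT.search(message):
        pct, base = float(m.group(1)), float(m.group(2))
        return f"{pct * base / 100:g}"
    if m := _ARITH.search(message):
        left, op, right = float(m.group(1)), m.group(2), float(m.group(3))
        if op == "/" and right == 0:
            return None  # avoid division by zero; let a model handle it
        return f"{_OPS[op](left, right):g}"
    if m := _KM_TO_MILES.search(message):
        return f"{float(m.group(1)) / KM_PER_MILE:.2f} miles"
    return None
```

Returning `None` when nothing matches lets the caller fall back to `small_llm` instead of failing, which is useful when the classifier says `no_llm` but no rule applies.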

### `small_llm` — 1–3B model sufficient
- Short summarization: *"Summarize this paragraph..."*
- Basic explanation: *"Explain recursion to a 10-year-old"*
- Simple code: *"Write a Python function to reverse a string"*
- Short generation: *"Write a one-line bio for a software engineer"*
- Simple classification: *"Is this email spam?"*

### `large_llm` — 7B+ / frontier model required
- Deep reasoning: *"Analyze the ethical implications of AI replacing jobs"*
- Long-form writing: *"Write a 1000-word blog post about quantum computing"*
- Complex code: *"Build a REST API with auth, error handling, and tests"*
- Multi-doc synthesis: *"Given these 5 documents, synthesize an answer..."*
- System design: *"Design a distributed caching system with eventual consistency"*

## Performance

- **Inference speed**: ~10ms on CPU, ~2ms on GPU
- **Model size**: ~260MB (DistilBERT-base)
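
Latency depends on hardware, sequence length, and batch size, so it is worth measuring on your own machine. A minimal timing sketch that works with any callable classifier (the helper name `measure_latency_ms` and the warm-up count are illustrative choices, not part of this model):

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency_ms(
    classify: Callable[[str], object],
    messages: List[str],
    warmup: int = 3,
) -> Dict[str, float]:
    """Time single-message calls and report median and p95 latency in milliseconds."""
    if not messages:
        raise ValueError("messages must not be empty")

    # Warm-up calls so one-time setup costs do not skew the numbers.
    for _ in range(warmup):
        classify(messages[0])

    timings = []
    for msg in messages:
        start = time.perf_counter()
        classify(msg)
        timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    p95_index = min(len(timings) - 1, int(0.95 * len(timings)))
    return {"median_ms": statistics.median(timings), "p95_ms": timings[p95_index]}
```

With the `router` pipeline from Quick Start, calling `measure_latency_ms(router, messages)` would time real inferences one message at a time.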

### Evaluation Results

Results on the held-out test set:

| Metric | Score |
|---|---|
| Accuracy | ~0.99 |
| F1 (weighted) | ~0.99 |

Per-class performance:

| Class | Precision | Recall | F1 |
|---|---|---|---|
| no_llm | ~1.00 | ~1.00 | ~1.00 |
| small_llm | ~0.98 | ~0.98 | ~0.98 |
| large_llm | ~0.99 | ~0.99 | ~0.99 |

> Note: These results are on synthetic test data drawn from the same distribution as the training data, so real-world performance will vary. Apply a confidence-score threshold to handle ambiguous inputs gracefully.
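
Because the reported numbers come from synthetic data, a quick check on a small hand-labeled sample of real traffic can be informative. A sketch of per-class precision, recall, and F1 in plain Python (the function name `per_class_metrics` is an assumption, and the predictions would come from running the router on your own examples):

```python
from typing import Dict, List

LABELS = ["no_llm", "small_llm", "large_llm"]


def per_class_metrics(y_true: List[str], y_pred: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall, and F1 for each complexity label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    metrics = {}
    for label in LABELS:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```

For routing, recall on `large_llm` is probably the number to watch most closely, since a missed `large_llm` query ends up under-served by a smaller model.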

## Training Details

- **Base model**: distilbert-base-uncased
- **Training data**: ~1,400 synthetic examples per class (~4,200 total), template-generated with natural language variation
- **Epochs**: 5 (with early stopping, patience=2)
- **Learning rate**: 2e-5
- **Batch size**: 32
- **Max sequence length**: 128
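
The training set is described as template-generated. The actual templates are not published in this card, so the following is only a guess at the general technique: fill slot values into per-class templates to produce labeled examples. Every template, every slot value, and the name `generate_examples` are invented for illustration.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical templates and slot values; the real training templates are not shown in this card.
TEMPLATES: Dict[str, List[str]] = {
    "no_llm": ["What is {a} * {b}?", "Convert {a} km to miles"],
    "small_llm": ["Explain {topic} in simple terms.", "Write a one-line bio for a {role}"],
    "large_llm": ["Write a 1000-word blog post about {topic}.", "Design a {system} with eventual consistency"],
}
SLOTS: Dict[str, List[str]] = {
    "topic": ["recursion", "quantum computing", "AI"],
    "role": ["software engineer", "data scientist"],
    "system": ["distributed caching system", "message queue"],
}


def generate_examples(per_class: int, seed: int = 0) -> List[Tuple[str, str]]:
    """Return (text, label) pairs, `per_class` examples for each label."""
    rng = random.Random(seed)
    examples = []
    for label, templates in TEMPLATES.items():
        for _ in range(per_class):
            template = rng.choice(templates)
            # Draw a value for every slot; unused slots are ignored by str.format.
            values = {name: rng.choice(options) for name, options in SLOTS.items()}
            values.update(a=rng.randint(1, 999), b=rng.randint(1, 999))
            examples.append((template.format(**values), label))
    return examples
```

Seeding the generator keeps the dataset reproducible, which makes it easier to compare fine-tuning runs.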

## Use in Agent Pipelines

```python
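from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/QueryComplexityRouter")

# Minimum confidence required to trust each predicted label
COMPLEXITY_THRESHOLDS = {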
    "no_llm": 0.7,
    "small_llm": 0.6,
    "large_llm": 0.6,
}

def smart_route(message: str):
    result = router(message)[0]
    label, score = result["label"], result["score"]

    if score < COMPLEXITY_THRESHOLDS[label]:
        # Low confidence — default to large_llm for safety
        label = "large_llm"

    return label
```

## Limitations

- Trained on English text only
- Template-generated data may not cover all edge cases
- Borderline queries (e.g., *"explain quantum entanglement"*) may get lower confidence — use threshold fallback
- Complexity is query-level only; does not account for context window length or domain expertise needed
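
Since the classifier looks only at the query text, one possible workaround for long inputs is to escalate the tier when the message exceeds a size budget. A sketch under assumed budgets (the word limits and the name `escalate_for_length` are illustrative, and whitespace word counts are only a rough proxy for tokens):

```python
TIER_ORDER = ["no_llm", "small_llm", "large_llm"]

# Hypothetical per-tier word budgets; tune these for your own models.
MAX_WORDS = {"no_llm": 50, "small_llm": 1500}


def escalate_for_length(label: str, message: str) -> str:
    """Bump the predicted tier upward while the message exceeds that tier's word budget."""
    if label not in TIER_ORDER:
        raise ValueError(f"unknown label: {label}")
    n_words = len(message.split())
    index = TIER_ORDER.index(label)
    while index < len(TIER_ORDER) - 1 and n_words > MAX_WORDS[TIER_ORDER[index]]:
        index += 1
    return TIER_ORDER[index]
```

This would run after the classifier and any confidence fallback, so the final tier reflects both predicted complexity and input size.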

## Related Models

- [tripathyShaswata/AgentIntentRouter](https://huggingface.co/tripathyShaswata/AgentIntentRouter) — companion intent classifier (8 categories, ~10ms on CPU)

## License

Apache 2.0 — use it however you want, commercial included.

## Citation

If this helps you, a star is appreciated!