---
language:
- en
library_name: adaptive-classifier
license: apache-2.0
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- llm
- routing
- multi-model
- bert
- router-arena
- model-selection
---
# Chayan: Multi-Model LLM Router
This model is a high-performance LLM router presented in the paper [RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers](https://huggingface.co/papers/2510.00202).
- 📄 Paper (Hugging Face): [RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers](https://huggingface.co/papers/2510.00202)
- 📄 Paper (arXiv): https://arxiv.org/abs/2510.00202
- 💻 Library Code: https://github.com/codelion/adaptive-classifier
- 🏆 RouterArena Project Page: https://routeworks.github.io/
**Chayan** intelligently routes between 4 models (gpt-4o-mini, gemini-2.5-flash-lite, gemini-2.5-flash, and gpt-4o) to optimize the accuracy-cost tradeoff.
## 🏆 RouterArena Performance
**Official Leaderboard Results** (8,400 queries):
- 🥇 **#1 Optimal Accuracy Score: 88.7%** (state of the art; best routing-decision quality)
- 🥈 **#2 Optimal Selection Score: 43.0%** (second-best model selection)
- **#7 Overall** (#5 open-source): 64.9% accuracy, 63.8 arena score
- **$0.60 per 1K queries** - Cost-efficient routing

**What do these metrics mean?**
- **Optimal Accuracy**: When Chayan routes to a model, that model gives the correct answer 88.7% of the time
- **Optimal Selection**: Chayan selects the best available model 43% of the time
View full leaderboard: [RouterArena](https://routeworks.github.io/) | [PR #24](https://github.com/RouteWorks/RouterArena/pull/24)
## Quick Start
```bash
pip install adaptive-classifier
```
```python
from adaptive_classifier import AdaptiveClassifier

# Load the router
router = AdaptiveClassifier.load("adaptive-classifier/chayan")

# Get a routing decision
query = "What is the capital of France?"
predictions = router.predict(query, k=4)

# Route to the top-ranked model
selected_model = predictions[0][0]  # e.g., "openai/gpt-4o-mini"
```
### Recommended: Use with Calibration
```python
# Apply calibration factors for best performance
# (continues from Quick Start: `router` and `query` are defined there)
calibration = {
    "openai/gpt-4o-mini": 0.9,
    "google/gemini-2.5-flash-lite": 1.5,
    "google/gemini-2.5-flash": 1.8,
    "openai/gpt-4o": 1.5,
}

predictions = router.predict(query, k=4)
calibrated_scores = {model: score * calibration[model] for model, score in predictions}
selected_model = max(calibrated_scores.items(), key=lambda x: x[1])[0]
```
## Architecture
**Core Components:**
- **Base Model**: BERT-base-uncased embeddings
- **Classifier**: Adaptive K-NN with prototype memory (FAISS-backed)
- **Innovation**: Calibrated confidence scores to correct training data imbalance
**Supported Models:**
| Model | Use Case | Cost/1M tokens |
|-------|----------|----------------|
| openai/gpt-4o-mini | Simple queries | $0.15 |
| google/gemini-2.5-flash-lite | Medium complexity | $0.075 |
| google/gemini-2.5-flash | Higher complexity | $0.30 |
| openai/gpt-4o | Complex queries | $2.50 |
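As a rough illustration of how the price table translates into routing cost, the sketch below estimates cost per 1K queries from a routing mix. The per-query token count and the example mix are assumptions for illustration, not measured values from Chayan:

```python
# Hypothetical sketch: estimate routing cost per 1K queries from the
# price table above. Token count and routing mix are illustrative
# assumptions, not measurements.
COST_PER_1M_TOKENS = {
    "openai/gpt-4o-mini": 0.15,
    "google/gemini-2.5-flash-lite": 0.075,
    "google/gemini-2.5-flash": 0.30,
    "openai/gpt-4o": 2.50,
}

def cost_per_1k_queries(routing_mix, tokens_per_query=500):
    """routing_mix maps model name -> fraction of queries routed there."""
    tokens_per_1k_queries = 1000 * tokens_per_query
    return sum(
        frac * COST_PER_1M_TOKENS[model] * tokens_per_1k_queries / 1_000_000
        for model, frac in routing_mix.items()
    )

# Example mix: mostly cheap models, occasional escalation to gpt-4o
mix = {
    "openai/gpt-4o-mini": 0.55,
    "google/gemini-2.5-flash-lite": 0.15,
    "google/gemini-2.5-flash": 0.15,
    "openai/gpt-4o": 0.15,
}
print(f"${cost_per_1k_queries(mix):.3f} per 1K queries")
```

This is why routing a large share of traffic to the cheap models dominates the cost profile: escalating even 15% of queries to gpt-4o accounts for most of the spend.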
## How It Works
### Training
- **Dataset**: RouterArena sub_10 (809 queries)
- **Oracle Labels**: 4-model cascade strategy (select cheapest successful model)
- **Training Time**: 19.2 minutes
- **Method**: K-NN classifier with 3000 prototypes, temperature 0.4
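The cascade oracle labeling above can be sketched as follows. The model ordering and the per-query correctness records are illustrative assumptions; the actual training pipeline lives in the adaptive-classifier repository:

```python
# Hypothetical sketch of the 4-model cascade oracle: label each training
# query with the cheapest model that answered it correctly.
MODELS_CHEAPEST_FIRST = [
    "google/gemini-2.5-flash-lite",  # $0.075 / 1M tokens
    "openai/gpt-4o-mini",            # $0.15
    "google/gemini-2.5-flash",       # $0.30
    "openai/gpt-4o",                 # $2.50
]

def oracle_label(correct_by_model):
    """correct_by_model maps model name -> bool (answered correctly).
    Returns the cheapest successful model, falling back to the
    strongest model when none succeeded."""
    for model in MODELS_CHEAPEST_FIRST:
        if correct_by_model.get(model):
            return model
    return MODELS_CHEAPEST_FIRST[-1]

# Illustrative record: flash-lite failed, so the label escalates
# to the next-cheapest model that succeeded.
label = oracle_label({
    "google/gemini-2.5-flash-lite": False,
    "openai/gpt-4o-mini": True,
    "openai/gpt-4o": True,
})
# label == "openai/gpt-4o-mini"
```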
### The Calibration Breakthrough
The uncalibrated router achieved 61.76% accuracy but was biased toward gpt-4o-mini, routing 83% of queries to it. The cause was class imbalance in the training data:
- 57% gpt-4o-mini examples
- 27% gpt-4o examples
- 12% gemini-flash-lite examples
- 4% gemini-flash examples
**Solution**: Apply post-training calibration factors to correct the bias without retraining.
**Result**: +7.29pp improvement (61.76% → 69.05% on the sub_10 benchmark)
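One simple way to see where such correction factors come from is inverse class frequency. The factors shipped with Chayan were tuned empirically, so this sketch shows the underlying idea rather than the exact derivation:

```python
# Hypothetical sketch: derive per-class correction factors from the
# training-label imbalance via inverse frequency. Chayan's published
# factors were tuned empirically; this illustrates the principle.
label_fractions = {
    "openai/gpt-4o-mini": 0.57,
    "openai/gpt-4o": 0.27,
    "google/gemini-2.5-flash-lite": 0.12,
    "google/gemini-2.5-flash": 0.04,
}

mean_frac = sum(label_fractions.values()) / len(label_fractions)
inverse_freq = {m: mean_frac / f for m, f in label_fractions.items()}
# Over-represented classes get a factor below 1, rare classes above 1,
# pushing routing decisions away from the majority class.
```

Note the same ordering holds in Chayan's actual factors: gpt-4o-mini (the majority class) is downweighted to 0.9, while the under-represented Gemini models get factors above 1.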
## Performance Benchmarks
**Sub_10 Benchmark (809 queries):**
| Router | Accuracy | Cost/1K |
|--------|----------|---------|
| All gpt-4o-mini (baseline) | 56.98% | $0.088 |
| 2-model router | 61.43% | $0.217 |
| Chayan (uncalibrated) | 61.76% | $0.269 |
| **Chayan (calibrated)** | **69.05%** | **$0.333** |
| Perfect 2-model oracle | 69.84% | $0.784 |
**Key Insight**: Chayan achieves 99% of perfect oracle performance at 57% lower cost.
**Full Dataset (8,400 queries):**
- **Optimal Accuracy**: 88.7% (🥇 #1)
- **Optimal Selection**: 43.0% (🥈 #2)
- **Overall Accuracy**: 64.9% (#7 overall, #5 open-source)
- **Cost**: $0.60/1K queries
## Advanced Usage
### Feature Augmentation
Chayan was trained with query features prepended as tokens:
```python
from adaptive_classifier.complexity_features import augment_query_with_features

# Prepend complexity features before routing
query = "What is 2+2?"
augmented = augment_query_with_features(query)
# Returns: "[LEN:12][WORDS:3][MATH:1][SENT:1][MC:0] What is 2+2?"

predictions = router.predict(augmented, k=4)
```
## Limitations
- Calibration factors optimized on RouterArena sub_10; may require adjustment for other domains
- Requires the 4 specific models to be available via API
- Performance depends on query distribution similar to RouterArena benchmark
- Cost estimates assume ~500 tokens per query
## Citation
```bibtex
@software{adaptive_classifier,
  title     = {Adaptive Classifier: Dynamic Text Classification with Continuous Learning},
  author    = {Sharma, Asankhaya},
  year      = {2025},
  publisher = {GitHub},
  url       = {https://github.com/codelion/adaptive-classifier}
}
```