---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-classification
- image-optimization
- technique-routing
- headroom
datasets:
- custom
metrics:
- accuracy
base_model: microsoft/MiniLM-L12-H384-uncased
pipeline_tag: text-classification
---

# Technique Router (MiniLM)

A fine-tuned MiniLM classifier that routes image queries to optimal compression techniques for the [Headroom SDK](https://github.com/headroom-ai/headroom).

## Model Description

This model classifies natural language queries about images into one of four optimization techniques:

| Technique | Token Savings | Best For |
|-----------|---------------|----------|
| `transcode` | ~99% | Text extraction, OCR tasks |
| `crop` | 50-90% | Region-specific queries |
| `full_low` | ~87% | General understanding |
| `preserve` | 0% | Fine details, counting |

## Training Data

- **Base examples**: 145 human-written queries
- **Expanded dataset**: 1,157 examples (via template expansion + synonyms)
- **Split**: 85% train, 15% validation

## Performance

- **Validation Accuracy**: 93.7%
- **Model Size**: ~128MB

### Per-Class Performance

| Class | Precision | Recall | F1-Score |
|-------|-----------|--------|----------|
| transcode | 0.95 | 0.92 | 0.93 |
| crop | 0.92 | 0.97 | 0.94 |
| preserve | 0.97 | 0.90 | 0.93 |
| full_low | 0.89 | 0.96 | 0.92 |

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model
model_id = "chopratejas/technique-router"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Classify a query
query = "What brand is the TV?"
inputs = tokenizer(query, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=-1)
pred_id = torch.argmax(probs, dim=-1).item()
confidence = probs[0][pred_id].item()
technique = model.config.id2label[pred_id]

print(f"{query} -> {technique} ({confidence:.0%})")
# Output: What brand is the TV? -> preserve (73%)
```

## With Headroom SDK

```python
from headroom.image import TrainedRouter

router = TrainedRouter()
decision = router.classify(image_bytes, "What brand is the TV?")
print(decision.technique)  # Technique.PRESERVE
```

## Intended Use

This model is designed for:

- Routing image analysis queries to optimal compression techniques
- Reducing token usage in vision-language model applications
- Enabling cost-effective image understanding at scale

## Limitations

- English language only
- Optimized for common image understanding queries
- May not generalize well to domain-specific terminology

## Citation

```bibtex
@misc{headroom-technique-router,
  title={Technique Router for Image Token Optimization},
  author={Headroom AI},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/chopratejas/technique-router}
}
```
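## Example: Confidence-Threshold Fallback

Since `preserve` is the lossless option (0% savings), one defensive pattern is to fall back to it when the classifier's top probability is low. A minimal sketch of that routing logic; the label ordering in `ID2LABEL` and the 0.5 threshold are illustrative assumptions, not part of the released model (read the real mapping from `model.config.id2label`):

```python
import torch

# Hypothetical label order for illustration; use model.config.id2label in practice.
ID2LABEL = {0: "transcode", 1: "crop", 2: "full_low", 3: "preserve"}

def pick_technique(probs: torch.Tensor, threshold: float = 0.5) -> str:
    """Given softmax probabilities for one query, return the predicted
    technique, falling back to 'preserve' when confidence is below threshold."""
    confidence, pred_id = probs.max(dim=-1)
    if confidence.item() < threshold:
        return "preserve"  # lossless fallback: 0% savings, no information lost
    return ID2LABEL[pred_id.item()]
```

This trades some token savings for safety: an uncertain routing decision degrades to full-fidelity image handling rather than risking a lossy technique on the wrong query type.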