---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-classification
- image-optimization
- technique-routing
- headroom
datasets:
- custom
metrics:
- accuracy
base_model: microsoft/MiniLM-L12-H384-uncased
pipeline_tag: text-classification
---
# Technique Router (MiniLM)
A fine-tuned MiniLM classifier that routes image queries to optimal compression techniques for the [Headroom SDK](https://github.com/headroom-ai/headroom).
## Model Description
This model classifies natural language queries about images into one of four optimization techniques:
| Technique | Token Savings | Best For |
|-----------|--------------|----------|
| `transcode` | ~99% | Text extraction, OCR tasks |
| `crop` | 50-90% | Region-specific queries |
| `full_low` | ~87% | General understanding |
| `preserve` | 0% | Fine details, counting |
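As a rough illustration of what the savings column means in practice, the sketch below estimates post-compression token counts from the table's figures. The `SAVINGS` map and `crop_fraction` parameter are illustrative assumptions, not part of the SDK:

```python
# Approximate token-savings estimates from the table above (illustrative only).
SAVINGS = {"transcode": 0.99, "full_low": 0.87, "preserve": 0.0}

def estimated_tokens(base_tokens: int, technique: str, crop_fraction: float = 0.3) -> int:
    """Rough token count after applying a technique. `crop` savings depend on
    the selected region, modeled here by the hypothetical `crop_fraction`."""
    if technique == "crop":
        return int(base_tokens * crop_fraction)
    return int(base_tokens * (1 - SAVINGS[technique]))

estimated_tokens(1000, "transcode")  # ~10 tokens
estimated_tokens(1000, "preserve")   # 1000 tokens (nothing discarded)
```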
## Training Data
- **Base examples**: 145 human-written queries
- **Expanded dataset**: 1,157 examples (via template expansion + synonyms)
- **Split**: 85% train, 15% validation
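The "template expansion + synonyms" step might look something like the sketch below. The templates, synonym lists, and helper name are hypothetical; the actual expansion pipeline is not published:

```python
import itertools

# Hypothetical seed templates and synonym sets (illustrative only).
TEMPLATES = ["{verb} the text in this image", "{verb} what the sign says"]
SYNONYMS = {"verb": ["read", "extract", "transcribe"]}

def expand(templates: list[str], synonyms: dict[str, list[str]]) -> list[str]:
    """Fill each template slot with every synonym combination."""
    examples = []
    for template in templates:
        keys = [k for k in synonyms if "{" + k + "}" in template]
        for combo in itertools.product(*(synonyms[k] for k in keys)):
            examples.append(template.format(**dict(zip(keys, combo))))
    return examples

queries = expand(TEMPLATES, SYNONYMS)  # 2 templates x 3 synonyms = 6 examples
```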
## Performance
- **Validation Accuracy**: 93.7%
- **Model Size**: ~128MB
### Per-Class Performance
| Class | Precision | Recall | F1-Score |
|-------|-----------|--------|----------|
| transcode | 0.95 | 0.92 | 0.93 |
| crop | 0.92 | 0.97 | 0.94 |
| preserve | 0.97 | 0.90 | 0.93 |
| full_low | 0.89 | 0.96 | 0.92 |
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model
model_id = "chopratejas/technique-router"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Classify a query
query = "What brand is the TV?"
inputs = tokenizer(query, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=-1)
pred_id = torch.argmax(probs, dim=-1).item()
confidence = probs[0][pred_id].item()
technique = model.config.id2label[pred_id]

print(f"{query} -> {technique} ({confidence:.0%})")
# Output: What brand is the TV? -> preserve (73%)
```
## With Headroom SDK
```python
from headroom.image import TrainedRouter
router = TrainedRouter()
decision = router.classify(image_bytes, "What brand is the TV?")
print(decision.technique) # Technique.PRESERVE
```
## Intended Use
This model is designed for:
- Routing image analysis queries to optimal compression techniques
- Reducing token usage in vision-language model applications
- Enabling cost-effective image understanding at scale
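In a production routing loop, one reasonable pattern is to fall back to `preserve` when the classifier's confidence is low, so that no image detail is discarded on an uncertain prediction. This is a sketch of that pattern, not part of the SDK API; the threshold value is an assumption:

```python
# Illustrative confidence-threshold fallback (not part of the Headroom SDK API):
# when the router is unsure, default to the lossless "preserve" technique.
def choose_technique(technique: str, confidence: float, threshold: float = 0.6) -> str:
    return technique if confidence >= threshold else "preserve"

choose_technique("transcode", 0.91)  # high confidence: keep the prediction
choose_technique("crop", 0.42)       # low confidence: fall back to "preserve"
```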
## Limitations
- English language only
- Optimized for common image understanding queries
- May not generalize well to domain-specific terminology
## Citation
```bibtex
@misc{headroom-technique-router,
  title={Technique Router for Image Token Optimization},
  author={Headroom AI},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/chopratejas/technique-router}
}
```