---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-classification
- image-optimization
- technique-routing
- headroom
datasets:
- custom
metrics:
- accuracy
base_model: microsoft/MiniLM-L12-H384-uncased
pipeline_tag: text-classification
---

# Technique Router (MiniLM)

A fine-tuned MiniLM classifier that routes image queries to optimal compression techniques for the [Headroom SDK](https://github.com/headroom-ai/headroom).

## Model Description

This model classifies natural language queries about images into one of four optimization techniques:

| Technique | Token Savings | Best For |
|-----------|---------------|----------|
| `transcode` | ~99% | Text extraction, OCR tasks |
| `crop` | 50-90% | Region-specific queries |
| `full_low` | ~87% | General understanding |
| `preserve` | 0% | Fine details, counting |

## Training Data

- **Base examples**: 145 human-written queries
- **Expanded dataset**: 1,157 examples (via template expansion + synonyms)
- **Split**: 85% train, 15% validation

## Performance

- **Validation Accuracy**: 93.7%
- **Model Size**: ~128MB

### Per-Class Performance

| Class | Precision | Recall | F1-Score |
|-------|-----------|--------|----------|
| transcode | 0.95 | 0.92 | 0.93 |
| crop | 0.92 | 0.97 | 0.94 |
| preserve | 0.97 | 0.90 | 0.93 |
| full_low | 0.89 | 0.96 | 0.92 |

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model
model_id = "chopratejas/technique-router"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Classify a query
query = "What brand is the TV?"
inputs = tokenizer(query, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=-1)
pred_id = torch.argmax(probs, dim=-1).item()
confidence = probs[0][pred_id].item()
technique = model.config.id2label[pred_id]

print(f"{query} -> {technique} ({confidence:.0%})")
# Output: What brand is the TV? -> preserve (73%)
```

## With Headroom SDK

```python
from headroom.image import TrainedRouter

router = TrainedRouter()
decision = router.classify(image_bytes, "What brand is the TV?")
print(decision.technique)  # Technique.PRESERVE
```

## Intended Use

This model is designed for:

- Routing image analysis queries to optimal compression techniques
- Reducing token usage in vision-language model applications
- Enabling cost-effective image understanding at scale

## Limitations

- English language only
- Optimized for common image understanding queries
- May not generalize well to domain-specific terminology

## Citation

```bibtex
@misc{headroom-technique-router,
  title={Technique Router for Image Token Optimization},
  author={Headroom AI},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/chopratejas/technique-router}
}
```
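## Example: Confidence-Threshold Fallback

Since `preserve` is the lossless option (0% savings), one defensive pattern is to fall back to it when the classifier's top probability is low. A minimal sketch of that routing logic; the label ordering in `ID2LABEL` and the 0.5 threshold are illustrative assumptions, not part of the released model (read the real mapping from `model.config.id2label`):

```python
import torch

# Hypothetical label order for illustration; use model.config.id2label in practice.
ID2LABEL = {0: "transcode", 1: "crop", 2: "full_low", 3: "preserve"}

def pick_technique(probs: torch.Tensor, threshold: float = 0.5) -> str:
    """Given softmax probabilities for one query, return the predicted
    technique, falling back to 'preserve' when confidence is below threshold."""
    confidence, pred_id = probs.max(dim=-1)
    if confidence.item() < threshold:
        return "preserve"  # lossless fallback: 0% savings, no information lost
    return ID2LABEL[pred_id.item()]
```

This trades some token savings for safety: an uncertain routing decision degrades to full-fidelity image handling rather than risking a lossy technique on the wrong query type.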