---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-classification
- image-optimization
- technique-routing
- headroom
datasets:
- custom
metrics:
- accuracy
base_model: microsoft/MiniLM-L12-H384-uncased
pipeline_tag: text-classification
---

# Technique Router (MiniLM)

A fine-tuned MiniLM classifier that routes image queries to the optimal compression technique for the [Headroom SDK](https://github.com/headroom-ai/headroom).

## Model Description

This model classifies natural language queries about images into one of four optimization techniques:

| Technique | Token Savings | Best For |
|-----------|---------------|----------|
| `transcode` | ~99% | Text extraction, OCR tasks |
| `crop` | 50-90% | Region-specific queries |
| `full_low` | ~87% | General understanding |
| `preserve` | 0% | Fine details, counting |

## Training Data

- **Base examples**: 145 human-written queries
- **Expanded dataset**: 1,157 examples (via template expansion + synonyms)
- **Split**: 85% train, 15% validation
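
The expansion pipeline itself is not published; a minimal sketch of how template expansion with synonym substitution can multiply a small seed set might look like the following. The templates and synonym lists here are illustrative, not the actual training assets:

```python
from itertools import product

# Illustrative templates and synonym lists -- NOT the actual training assets.
templates = [
    "What {noun} is shown in the {region}?",
    "Read the {noun} in the {region}.",
]
synonyms = {
    "noun": ["text", "label", "sign"],
    "region": ["top-left corner", "bottom half", "image"],
}

# Cross every template with every synonym combination.
expanded = [
    t.format(noun=n, region=r)
    for t, (n, r) in product(templates, product(synonyms["noun"], synonyms["region"]))
]
print(len(expanded))  # 2 templates x 3 nouns x 3 regions = 18 variants
```

Applied across many templates and larger synonym sets, this kind of expansion takes a base set of 145 queries to the ~1,200-example range reported above.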

## Performance

- **Validation Accuracy**: 93.7%
- **Model Size**: ~128 MB

### Per-Class Performance

| Class | Precision | Recall | F1-Score |
|-------|-----------|--------|----------|
| transcode | 0.95 | 0.92 | 0.93 |
| crop | 0.92 | 0.97 | 0.94 |
| preserve | 0.97 | 0.90 | 0.93 |
| full_low | 0.89 | 0.96 | 0.92 |
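For a single headline figure, the macro-averaged F1 over the four classes works out as follows:

```python
# Per-class F1 scores taken from the table above.
f1 = {"transcode": 0.93, "crop": 0.94, "preserve": 0.93, "full_low": 0.92}

macro_f1 = sum(f1.values()) / len(f1)
print(f"Macro F1: {macro_f1:.3f}")  # 0.930
```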
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model
model_id = "chopratejas/technique-router"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Classify a query
query = "What brand is the TV?"
inputs = tokenizer(query, return_tensors="pt", truncation=True, padding=True)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    pred_id = torch.argmax(probs, dim=-1).item()
    confidence = probs[0][pred_id].item()

technique = model.config.id2label[pred_id]
print(f"{query} -> {technique} ({confidence:.0%})")
# Output: What brand is the TV? -> preserve (73%)
```
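
In production, a common pattern is to fall back to the safest technique when the classifier is not confident, since `preserve` never discards information. The helper below is a hypothetical sketch on top of the usage example: the threshold value, function name, and fallback choice are assumptions, not part of the model or the Headroom SDK:

```python
def route_with_fallback(technique: str, confidence: float,
                        threshold: float = 0.6,
                        fallback: str = "preserve") -> str:
    """Return the predicted technique, or a lossless fallback
    when the classifier's confidence is below the threshold."""
    return technique if confidence >= threshold else fallback

# With the example prediction above (preserve at 73%), the route stands;
# a low-confidence transcode prediction would be overridden.
print(route_with_fallback("preserve", 0.73))   # preserve
print(route_with_fallback("transcode", 0.41))  # preserve
```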

## With Headroom SDK

```python
from headroom.image import TrainedRouter

router = TrainedRouter()
decision = router.classify(image_bytes, "What brand is the TV?")
print(decision.technique)  # Technique.PRESERVE
```

## Intended Use

This model is designed for:
- Routing image analysis queries to optimal compression techniques
- Reducing token usage in vision-language model applications
- Enabling cost-effective image understanding at scale

## Limitations

- English language only
- Optimized for common image understanding queries
- May not generalize well to domain-specific terminology

## Citation

```bibtex
@misc{headroom-technique-router,
  title={Technique Router for Image Token Optimization},
  author={Headroom AI},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/chopratejas/technique-router}
}
```