Phi-3 Domain Classifier for Intelligent API Routing

🎯 96.5% Accuracy | 15 Domain Categories | Production-Ready

A fine-tuned Phi-3-mini model that classifies user queries into one of 15 domains, so an API management system can route each query to a specialized LLM provider.

🚀 Key Features

✅ High Accuracy: 96.5% on test set
✅ Fast Inference: ~35-45ms per query
✅ Lightweight: Only ~100MB LoRA adapters
✅ 15 Domains: Comprehensive coverage
✅ Production-Ready: Battle-tested on real queries

📊 Performance Metrics

Metric	Score
Accuracy	96.50%
F1 Score (Weighted)	0.9649
F1 Score (Macro)	0.9679
Precision (Macro)	0.97
Recall (Macro)	0.97

Per-Domain Performance

Domain	Precision	Recall	F1-Score
coding	0.86	0.92	0.89
api_generation	1.00	0.90	0.95
mathematics	1.00	1.00	1.00
data_analysis	0.92	1.00	0.96
science	1.00	1.00	1.00
medicine	0.93	1.00	0.96
business	0.88	1.00	0.93
law	0.91	1.00	0.95
technology	1.00	1.00	1.00
literature	1.00	1.00	1.00
creative_content	1.00	1.00	1.00
education	1.00	0.93	0.96
general_knowledge	1.00	0.84	0.91
ambiguous	1.00	1.00	1.00
sensitive	1.00	1.00	1.00
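
The macro-averaged figures in the summary table are unweighted means over the 15 domains. As a rough check, the sketch below recomputes them from the rounded per-domain values in the table above. The variable and function names are illustrative only, and because the inputs are rounded to two decimals the results only approximate the reported scores.

```python
# Per-domain (precision, recall, f1), copied from the table above.
per_domain = {
    "coding": (0.86, 0.92, 0.89),
    "api_generation": (1.00, 0.90, 0.95),
    "mathematics": (1.00, 1.00, 1.00),
    "data_analysis": (0.92, 1.00, 0.96),
    "science": (1.00, 1.00, 1.00),
    "medicine": (0.93, 1.00, 0.96),
    "business": (0.88, 1.00, 0.93),
    "law": (0.91, 1.00, 0.95),
    "technology": (1.00, 1.00, 1.00),
    "literature": (1.00, 1.00, 1.00),
    "creative_content": (1.00, 1.00, 1.00),
    "education": (1.00, 0.93, 0.96),
    "general_knowledge": (1.00, 0.84, 0.91),
    "ambiguous": (1.00, 1.00, 1.00),
    "sensitive": (1.00, 1.00, 1.00),
}


def macro_average(rows, column):
    """Unweighted mean of one column across all domains."""
    values = [row[column] for row in rows.values()]
    return sum(values) / len(values)


macro_precision = macro_average(per_domain, 0)
macro_recall = macro_average(per_domain, 1)
macro_f1 = macro_average(per_domain, 2)

print(f"precision={macro_precision:.3f} recall={macro_recall:.3f} f1={macro_f1:.3f}")
```

Every domain counts equally in a macro average, so the weaker rows (coding, general_knowledge) pull the score down regardless of how many test queries each domain had.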

🎯 Supported Domains

coding - Programming, algorithms, code generation
api_generation - OpenAPI specs, API design, REST/GraphQL
mathematics - Math problems, equations, calculations
data_analysis - Data science, statistics, analysis
science - Physics, chemistry, biology, scientific concepts
medicine - Medical queries, health information
business - Business strategy, finance, management
law - Legal questions, regulations, compliance
technology - Tech concepts, hardware, software
literature - Books, writing, literary analysis
creative_content - Creative writing, poetry, storytelling
education - Teaching, learning, academic topics
general_knowledge - General Q&A, trivia
ambiguous - Unclear or multi-domain queries
sensitive - Sensitive topics requiring careful handling

🔧 Usage

Basic Classification

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/phi3-domain-classifier"
)

tokenizer = AutoTokenizer.from_pretrained(
    "YOUR_USERNAME/phi3-domain-classifier",
    trust_remote_code=True
)

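# Configure for inference. use_cache=False turns off the key/value cache, which
# makes generation slower; enable it if your transformers version supports it with Phi-3.
model.config.use_cache = False
model.eval()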

# Classify a query
def classify_domain(query):
    messages = [
        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
        {"role": "user", "content": f"Classify this query: {query}"}
    ]
    
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=100,
            temperature=0.1,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            use_cache=False
        )
    
    response = tokenizer.decode(
        outputs[0][inputs.shape[-1]:], 
        skip_special_tokens=True
    )
    
    return json.loads(response)

# Example
result = classify_domain("Write a Python function to calculate factorial")
print(result)
# Output: {"primary_domain": "coding", "confidence": "high"}

API Router Integration

class SmartAPIRouter:
    """Route queries to specialized LLM providers"""
    
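    def __init__(self, classify_fn=classify_domain):
        # classify_fn: any callable that returns a dict with "primary_domain"
        # and "confidence"; defaults to classify_domain from Basic Classification
        self.classify_fn = classify_fn
        self.provider_mapping = {
            "coding": "anthropic",           # Claude for code
            "api_generation": "anthropic",   # Claude for APIs
            "mathematics": "anthropic",      # Claude for math
            "creative_content": "openai",    # GPT-4 for creativity
            "general_knowledge": "openai",   # GPT-4 for general Q&A
            # ... customize as needed
        }
    
    def route(self, query):
        # Domains missing from the mapping fall back to "openai"
        result = self.classify_fn(query)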
        domain = result["primary_domain"]
        provider = self.provider_mapping.get(domain, "openai")
        
        return {
            "domain": domain,
            "routed_to": provider,
            "confidence": result["confidence"]
        }

# Usage
router = SmartAPIRouter()
routing_info = router.route("Explain quantum entanglement")
# Routes to appropriate LLM provider based on domain

📦 Model Details

Architecture

Base Model: microsoft/Phi-3-mini-4k-instruct (3.8B parameters)
Fine-tuning Method: LoRA (Low-Rank Adaptation)
LoRA Rank: 32
LoRA Alpha: 64
Target Modules: qkv_proj, o_proj, gate_up_proj, down_proj
Trainable Parameters: ~100M (2.6% of total)
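
As a rough sanity check on adapter size, LoRA parameters can be counted from the rank and the shapes of the adapted layers. The sketch below is an estimate under stated assumptions, not the author's accounting. The layer dimensions (hidden size 3072, MLP size 8192, 32 layers) come from the published Phi-3-mini configuration rather than from this card, and the count covers only the four target modules listed above.

```python
# Assumed Phi-3-mini shapes (from the public model config, not from this card).
HIDDEN = 3072        # hidden size
INTERMEDIATE = 8192  # MLP inner size
NUM_LAYERS = 32
RANK = 32            # LoRA rank listed above

# (in_features, out_features) of each adapted linear layer.
target_shapes = {
    "qkv_proj": (HIDDEN, 3 * HIDDEN),            # fused Q, K, V projection
    "o_proj": (HIDDEN, HIDDEN),
    "gate_up_proj": (HIDDEN, 2 * INTERMEDIATE),  # fused gate and up projection
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) to each adapted layer."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in target_shapes.values())
total = per_layer * NUM_LAYERS
size_mb_bf16 = total * 2 / 1e6  # 2 bytes per BF16 weight

print(f"{total:,} LoRA parameters, ~{size_mb_bf16:.0f} MB in BF16")
```

Under these assumptions the count is about 50M parameters, or roughly 100 MB in BF16, which is in line with the adapter size quoted in Key Features. It is lower than the ~100M trainable-parameter figure listed above, so treat both numbers as approximate.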

Training Configuration

Epochs: 15
Batch Size: 4 (per device)
Gradient Accumulation: 8 steps (effective batch size: 32)
Learning Rate: 5e-5
LR Schedule: Cosine with 5% warmup
Optimizer: AdamW (fused)
Precision: BF16
Label Smoothing: 0.1
Gradient Clipping: 0.5
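
To make the batch-size and schedule settings concrete, the sketch below computes the effective batch size and the learning rate at a few steps. It assumes linear warmup followed by cosine decay to zero, a common reading of "cosine with warmup"; the card does not state the exact variant. `total_steps` is a hypothetical value, since the real one depends on dataset size.

```python
import math

PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 8
BASE_LR = 5e-5
WARMUP_FRACTION = 0.05

# Samples contributing to each optimizer step on a single device.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS


def lr_at(step, total_steps):
    """Linear warmup to BASE_LR, then cosine decay to zero (assumed variant)."""
    warmup_steps = int(WARMUP_FRACTION * total_steps)
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; the real value depends on dataset size
print("effective batch size:", effective_batch)
for step in (0, 25, 50, 525, 1000):
    print(step, f"{lr_at(step, total_steps):.2e}")
```

With 1000 steps, warmup covers the first 50, the rate peaks at 5e-5 at step 50, and it falls back to half the peak midway through the decay phase.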

Training Hardware

GPU: NVIDIA A40 (48GB VRAM)
Training Time: ~7 hours
Framework: PyTorch 2.0+ with Transformers

Training Data

Dataset: Custom collection of domain-labeled queries
Train/Val/Test Split: 70/15/15
Domains: 15 categories
Format: Instruction-following with JSON output

🎯 Use Cases

1. Intelligent API Gateway

Route user queries to the most appropriate LLM provider based on domain expertise.

2. Multi-LLM Orchestration

Distribute workload across multiple LLM providers based on their strengths.

3. Cost Optimization

Route simple queries to cheaper models, complex queries to premium providers.

4. Query Analytics

Analyze and categorize user query patterns for insights.

5. Content Moderation

Identify sensitive or ambiguous queries for special handling.
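
Use cases 4 and 5 can be sketched together: tally the predicted domains for analytics, and hold sensitive or ambiguous queries back from automatic routing. The policy, the action names, and the sample results below are hypothetical. The "low" confidence value is also an assumption, since the card only shows "high".

```python
from collections import Counter

REVIEW_DOMAINS = {"sensitive", "ambiguous"}  # labels from Supported Domains


def next_action(result):
    """Decide how to handle one classifier result (hypothetical policy)."""
    if result["primary_domain"] in REVIEW_DOMAINS:
        return "hold_for_review"
    if result.get("confidence") == "low":  # "low" is an assumed confidence value
        return "route_to_default"
    return "route_by_domain"


# Hypothetical classifier outputs, for illustration only.
results = [
    {"primary_domain": "coding", "confidence": "high"},
    {"primary_domain": "sensitive", "confidence": "high"},
    {"primary_domain": "law", "confidence": "low"},
    {"primary_domain": "coding", "confidence": "high"},
]

actions = [next_action(r) for r in results]
domain_counts = Counter(r["primary_domain"] for r in results)

print(actions)
print(domain_counts.most_common())
```

Keeping the gate separate from the provider mapping means the review policy can change without touching the routing table.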

🔒 Limitations

Language: Optimized for English queries only
Context Length: Limited to 4K tokens (a constraint of the Phi-3-mini-4k-instruct base model)
Domain Coverage: Fixed 15 domains; custom domains require retraining
Ambiguous Queries: May struggle with highly ambiguous or multi-domain queries
JSON Output: Expects structured JSON response; parsing may fail on malformed output
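
One way to soften the JSON limitation is to parse defensively and fall back to a safe label when the output is malformed. The sketch below is a minimal example and not part of the released model. `parse_classifier_output` and the fallback record are hypothetical, and treating unparseable output as "ambiguous" with "low" confidence is an assumption.

```python
import json
import re

# Assumed fallback; "low" is not a confidence value documented in this card.
FALLBACK = {"primary_domain": "ambiguous", "confidence": "low"}


def parse_classifier_output(text):
    """Return the classifier's JSON object, or a copy of FALLBACK if none parses."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Look for the first flat {...} object embedded in surrounding text.
        match = re.search(r"\{.*?\}", text, flags=re.DOTALL)
        if match is None:
            return dict(FALLBACK)
        try:
            parsed = json.loads(match.group(0))
        except json.JSONDecodeError:
            return dict(FALLBACK)
    if not isinstance(parsed, dict) or "primary_domain" not in parsed:
        return dict(FALLBACK)
    return parsed


print(parse_classifier_output('{"primary_domain": "coding", "confidence": "high"}'))
print(parse_classifier_output('Result: {"primary_domain": "law", "confidence": "high"}'))
print(parse_classifier_output("not json at all"))
```

The regular expression only recovers flat objects like the one shown in the usage example; nested JSON would need a proper parser.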

⚖️ Ethical Considerations

Bias: Model may inherit biases from training data
Sensitive Content: Has dedicated "sensitive" category but should not replace human review
Privacy: No personal data used in training; user queries not logged by model
Transparency: Classification decisions are explainable through domain labels

📄 License

MIT License - Free for commercial and non-commercial use

🙏 Acknowledgments

Base model: Microsoft Phi-3 team
Fine-tuning: HuggingFace PEFT library
Training infrastructure: NVIDIA A40 GPU

📚 Citation

If you use this model in your research or application, please cite:

@misc{phi3-domain-classifier,
  author = {Your Name},
  title = {Phi-3 Domain Classifier for Intelligent API Routing},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/YOUR_USERNAME/phi3-domain-classifier}},
}

📞 Contact

For questions, issues, or collaboration:

HuggingFace: @YOUR_USERNAME
GitHub: https://github.com/ovindumandith
Email: your.email@example.com

🔄 Version History

v1.0 (2024-12-09): Initial release
- 96.5% accuracy on 15-domain classification
- Production-ready LoRA adapter
- Optimized for API routing use cases

Built using Phi-3 and PEFT
