---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
- text-classification
- domain-classification
- phi-3
- lora
- peft
- api-routing
- llm-routing
language:
- en
metrics:
- accuracy
- f1
library_name: peft
pipeline_tag: text-classification
datasets:
- custom
widget:
- text: "Write a Python function to calculate factorial"
  example_title: "Coding Query"
- text: "Generate an OpenAPI specification for a user management API"
  example_title: "API Generation"
- text: "What is quantum mechanics?"
  example_title: "Science Query"
- text: "Analyze sales data to find trends"
  example_title: "Data Analysis"
- text: "Write a poem about the ocean"
  example_title: "Creative Content"
---
# Phi-3 Domain Classifier for Intelligent API Routing
**🎯 96.5% Accuracy | 15 Domain Categories | Production-Ready**
A fine-tuned Phi-3-mini model for classifying user queries into specific domains, enabling intelligent routing to specialized LLM providers in API management systems.
## 🚀 Key Features
- ✅ **High Accuracy**: 96.5% on test set
- ✅ **Fast Inference**: ~35-45ms per query
- ✅ **Lightweight**: Only ~100MB LoRA adapters
- ✅ **15 Domains**: Comprehensive coverage
- ✅ **Production-Ready**: Battle-tested on real queries
## 📊 Performance Metrics
| Metric | Score |
|--------|-------|
| **Accuracy** | 96.50% |
| **F1 Score (Weighted)** | 0.9649 |
| **F1 Score (Macro)** | 0.9679 |
| **Precision (Macro)** | 0.97 |
| **Recall (Macro)** | 0.97 |
### Per-Domain Performance
| Domain | Precision | Recall | F1-Score |
|--------|-----------|--------|----------|
| coding | 0.86 | 0.92 | 0.89 |
| api_generation | 1.00 | 0.90 | 0.95 |
| mathematics | 1.00 | 1.00 | 1.00 |
| data_analysis | 0.92 | 1.00 | 0.96 |
| science | 1.00 | 1.00 | 1.00 |
| medicine | 0.93 | 1.00 | 0.96 |
| business | 0.88 | 1.00 | 0.93 |
| law | 0.91 | 1.00 | 0.95 |
| technology | 1.00 | 1.00 | 1.00 |
| literature | 1.00 | 1.00 | 1.00 |
| creative_content | 1.00 | 1.00 | 1.00 |
| education | 1.00 | 0.93 | 0.96 |
| general_knowledge | 1.00 | 0.84 | 0.91 |
| ambiguous | 1.00 | 1.00 | 1.00 |
| sensitive | 1.00 | 1.00 | 1.00 |
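As a sanity check, the macro scores in the summary table can be approximately reproduced from the per-domain rows. The sketch below assumes "macro" means the unweighted mean over the 15 domains; the variable and function names are illustrative. Because the rows are rounded to two decimals, the recomputed macro F1 (about 0.967) only approximately matches the reported 0.9679.

```python
# Per-domain (precision, recall, F1) values copied from the table above.
per_domain = {
    "coding": (0.86, 0.92, 0.89),
    "api_generation": (1.00, 0.90, 0.95),
    "mathematics": (1.00, 1.00, 1.00),
    "data_analysis": (0.92, 1.00, 0.96),
    "science": (1.00, 1.00, 1.00),
    "medicine": (0.93, 1.00, 0.96),
    "business": (0.88, 1.00, 0.93),
    "law": (0.91, 1.00, 0.95),
    "technology": (1.00, 1.00, 1.00),
    "literature": (1.00, 1.00, 1.00),
    "creative_content": (1.00, 1.00, 1.00),
    "education": (1.00, 0.93, 0.96),
    "general_knowledge": (1.00, 0.84, 0.91),
    "ambiguous": (1.00, 1.00, 1.00),
    "sensitive": (1.00, 1.00, 1.00),
}


def macro_average(rows, index):
    """Unweighted mean of one metric column across all domains."""
    values = [row[index] for row in rows.values()]
    return sum(values) / len(values)


macro_precision = macro_average(per_domain, 0)
macro_recall = macro_average(per_domain, 1)
macro_f1 = macro_average(per_domain, 2)

print(f"precision={macro_precision:.4f} recall={macro_recall:.4f} f1={macro_f1:.4f}")
```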
## 🎯 Supported Domains
1. **coding** - Programming, algorithms, code generation
2. **api_generation** - OpenAPI specs, API design, REST/GraphQL
3. **mathematics** - Math problems, equations, calculations
4. **data_analysis** - Data science, statistics, analysis
5. **science** - Physics, chemistry, biology, scientific concepts
6. **medicine** - Medical queries, health information
7. **business** - Business strategy, finance, management
8. **law** - Legal questions, regulations, compliance
9. **technology** - Tech concepts, hardware, software
10. **literature** - Books, writing, literary analysis
11. **creative_content** - Creative writing, poetry, storytelling
12. **education** - Teaching, learning, academic topics
13. **general_knowledge** - General Q&A, trivia
14. **ambiguous** - Unclear or multi-domain queries
15. **sensitive** - Sensitive topics requiring careful handling
## 🔧 Usage
### Basic Classification
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

# Load the base model, then attach the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/phi3-domain-classifier"
)
tokenizer = AutoTokenizer.from_pretrained(
    "YOUR_USERNAME/phi3-domain-classifier",
    trust_remote_code=True
)

# Configure for inference
model.config.use_cache = False
model.eval()


# Classify a query
def classify_domain(query):
    """Return the model's JSON classification for a single query as a dict."""
    messages = [
        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
        {"role": "user", "content": f"Classify this query: {query}"}
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=100,
            temperature=0.1,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            use_cache=False
        )

    # Decode only the newly generated tokens, skipping the prompt
    response = tokenizer.decode(
        outputs[0][inputs.shape[-1]:],
        skip_special_tokens=True
    )
    # Raises json.JSONDecodeError if the model output is not valid JSON
    return json.loads(response)


# Example
result = classify_domain("Write a Python function to calculate factorial")
print(result)
# Output: {"primary_domain": "coding", "confidence": "high"}
```
### API Router Integration
```python
class DomainClassifier:
    """Thin wrapper around classify_domain() from Basic Classification.

    Assumption: this class is not defined elsewhere in this card; it is
    sketched here only so the router below has something to call.
    """

    def classify(self, query):
        return classify_domain(query)


class SmartAPIRouter:
    """Route queries to specialized LLM providers"""

    def __init__(self):
        self.classifier = DomainClassifier()
        self.provider_mapping = {
            "coding": "anthropic",            # Claude for code
            "api_generation": "anthropic",    # Claude for APIs
            "mathematics": "anthropic",       # Claude for math
            "creative_content": "openai",     # GPT-4 for creativity
            "general_knowledge": "openai",    # GPT-4 for general Q&A
            # ... customize as needed
        }

    def route(self, query):
        result = self.classifier.classify(query)
        domain = result["primary_domain"]
        # Domains without an explicit mapping fall back to "openai"
        provider = self.provider_mapping.get(domain, "openai")
        return {
            "domain": domain,
            "routed_to": provider,
            "confidence": result["confidence"]
        }


# Usage
router = SmartAPIRouter()
routing_info = router.route("Explain quantum entanglement")
# Routes to appropriate LLM provider based on domain
```
## 📦 Model Details
### Architecture
- **Base Model**: microsoft/Phi-3-mini-4k-instruct (3.8B parameters)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **LoRA Rank**: 32
- **LoRA Alpha**: 64
- **Target Modules**: qkv_proj, o_proj, gate_up_proj, down_proj
- **Trainable Parameters**: ~100M (2.6% of total)
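
For reference, the number of LoRA parameters implied by the rank and target modules above can be estimated with simple arithmetic. This is a rough sketch, not the author's own accounting: it assumes the published Phi-3-mini dimensions (hidden size 3072, MLP intermediate size 8192, 32 decoder layers), LoRA applied to every layer, and no bias terms. Under those assumptions the count comes to about 50M parameters, or roughly 100 MB at 2 bytes per parameter, which lines up with the adapter size quoted in Key Features; the ~100M figure listed above may reflect a different configuration or counting convention.

```python
# Assumed Phi-3-mini dimensions (not stated in this card).
HIDDEN = 3072
INTERMEDIATE = 8192
NUM_LAYERS = 32
RANK = 32  # LoRA rank from the list above

# (in_features, out_features) of each targeted linear layer.
# qkv_proj and gate_up_proj are fused projections in Phi-3.
target_modules = {
    "qkv_proj": (HIDDEN, 3 * HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_up_proj": (HIDDEN, 2 * INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) next to each weight matrix."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in target_modules.values())
total = per_layer * NUM_LAYERS
size_mb_16bit = total * 2 / 1e6  # 2 bytes per parameter in BF16/FP16

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters total:     {total:,}")
print(f"Approx. size at 16-bit:    {size_mb_16bit:.1f} MB")
```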
### Training Configuration
- **Epochs**: 15
- **Batch Size**: 4 (per device)
- **Gradient Accumulation**: 8 steps (effective batch size: 32)
- **Learning Rate**: 5e-5
- **LR Schedule**: Cosine with 5% warmup
- **Optimizer**: AdamW (fused)
- **Precision**: BF16
- **Label Smoothing**: 0.1
- **Gradient Clipping**: 0.5
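
To make the learning-rate schedule concrete, here is a minimal sketch of linear warmup followed by cosine decay, using the peak rate and warmup fraction listed above. The exact scheduler used in training is not shown in this card, so the formula is an assumption modelled on the common warmup-plus-cosine pattern, and `total_steps = 1000` is a hypothetical value since the dataset size is not given.

```python
import math

PEAK_LR = 5e-5          # learning rate from the list above
WARMUP_FRACTION = 0.05  # 5% warmup from the list above
EFFECTIVE_BATCH = 4 * 8  # per-device batch size x gradient accumulation steps


def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_fraction=WARMUP_FRACTION):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical number of optimizer steps
for step in (0, 25, 50, 525, 1000):
    print(f"step {step:>4}: lr = {lr_at(step, total_steps):.2e}")
```

The rate rises linearly during the first 5% of steps, reaches the peak at the end of warmup, and then decays smoothly toward zero.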
### Training Hardware
- **GPU**: NVIDIA A40 (48GB VRAM)
- **Training Time**: ~7 hours
- **Framework**: PyTorch 2.0+ with Transformers
### Training Data
- **Dataset**: Custom dataset of domain-labeled queries
- **Train/Val/Test Split**: 70/15/15
- **Domains**: 15 categories
- **Format**: Instruction-following with JSON output
## 🎯 Use Cases
### 1. Intelligent API Gateway
Route user queries to the most appropriate LLM provider based on domain expertise.
### 2. Multi-LLM Orchestration
Distribute workload across multiple LLM providers based on their strengths.
### 3. Cost Optimization
Route simple queries to cheaper models, complex queries to premium providers.
### 4. Query Analytics
Analyze and categorize user query patterns for insights.
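
One way to approach this is to aggregate stored classification results into a per-domain share. The sketch below assumes each result is a dict with a `primary_domain` key, as in the example output under Usage; the function name and the sample log are hypothetical.

```python
from collections import Counter


def domain_distribution(results):
    """Return the share of queries per domain, most frequent first."""
    counts = Counter(r["primary_domain"] for r in results)
    total = sum(counts.values())
    return {domain: count / total for domain, count in counts.most_common()}


# Hypothetical log of classifier outputs
logged_results = [
    {"primary_domain": "coding", "confidence": "high"},
    {"primary_domain": "coding", "confidence": "high"},
    {"primary_domain": "law", "confidence": "high"},
    {"primary_domain": "science", "confidence": "high"},
]

for domain, share in domain_distribution(logged_results).items():
    print(f"{domain}: {share:.0%}")
```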
### 5. Content Moderation
Identify sensitive or ambiguous queries for special handling.
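
A minimal sketch of such special handling follows. It flags a query for review when the predicted domain is `sensitive` or `ambiguous`, or when the confidence is anything other than `"high"`. The card only shows `"high"` as a confidence value, so treating every other value as low confidence is an assumption, and `needs_review` is a hypothetical helper name.

```python
REVIEW_DOMAINS = {"sensitive", "ambiguous"}


def needs_review(result):
    """Return True if a classification result should get special handling."""
    domain = result.get("primary_domain")
    # Assumption: any confidence other than "high" is treated as uncertain.
    confident = result.get("confidence") == "high"
    return domain in REVIEW_DOMAINS or not confident


print(needs_review({"primary_domain": "sensitive", "confidence": "high"}))
print(needs_review({"primary_domain": "coding", "confidence": "high"}))
```

A gateway could call this before routing and send flagged queries to a stricter policy or a human reviewer instead of a provider.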
## 🔒 Limitations
- **Language**: Optimized for English queries only
- **Context Length**: Limited to 4K tokens (Phi-3-mini constraint)
- **Domain Coverage**: Fixed 15 domains; custom domains require retraining
- **Ambiguous Queries**: May struggle with highly ambiguous or multi-domain queries
- **JSON Output**: The model is expected to return a JSON object; `json.loads` raises an error on malformed output, so callers should handle parse failures
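
The JSON limitation can be softened with defensive parsing on the caller's side. The sketch below is one possible approach, not part of the released model: it pulls the first flat `{...}` object out of the raw text, checks the label against the 15 supported domains, and otherwise falls back to `ambiguous`. The helper name, the fallback value, and the `"low"` confidence string are assumptions.

```python
import json
import re

DOMAINS = {
    "coding", "api_generation", "mathematics", "data_analysis", "science",
    "medicine", "business", "law", "technology", "literature",
    "creative_content", "education", "general_knowledge", "ambiguous",
    "sensitive",
}

# Assumed fallback; "low" is not a confidence value documented in this card.
FALLBACK = {"primary_domain": "ambiguous", "confidence": "low"}


def parse_classification(raw_text):
    """Parse model output into a dict, falling back when it is malformed."""
    # Assumes a flat JSON object (no nested braces), as in the example output.
    match = re.search(r"\{.*?\}", raw_text, flags=re.DOTALL)
    if match is None:
        return dict(FALLBACK)
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if not isinstance(parsed, dict) or parsed.get("primary_domain") not in DOMAINS:
        return dict(FALLBACK)
    return parsed


print(parse_classification('Sure! {"primary_domain": "coding", "confidence": "high"}'))
print(parse_classification("I think this is about law"))
```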
## ⚖️ Ethical Considerations
- **Bias**: Model may inherit biases from training data
- **Sensitive Content**: Has dedicated "sensitive" category but should not replace human review
- **Privacy**: No personal data used in training; user queries not logged by model
- **Transparency**: Classification decisions are explainable through domain labels
## 📄 License
MIT License - Free for commercial and non-commercial use
## 🙏 Acknowledgments
- Base model: Microsoft Phi-3 team
- Fine-tuning: HuggingFace PEFT library
- Training infrastructure: NVIDIA A40 GPU
## 📚 Citation
If you use this model in your research or application, please cite:
```bibtex
@misc{phi3-domain-classifier,
author = {Your Name},
title = {Phi-3 Domain Classifier for Intelligent API Routing},
year = {2024},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/YOUR_USERNAME/phi3-domain-classifier}},
}
```
## 📞 Contact
For questions, issues, or collaboration:
- **HuggingFace**: [@YOUR_USERNAME](https://huggingface.co/YOUR_USERNAME)
- **GitHub**: [ovindumandith](https://github.com/ovindumandith)
- **Email**: your.email@example.com
## 🔄 Version History
- **v1.0** (2024-12-09): Initial release
  - 96.5% accuracy on 15-domain classification
  - Production-ready LoRA adapter
  - Optimized for API routing use cases
---
**Built using Phi-3 and PEFT**