|
|
--- |
|
|
license: mit |
|
|
base_model: microsoft/Phi-3-mini-4k-instruct |
|
|
tags: |
|
|
- text-classification |
|
|
- domain-classification |
|
|
- phi-3 |
|
|
- lora |
|
|
- peft |
|
|
- api-routing |
|
|
- llm-routing |
|
|
language: |
|
|
- en |
|
|
metrics: |
|
|
- accuracy |
|
|
- f1 |
|
|
library_name: peft |
|
|
pipeline_tag: text-classification |
|
|
datasets: |
|
|
- custom |
|
|
widget: |
|
|
- text: "Write a Python function to calculate factorial" |
|
|
example_title: "Coding Query" |
|
|
- text: "Generate an OpenAPI specification for a user management API" |
|
|
example_title: "API Generation" |
|
|
- text: "What is quantum mechanics?" |
|
|
example_title: "Science Query" |
|
|
- text: "Analyze sales data to find trends" |
|
|
example_title: "Data Analysis" |
|
|
- text: "Write a poem about the ocean" |
|
|
example_title: "Creative Content" |
|
|
--- |
|
|
|
|
|
# Phi-3 Domain Classifier for Intelligent API Routing |
|
|
|
|
|
**96.5% Accuracy | 15 Domain Categories | Production-Ready**
|
|
|
|
|
A fine-tuned Phi-3-mini model for classifying user queries into specific domains, enabling intelligent routing to specialized LLM providers in API management systems. |
|
|
|
|
|
## Key Features
|
|
|
|
|
- **High Accuracy**: 96.5% on test set
- **Fast Inference**: ~35-45ms per query
- **Lightweight**: Only ~100MB LoRA adapters
- **15 Domains**: Comprehensive coverage
- **Production-Ready**: Battle-tested on real queries
|
|
|
|
|
## Performance Metrics
|
|
|
|
|
| Metric | Score | |
|
|
|--------|-------| |
|
|
| **Accuracy** | 96.50% | |
|
|
| **F1 Score (Weighted)** | 0.9649 | |
|
|
| **F1 Score (Macro)** | 0.9679 | |
|
|
| **Precision (Macro)** | 0.97 | |
|
|
| **Recall (Macro)** | 0.97 | |
|
|
|
|
|
### Per-Domain Performance |
|
|
|
|
|
| Domain | Precision | Recall | F1-Score | |
|
|
|--------|-----------|--------|----------| |
|
|
| coding | 0.86 | 0.92 | 0.89 | |
|
|
| api_generation | 1.00 | 0.90 | 0.95 | |
|
|
| mathematics | 1.00 | 1.00 | 1.00 | |
|
|
| data_analysis | 0.92 | 1.00 | 0.96 | |
|
|
| science | 1.00 | 1.00 | 1.00 | |
|
|
| medicine | 0.93 | 1.00 | 0.96 | |
|
|
| business | 0.88 | 1.00 | 0.93 | |
|
|
| law | 0.91 | 1.00 | 0.95 | |
|
|
| technology | 1.00 | 1.00 | 1.00 | |
|
|
| literature | 1.00 | 1.00 | 1.00 | |
|
|
| creative_content | 1.00 | 1.00 | 1.00 | |
|
|
| education | 1.00 | 0.93 | 0.96 | |
|
|
| general_knowledge | 1.00 | 0.84 | 0.91 | |
|
|
| ambiguous | 1.00 | 1.00 | 1.00 | |
|
|
| sensitive | 1.00 | 1.00 | 1.00 | |
|
|
|
|
|
## Supported Domains
|
|
|
|
|
1. **coding** - Programming, algorithms, code generation |
|
|
2. **api_generation** - OpenAPI specs, API design, REST/GraphQL |
|
|
3. **mathematics** - Math problems, equations, calculations |
|
|
4. **data_analysis** - Data science, statistics, analysis |
|
|
5. **science** - Physics, chemistry, biology, scientific concepts |
|
|
6. **medicine** - Medical queries, health information |
|
|
7. **business** - Business strategy, finance, management |
|
|
8. **law** - Legal questions, regulations, compliance |
|
|
9. **technology** - Tech concepts, hardware, software |
|
|
10. **literature** - Books, writing, literary analysis |
|
|
11. **creative_content** - Creative writing, poetry, storytelling |
|
|
12. **education** - Teaching, learning, academic topics |
|
|
13. **general_knowledge** - General Q&A, trivia |
|
|
14. **ambiguous** - Unclear or multi-domain queries |
|
|
15. **sensitive** - Sensitive topics requiring careful handling |
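
For downstream validation, for example checking that the model's JSON output names a known domain, the full label set can be kept as a constant. A minimal sketch; the `DOMAINS` and `is_valid_domain` names are illustrative, not part of this repository:

```python
# All 15 domain labels emitted by the classifier (illustrative constant name).
DOMAINS = {
    "coding", "api_generation", "mathematics", "data_analysis", "science",
    "medicine", "business", "law", "technology", "literature",
    "creative_content", "education", "general_knowledge", "ambiguous", "sensitive",
}

def is_valid_domain(label: str) -> bool:
    """Return True if a classifier output names one of the 15 supported domains."""
    return label in DOMAINS
```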
|
|
|
|
|
## Usage
|
|
|
|
|
### Basic Classification |
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
from peft import PeftModel |
|
|
import torch |
|
|
import json |
|
|
|
|
|
# Load model |
|
|
base_model = AutoModelForCausalLM.from_pretrained( |
|
|
"microsoft/Phi-3-mini-4k-instruct", |
|
|
torch_dtype=torch.bfloat16, |
|
|
device_map="auto", |
|
|
trust_remote_code=True |
|
|
) |
|
|
|
|
|
model = PeftModel.from_pretrained( |
|
|
base_model, |
|
|
"YOUR_USERNAME/phi3-domain-classifier" |
|
|
) |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained( |
|
|
"YOUR_USERNAME/phi3-domain-classifier", |
|
|
trust_remote_code=True |
|
|
) |
|
|
|
|
|
# Configure for inference |
|
|
model.config.use_cache = False |
|
|
model.eval() |
|
|
|
|
|
# Classify a query |
|
|
def classify_domain(query): |
|
|
messages = [ |
|
|
{"role": "system", "content": "You are a domain classifier. Respond with JSON."}, |
|
|
{"role": "user", "content": f"Classify this query: {query}"} |
|
|
] |
|
|
|
|
|
inputs = tokenizer.apply_chat_template( |
|
|
messages, |
|
|
add_generation_prompt=True, |
|
|
return_tensors="pt" |
|
|
).to(model.device) |
|
|
|
|
|
with torch.no_grad(): |
|
|
outputs = model.generate( |
|
|
inputs, |
|
|
max_new_tokens=100, |
|
|
temperature=0.1, |
|
|
do_sample=True, |
|
|
pad_token_id=tokenizer.pad_token_id, |
|
|
eos_token_id=tokenizer.eos_token_id, |
|
|
use_cache=False |
|
|
) |
|
|
|
|
|
response = tokenizer.decode( |
|
|
outputs[0][inputs.shape[-1]:], |
|
|
skip_special_tokens=True |
|
|
) |
|
|
|
|
|
return json.loads(response) |
|
|
|
|
|
# Example |
|
|
result = classify_domain("Write a Python function to calculate factorial") |
|
|
print(result) |
|
|
# Output: {"primary_domain": "coding", "confidence": "high"} |
|
|
``` |
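
Because the model returns free-form text that is only expected to be JSON (see Limitations below), production code should guard the parse. A minimal sketch building on `classify_domain` above; the `safe_classify` name and the `ambiguous` fallback are illustrative choices, not part of the model:

```python
def safe_classify(query):
    """Classify with a guarded JSON parse, falling back to 'ambiguous' on failure."""
    try:
        return classify_domain(query)
    except json.JSONDecodeError:
        # Malformed model output: treat as ambiguous rather than crash the router.
        return {"primary_domain": "ambiguous", "confidence": "low"}
```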
|
|
|
|
|
### API Router Integration |
|
|
```python |
|
|
# Thin wrapper around classify_domain() from the previous example, so this
# sketch runs end to end (assumes that code has already been executed).
class DomainClassifier:
    def classify(self, query):
        return classify_domain(query)


class SmartAPIRouter:
|
|
"""Route queries to specialized LLM providers""" |
|
|
|
|
|
def __init__(self): |
|
|
self.classifier = DomainClassifier() |
|
|
self.provider_mapping = { |
|
|
"coding": "anthropic", # Claude for code |
|
|
"api_generation": "anthropic", # Claude for APIs |
|
|
"mathematics": "anthropic", # Claude for math |
|
|
"creative_content": "openai", # GPT-4 for creativity |
|
|
"general_knowledge": "openai", # GPT-4 for general Q&A |
|
|
# ... customize as needed |
|
|
} |
|
|
|
|
|
def route(self, query): |
|
|
result = self.classifier.classify(query) |
|
|
domain = result["primary_domain"] |
|
|
provider = self.provider_mapping.get(domain, "openai") |
|
|
|
|
|
return { |
|
|
"domain": domain, |
|
|
"routed_to": provider, |
|
|
"confidence": result["confidence"] |
|
|
} |
|
|
|
|
|
# Usage |
|
|
router = SmartAPIRouter() |
|
|
routing_info = router.route("Explain quantum entanglement") |
|
|
# Routes to appropriate LLM provider based on domain |
|
|
``` |
|
|
|
|
|
## Model Details
|
|
|
|
|
### Architecture |
|
|
|
|
|
- **Base Model**: microsoft/Phi-3-mini-4k-instruct (3.8B parameters) |
|
|
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) |
|
|
- **LoRA Rank**: 32 |
|
|
- **LoRA Alpha**: 64 |
|
|
- **Target Modules**: qkv_proj, o_proj, gate_up_proj, down_proj |
|
|
- **Trainable Parameters**: ~100M (2.6% of total) |
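
The adapter settings above correspond to a standard PEFT `LoraConfig`. A minimal sketch of how they would be declared; the dropout value is an assumption, as it is not stated in this card:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                     # LoRA rank
    lora_alpha=64,            # scaling factor
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    lora_dropout=0.05,        # assumption: not stated in this card
    task_type="CAUSAL_LM",
)
```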
|
|
|
|
|
### Training Configuration |
|
|
|
|
|
- **Epochs**: 15 |
|
|
- **Batch Size**: 4 (per device) |
|
|
- **Gradient Accumulation**: 8 steps (effective batch size: 32) |
|
|
- **Learning Rate**: 5e-5 |
|
|
- **LR Schedule**: Cosine with 5% warmup |
|
|
- **Optimizer**: AdamW (fused) |
|
|
- **Precision**: BF16 |
|
|
- **Label Smoothing**: 0.1 |
|
|
- **Gradient Clipping**: 0.5 |
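
These hyperparameters map directly onto HuggingFace `TrainingArguments`. A minimal sketch under that assumption; the output path is illustrative:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./phi3-domain-classifier",  # illustrative path
    num_train_epochs=15,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,          # effective batch size: 32
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                      # 5% warmup
    optim="adamw_torch_fused",
    bf16=True,
    label_smoothing_factor=0.1,
    max_grad_norm=0.5,                      # gradient clipping at 0.5
)
```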
|
|
|
|
|
### Training Hardware |
|
|
|
|
|
- **GPU**: NVIDIA A40 (48GB VRAM) |
|
|
- **Training Time**: ~7 hours |
|
|
- **Framework**: PyTorch 2.0+ with Transformers |
|
|
|
|
|
### Training Data |
|
|
|
|
|
- **Data**: Custom dataset of domain-labeled queries
|
|
- **Train/Val/Test Split**: 70/15/15 |
|
|
- **Domains**: 15 categories |
|
|
- **Format**: Instruction-following with JSON output |
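
The raw samples are not published, but given the instruction-following format with JSON output, an individual training example plausibly looks like the following. This is a hypothetical illustration, not a record from the actual dataset:

```python
# Hypothetical training sample (the real dataset is custom and not published).
sample = {
    "messages": [
        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
        {"role": "user", "content": "Classify this query: Write a Python function to calculate factorial"},
        {"role": "assistant", "content": '{"primary_domain": "coding", "confidence": "high"}'},
    ]
}
```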
|
|
|
|
|
## Use Cases
|
|
|
|
|
### 1. Intelligent API Gateway |
|
|
Route user queries to the most appropriate LLM provider based on domain expertise. |
|
|
|
|
|
### 2. Multi-LLM Orchestration |
|
|
Distribute workload across multiple LLM providers based on their strengths. |
|
|
|
|
|
### 3. Cost Optimization |
|
|
Route simple queries to cheaper models, complex queries to premium providers. |
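
Because the classifier also returns a confidence field, the cost decision can key on it. A minimal sketch, assuming the `safe_classify` helper from the Usage section; the tier names are illustrative:

```python
def route_by_cost(query):
    """Send low-stakes queries to a cheaper tier, everything else to premium."""
    result = safe_classify(query)
    cheap = (result["primary_domain"] in {"general_knowledge", "ambiguous"}
             or result["confidence"] == "low")
    return {
        "domain": result["primary_domain"],
        "tier": "budget" if cheap else "premium",  # illustrative tier names
    }
```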
|
|
|
|
|
### 4. Query Analytics |
|
|
Analyze and categorize user query patterns for insights. |
|
|
|
|
|
### 5. Content Moderation |
|
|
Identify sensitive or ambiguous queries for special handling. |
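
A moderation gate built on the same output can flag these two categories before a query reaches any provider. A minimal sketch, again assuming `safe_classify` from above:

```python
def needs_special_handling(query):
    """Flag queries classified as sensitive or ambiguous for human review."""
    return safe_classify(query)["primary_domain"] in {"sensitive", "ambiguous"}
```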
|
|
|
|
|
## Limitations
|
|
|
|
|
- **Language**: Optimized for English queries only |
|
|
- **Context Length**: Limited to 4K tokens (Phi-3-mini constraint) |
|
|
- **Domain Coverage**: Fixed 15 domains; custom domains require retraining |
|
|
- **Ambiguous Queries**: May struggle with highly ambiguous or multi-domain queries |
|
|
- **JSON Output**: Expects structured JSON response; parsing may fail on malformed output |
|
|
|
|
|
## Ethical Considerations
|
|
|
|
|
- **Bias**: Model may inherit biases from training data |
|
|
- **Sensitive Content**: Includes a dedicated "sensitive" category, but it should not replace human review
|
|
- **Privacy**: No personal data used in training; user queries not logged by model |
|
|
- **Transparency**: Classification decisions are explainable through domain labels |
|
|
|
|
|
## License
|
|
|
|
|
MIT License - Free for commercial and non-commercial use |
|
|
|
|
|
## Acknowledgments
|
|
|
|
|
- Base model: Microsoft Phi-3 team |
|
|
- Fine-tuning: HuggingFace PEFT library |
|
|
- Training infrastructure: NVIDIA A40 GPU |
|
|
|
|
|
## Citation
|
|
|
|
|
If you use this model in your research or application, please cite: |
|
|
```bibtex |
|
|
@misc{phi3-domain-classifier, |
|
|
author = {Your Name}, |
|
|
title = {Phi-3 Domain Classifier for Intelligent API Routing}, |
|
|
year = {2024}, |
|
|
publisher = {HuggingFace}, |
|
|
howpublished = {\url{https://huggingface.co/YOUR_USERNAME/phi3-domain-classifier}}, |
|
|
} |
|
|
``` |
|
|
|
|
|
## Contact
|
|
|
|
|
For questions, issues, or collaboration: |
|
|
- **HuggingFace**: [@YOUR_USERNAME](https://huggingface.co/YOUR_USERNAME) |
|
|
- **GitHub**: [ovindumandith](https://github.com/ovindumandith)
|
|
- **Email**: your.email@example.com |
|
|
|
|
|
## Version History
|
|
|
|
|
- **v1.0** (2024-12-09): Initial release |
|
|
- 96.5% accuracy on 15-domain classification |
|
|
- Production-ready LoRA adapter |
|
|
- Optimized for API routing use cases |
|
|
|
|
|
--- |
|
|
|
|
|
**Built using Phi-3 and PEFT** |