---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
- text-classification
- domain-classification
- phi-3
- lora
- peft
- api-routing
- llm-routing
language:
- en
metrics:
- accuracy
- f1
library_name: peft
pipeline_tag: text-classification
datasets:
- custom
widget:
- text: "Write a Python function to calculate factorial"
  example_title: "Coding Query"
- text: "Generate an OpenAPI specification for a user management API"
  example_title: "API Generation"
- text: "What is quantum mechanics?"
  example_title: "Science Query"
- text: "Analyze sales data to find trends"
  example_title: "Data Analysis"
- text: "Write a poem about the ocean"
  example_title: "Creative Content"
---

# Phi-3 Domain Classifier for Intelligent API Routing

**🎯 96.5% Accuracy | 15 Domain Categories | Production-Ready**

A fine-tuned Phi-3-mini model for classifying user queries into specific domains, enabling intelligent routing to specialized LLM providers in API management systems.

## 🚀 Key Features

- ✅ **High Accuracy**: 96.5% on the held-out test set
- ✅ **Fast Inference**: ~35-45ms per query
- ✅ **Lightweight**: Only ~100MB of LoRA adapters
- ✅ **15 Domains**: Comprehensive coverage
- ✅ **Production-Ready**: Battle-tested on real queries

## 📊 Performance Metrics

| Metric | Score |
|--------|-------|
| **Accuracy** | 96.50% |
| **F1 Score (Weighted)** | 0.9649 |
| **F1 Score (Macro)** | 0.9679 |
| **Precision (Macro)** | 0.97 |
| **Recall (Macro)** | 0.97 |

### Per-Domain Performance

| Domain | Precision | Recall | F1-Score |
|--------|-----------|--------|----------|
| coding | 0.86 | 0.92 | 0.89 |
| api_generation | 1.00 | 0.90 | 0.95 |
| mathematics | 1.00 | 1.00 | 1.00 |
| data_analysis | 0.92 | 1.00 | 0.96 |
| science | 1.00 | 1.00 | 1.00 |
| medicine | 0.93 | 1.00 | 0.96 |
| business | 0.88 | 1.00 | 0.93 |
| law | 0.91 | 1.00 | 0.95 |
| technology | 1.00 | 1.00 | 1.00 |
| literature | 1.00 | 1.00 | 1.00 |
| creative_content | 1.00 | 1.00 | 1.00 |
| education | 1.00 | 0.93 | 0.96 |
| general_knowledge | 1.00 | 0.84 | 0.91 |
| ambiguous | 1.00 | 1.00 | 1.00 |
| sensitive | 1.00 | 1.00 | 1.00 |

## 🎯 Supported Domains

1. **coding** - Programming, algorithms, code generation
2. **api_generation** - OpenAPI specs, API design, REST/GraphQL
3. **mathematics** - Math problems, equations, calculations
4. **data_analysis** - Data science, statistics, analysis
5. **science** - Physics, chemistry, biology, scientific concepts
6. **medicine** - Medical queries, health information
7. **business** - Business strategy, finance, management
8. **law** - Legal questions, regulations, compliance
9. **technology** - Tech concepts, hardware, software
10. **literature** - Books, writing, literary analysis
11. **creative_content** - Creative writing, poetry, storytelling
12. **education** - Teaching, learning, academic topics
13. **general_knowledge** - General Q&A, trivia
14. **ambiguous** - Unclear or multi-domain queries
15. **sensitive** - Sensitive topics requiring careful handling
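
For downstream routing code, the full label set can be kept as a constant to validate classifier output against (a convenience sketch; only the label list itself comes from this card):

```python
# The 15 domain labels supported by this classifier.
DOMAINS = [
    "coding", "api_generation", "mathematics", "data_analysis",
    "science", "medicine", "business", "law", "technology",
    "literature", "creative_content", "education",
    "general_knowledge", "ambiguous", "sensitive",
]
```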

## 🔧 Usage

### Basic Classification
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/phi3-domain-classifier"
)

tokenizer = AutoTokenizer.from_pretrained(
    "YOUR_USERNAME/phi3-domain-classifier",
    trust_remote_code=True
)

# Configure for inference
model.config.use_cache = False
model.eval()

# Classify a query
def classify_domain(query):
    messages = [
        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
        {"role": "user", "content": f"Classify this query: {query}"}
    ]
    
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=100,
            temperature=0.1,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            use_cache=False
        )
    
    response = tokenizer.decode(
        outputs[0][inputs.shape[-1]:], 
        skip_special_tokens=True
    )
    
    return json.loads(response)

# Example
result = classify_domain("Write a Python function to calculate factorial")
print(result)
# Output: {"primary_domain": "coding", "confidence": "high"}
```

### API Router Integration
```python
class DomainClassifier:
    """Thin wrapper around classify_domain() from the previous example."""

    def classify(self, query):
        return classify_domain(query)


class SmartAPIRouter:
    """Route queries to specialized LLM providers"""
    
    def __init__(self):
        self.classifier = DomainClassifier()
        self.provider_mapping = {
            "coding": "anthropic",           # Claude for code
            "api_generation": "anthropic",   # Claude for APIs
            "mathematics": "anthropic",      # Claude for math
            "creative_content": "openai",    # GPT-4 for creativity
            "general_knowledge": "openai",   # GPT-4 for general Q&A
            # ... customize as needed
        }
    
    def route(self, query):
        result = self.classifier.classify(query)
        domain = result["primary_domain"]
        provider = self.provider_mapping.get(domain, "openai")
        
        return {
            "domain": domain,
            "routed_to": provider,
            "confidence": result["confidence"]
        }

# Usage
router = SmartAPIRouter()
routing_info = router.route("Explain quantum entanglement")
print(routing_info)
# e.g. {"domain": "science", "routed_to": "openai", "confidence": "high"}
# ("science" is not in provider_mapping, so it falls back to the default)
```

## 📦 Model Details

### Architecture

- **Base Model**: microsoft/Phi-3-mini-4k-instruct (3.8B parameters)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **LoRA Rank**: 32
- **LoRA Alpha**: 64
- **Target Modules**: qkv_proj, o_proj, gate_up_proj, down_proj
- **Trainable Parameters**: ~100M (2.6% of total)
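
Expressed as a PEFT `LoraConfig`, these hyperparameters would look roughly like the sketch below (illustrative only; `lora_dropout` and `bias` are assumptions, as the card does not state them):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                     # LoRA rank, as listed above
    lora_alpha=64,            # LoRA alpha, as listed above
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    lora_dropout=0.05,        # assumption: not stated in this card
    bias="none",              # assumption: not stated in this card
    task_type="CAUSAL_LM",
)
```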

### Training Configuration

- **Epochs**: 15
- **Batch Size**: 4 (per device)
- **Gradient Accumulation**: 8 steps (effective batch size: 32)
- **Learning Rate**: 5e-5
- **LR Schedule**: Cosine with 5% warmup
- **Optimizer**: AdamW (fused)
- **Precision**: BF16
- **Label Smoothing**: 0.1
- **Gradient Clipping**: 0.5
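
For reference, a `transformers.TrainingArguments` sketch matching these settings (illustrative; the actual training script is not published with this card):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi3-domain-classifier",  # assumption: output path
    num_train_epochs=15,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,        # effective batch size 32
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                    # 5% warmup
    optim="adamw_torch_fused",
    bf16=True,
    label_smoothing_factor=0.1,
    max_grad_norm=0.5,                    # gradient clipping
)
```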

### Training Hardware

- **GPU**: NVIDIA A40 (48GB VRAM)
- **Training Time**: ~7 hours
- **Framework**: PyTorch 2.0+ with Transformers

### Training Data

- **Dataset**: Custom dataset of domain-labeled queries
- **Train/Val/Test Split**: 70/15/15
- **Domains**: 15 categories
- **Format**: Instruction-following with JSON output
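
A hypothetical example of one training record in this instruction-following format (the exact schema is an assumption, modeled on the chat template used in the usage example above):

```python
sample = {
    "messages": [
        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
        {"role": "user", "content": "Classify this query: Write a Python function to calculate factorial"},
        {"role": "assistant", "content": '{"primary_domain": "coding", "confidence": "high"}'},
    ]
}
```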

## 🎯 Use Cases

### 1. Intelligent API Gateway
Route user queries to the most appropriate LLM provider based on domain expertise.

### 2. Multi-LLM Orchestration
Distribute workload across multiple LLM providers based on their strengths.

### 3. Cost Optimization
Route simple queries to cheaper models, complex queries to premium providers.
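
A minimal sketch of this idea, assuming hypothetical model tiers (the names below are placeholders, not recommendations from this card):

```python
# Hypothetical cost tiers: cheap default, premium for specialized domains.
PREMIUM_DOMAINS = {"coding", "api_generation", "mathematics", "law", "medicine"}

def pick_tier(domain: str) -> str:
    return "premium-model" if domain in PREMIUM_DOMAINS else "budget-model"
```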

### 4. Query Analytics
Analyze and categorize user query patterns for insights.

### 5. Content Moderation
Identify sensitive or ambiguous queries for special handling.

## 🔒 Limitations

- **Language**: Optimized for English queries only
- **Context Length**: Limited to 4K tokens (Phi-3-mini constraint)
- **Domain Coverage**: Fixed 15 domains; custom domains require retraining
- **Ambiguous Queries**: May struggle with highly ambiguous or multi-domain queries
- **JSON Output**: Expects structured JSON response; parsing may fail on malformed output
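
A defensive parsing sketch for the last point (an assumption-level fallback, not part of the model's API):

```python
import json
import re

def safe_parse(response: str) -> dict:
    """Best-effort extraction of the JSON object from model output.

    Falls back to a low-confidence 'ambiguous' label when the model
    emits malformed JSON or extra text around it.
    """
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return {"primary_domain": "ambiguous", "confidence": "low"}
```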

## βš–οΈ Ethical Considerations

- **Bias**: Model may inherit biases from training data
- **Sensitive Content**: Has dedicated "sensitive" category but should not replace human review
- **Privacy**: No personal data used in training; user queries not logged by model
- **Transparency**: Classification decisions are explainable through domain labels

## 📄 License

MIT License - Free for commercial and non-commercial use

## πŸ™ Acknowledgments

- Base model: Microsoft Phi-3 team
- Fine-tuning: HuggingFace PEFT library
- Training infrastructure: NVIDIA A40 GPU

## 📚 Citation

If you use this model in your research or application, please cite:
```bibtex
@misc{phi3-domain-classifier,
  author = {Your Name},
  title = {Phi-3 Domain Classifier for Intelligent API Routing},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/YOUR_USERNAME/phi3-domain-classifier}},
}
```

## 📞 Contact

For questions, issues, or collaboration:
- **HuggingFace**: [@YOUR_USERNAME](https://huggingface.co/YOUR_USERNAME)
- **GitHub**: [ovindumandith](https://github.com/ovindumandith)
- **Email**: your.email@example.com

## 🔄 Version History

- **v1.0** (2024-12-09): Initial release
  - 96.5% accuracy on 15-domain classification
  - Production-ready LoRA adapter
  - Optimized for API routing use cases

---

**Built using Phi-3 and PEFT**