	---
	license: mit
	base_model: microsoft/Phi-3-mini-4k-instruct
	tags:
	- text-classification
	- domain-classification
	- phi-3
	- lora
	- peft
	- api-routing
	- llm-routing
	language:
	- en
	metrics:
	- accuracy
	- f1
	library_name: peft
	pipeline_tag: text-classification
	datasets:
	- custom
	widget:
	- text: "Write a Python function to calculate factorial"
	  example_title: "Coding Query"
	- text: "Generate an OpenAPI specification for a user management API"
	  example_title: "API Generation"
	- text: "What is quantum mechanics?"
	  example_title: "Science Query"
	- text: "Analyze sales data to find trends"
	  example_title: "Data Analysis"
	- text: "Write a poem about the ocean"
	  example_title: "Creative Content"
	---

	# Phi-3 Domain Classifier for Intelligent API Routing

	🎯 96.5% Accuracy | 15 Domain Categories | Production-Ready

	A fine-tuned Phi-3-mini model for classifying user queries into specific domains, enabling intelligent routing to specialized LLM providers in API management systems.

	## 🚀 Key Features

	- ✅ High Accuracy: 96.5% on test set
	- ✅ Fast Inference: ~35-45ms per query
	- ✅ Lightweight: Only ~100MB LoRA adapters
	- ✅ 15 Domains: Comprehensive coverage
	- ✅ Production-Ready: Battle-tested on real queries

	## 📊 Performance Metrics

	| Metric | Score |
	|--------|-------|
	| Accuracy | 96.50% |
	| F1 Score (Weighted) | 0.9649 |
	| F1 Score (Macro) | 0.9679 |
	| Precision (Macro) | 0.97 |
	| Recall (Macro) | 0.97 |

	### Per-Domain Performance

	| Domain | Precision | Recall | F1-Score |
	|--------|-----------|--------|----------|
	| coding | 0.86 | 0.92 | 0.89 |
	| api_generation | 1.00 | 0.90 | 0.95 |
	| mathematics | 1.00 | 1.00 | 1.00 |
	| data_analysis | 0.92 | 1.00 | 0.96 |
	| science | 1.00 | 1.00 | 1.00 |
	| medicine | 0.93 | 1.00 | 0.96 |
	| business | 0.88 | 1.00 | 0.93 |
	| law | 0.91 | 1.00 | 0.95 |
	| technology | 1.00 | 1.00 | 1.00 |
	| literature | 1.00 | 1.00 | 1.00 |
	| creative_content | 1.00 | 1.00 | 1.00 |
	| education | 1.00 | 0.93 | 0.96 |
	| general_knowledge | 1.00 | 0.84 | 0.91 |
	| ambiguous | 1.00 | 1.00 | 1.00 |
	| sensitive | 1.00 | 1.00 | 1.00 |
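As a quick sanity check, the macro-averaged scores in the summary table can be approximately recomputed from the per-domain rows. The sketch below copies the rounded values from the table above, so it agrees with the reported figures only to about two decimal places. The variable and function names are illustrative.

```python
# (precision, recall, f1) per domain, copied from the per-domain table (rounded values)
per_domain = {
    "coding": (0.86, 0.92, 0.89),
    "api_generation": (1.00, 0.90, 0.95),
    "mathematics": (1.00, 1.00, 1.00),
    "data_analysis": (0.92, 1.00, 0.96),
    "science": (1.00, 1.00, 1.00),
    "medicine": (0.93, 1.00, 0.96),
    "business": (0.88, 1.00, 0.93),
    "law": (0.91, 1.00, 0.95),
    "technology": (1.00, 1.00, 1.00),
    "literature": (1.00, 1.00, 1.00),
    "creative_content": (1.00, 1.00, 1.00),
    "education": (1.00, 0.93, 0.96),
    "general_knowledge": (1.00, 0.84, 0.91),
    "ambiguous": (1.00, 1.00, 1.00),
    "sensitive": (1.00, 1.00, 1.00),
}


def macro_average(rows, index):
    """Unweighted mean of one metric column across all domains."""
    values = [row[index] for row in rows.values()]
    return sum(values) / len(values)


macro_precision = macro_average(per_domain, 0)
macro_recall = macro_average(per_domain, 1)
macro_f1 = macro_average(per_domain, 2)

print(f"precision={macro_precision:.2f} recall={macro_recall:.2f} f1={macro_f1:.2f}")
```

Macro averaging weights every domain equally, which is why weaker domains such as coding and general_knowledge pull the average down regardless of how many test samples they contain.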

	## 🎯 Supported Domains

	1. coding - Programming, algorithms, code generation
	2. api_generation - OpenAPI specs, API design, REST/GraphQL
	3. mathematics - Math problems, equations, calculations
	4. data_analysis - Data science, statistics, analysis
	5. science - Physics, chemistry, biology, scientific concepts
	6. medicine - Medical queries, health information
	7. business - Business strategy, finance, management
	8. law - Legal questions, regulations, compliance
	9. technology - Tech concepts, hardware, software
	10. literature - Books, writing, literary analysis
	11. creative_content - Creative writing, poetry, storytelling
	12. education - Teaching, learning, academic topics
	13. general_knowledge - General Q&A, trivia
	14. ambiguous - Unclear or multi-domain queries
	15. sensitive - Sensitive topics requiring careful handling

	## 🔧 Usage

	### Basic Classification
	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel
	import torch
	import json

	# Load the base model, then attach the LoRA adapter weights
	base_model = AutoModelForCausalLM.from_pretrained(
	    "microsoft/Phi-3-mini-4k-instruct",
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	    trust_remote_code=True
	)

	model = PeftModel.from_pretrained(
	    base_model,
	    "YOUR_USERNAME/phi3-domain-classifier"
	)

	tokenizer = AutoTokenizer.from_pretrained(
	    "YOUR_USERNAME/phi3-domain-classifier",
	    trust_remote_code=True
	)

	# Configure for inference
	model.config.use_cache = False
	model.eval()

	# Classify a query
	def classify_domain(query):
	    messages = [
	        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
	        {"role": "user", "content": f"Classify this query: {query}"}
	    ]

	    # Build the prompt with the model's chat template
	    inputs = tokenizer.apply_chat_template(
	        messages,
	        add_generation_prompt=True,
	        return_tensors="pt"
	    ).to(model.device)

	    with torch.no_grad():
	        outputs = model.generate(
	            inputs,
	            max_new_tokens=100,
	            temperature=0.1,
	            do_sample=True,
	            pad_token_id=tokenizer.pad_token_id,
	            eos_token_id=tokenizer.eos_token_id,
	            use_cache=False
	        )

	    # Decode only the newly generated tokens, skipping the prompt
	    response = tokenizer.decode(
	        outputs[0][inputs.shape[-1]:],
	        skip_special_tokens=True
	    )

	    # Raises json.JSONDecodeError if the model output is not valid JSON
	    return json.loads(response.strip())

	# Example
	result = classify_domain("Write a Python function to calculate factorial")
	print(result)
	# Example output: {'primary_domain': 'coding', 'confidence': 'high'}
	```

	### API Router Integration
	```python
	class DomainClassifier:
	    """Minimal wrapper (assumed interface) around classify_domain() from Basic Classification."""

	    def classify(self, query):
	        return classify_domain(query)


	class SmartAPIRouter:
	    """Route queries to specialized LLM providers"""

	    def __init__(self):
	        self.classifier = DomainClassifier()
	        self.provider_mapping = {
	            "coding": "anthropic",  # Claude for code
	            "api_generation": "anthropic",  # Claude for APIs
	            "mathematics": "anthropic",  # Claude for math
	            "creative_content": "openai",  # GPT-4 for creativity
	            "general_knowledge": "openai",  # GPT-4 for general Q&A
	            # ... customize as needed
	        }

	    def route(self, query):
	        result = self.classifier.classify(query)
	        domain = result["primary_domain"]
	        # Domains without an explicit mapping fall back to "openai"
	        provider = self.provider_mapping.get(domain, "openai")

	        return {
	            "domain": domain,
	            "routed_to": provider,
	            "confidence": result["confidence"]
	        }

	# Usage
	router = SmartAPIRouter()
	routing_info = router.route("Explain quantum entanglement")
	# Routes to appropriate LLM provider based on domain
	```

	## 📦 Model Details

	### Architecture

	- Base Model: microsoft/Phi-3-mini-4k-instruct (3.8B parameters)
	- Fine-tuning Method: LoRA (Low-Rank Adaptation)
	- LoRA Rank: 32
	- LoRA Alpha: 64
	- Target Modules: qkv_proj, o_proj, gate_up_proj, down_proj
	- Trainable Parameters: ~100M (2.6% of total)
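The adapter size can be cross-checked with a back-of-the-envelope count. A LoRA pair on a linear layer with `in_features` inputs and `out_features` outputs adds `rank * (in_features + out_features)` parameters. The sketch below assumes Phi-3-mini dimensions that this card does not state (hidden size 3072, MLP intermediate size 8192, 32 layers, fused `qkv_proj` and `gate_up_proj`); verify them against the base model's `config.json` before relying on the result.

```python
# Assumed Phi-3-mini dimensions (not stated in this card; check the base model's config.json)
HIDDEN = 3072
INTERMEDIATE = 8192
NUM_LAYERS = 32
RANK = 32  # LoRA rank from the architecture list

# (in_features, out_features) of each targeted projection,
# assuming fused query/key/value and fused gate/up layers
target_shapes = {
    "qkv_proj": (HIDDEN, 3 * HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_up_proj": (HIDDEN, 2 * INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """A LoRA pair adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in target_shapes.values())
total = per_layer * NUM_LAYERS
size_mb_bf16 = total * 2 / 1e6  # 2 bytes per parameter in BF16

print(f"{per_layer:,} per layer, {total:,} total, ~{size_mb_bf16:.0f} MB in BF16")
```

Under these assumptions the estimate comes to roughly 50M adapter parameters, or about 100 MB in BF16. That is in line with the adapter size quoted in Key Features but lower than the trainable-parameter figure listed above, so treat both numbers as approximate.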

	### Training Configuration

	- Epochs: 15
	- Batch Size: 4 (per device)
	- Gradient Accumulation: 8 steps (effective batch size: 32)
	- Learning Rate: 5e-5
	- LR Schedule: Cosine with 5% warmup
	- Optimizer: AdamW (fused)
	- Precision: BF16
	- Label Smoothing: 0.1
	- Gradient Clipping: 0.5
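For readers who want to reproduce a similar setup, the hyperparameters above map fairly directly onto `peft.LoraConfig` and `transformers.TrainingArguments`. This is a configuration sketch, not the author's training script: `output_dir` is a hypothetical path, `task_type` is an assumption, and settings the card does not mention (such as LoRA dropout) are left at library defaults.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the Architecture section
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    task_type="CAUSAL_LM",  # assumption: the adapter is trained as a causal LM
)

# Training settings from the Training Configuration section
training_args = TrainingArguments(
    output_dir="./phi3-domain-classifier",  # hypothetical path
    num_train_epochs=15,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # 4 x 8 = effective batch size of 32 on one GPU
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    optim="adamw_torch_fused",
    bf16=True,  # requires hardware with BF16 support
    label_smoothing_factor=0.1,
    max_grad_norm=0.5,
)
```

Argument names can shift between library releases, so check them against the installed versions of PEFT and Transformers.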

	### Training Hardware

	- GPU: NVIDIA A40 (48GB VRAM)
	- Training Time: ~7 hours
	- Framework: PyTorch 2.0+ with Transformers

	### Training Data

	- Dataset: Custom collection of domain-labeled queries
	- Train/Val/Test Split: 70/15/15
	- Domains: 15 categories
	- Format: Instruction-following with JSON output

	## 🎯 Use Cases

	### 1. Intelligent API Gateway
	Route user queries to the most appropriate LLM provider based on domain expertise.

	### 2. Multi-LLM Orchestration
	Distribute workload across multiple LLM providers based on their strengths.

	### 3. Cost Optimization
	Route simple queries to cheaper models, complex queries to premium providers.

	### 4. Query Analytics
	Analyze and categorize user query patterns for insights.

	### 5. Content Moderation
	Identify sensitive or ambiguous queries for special handling.
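One way to act on this is a small gate that decides whether a classification result should skip automatic routing. The sketch below assumes the result is a dict shaped like the usage example's output. Treating any confidence value other than "high" as uncertain is an assumption, since the card does not list the possible confidence values.

```python
# Domains that should bypass automatic routing (labels from the Supported Domains list)
REVIEW_DOMAINS = {"sensitive", "ambiguous"}


def needs_special_handling(classification):
    """Return True when a classification result should not be auto-routed.

    `classification` is assumed to look like
    {"primary_domain": "...", "confidence": "..."}.
    """
    domain = classification.get("primary_domain")
    confidence = classification.get("confidence")
    # Assumption: anything other than "high" confidence counts as uncertain
    return domain in REVIEW_DOMAINS or confidence != "high"


flagged = needs_special_handling({"primary_domain": "sensitive", "confidence": "high"})
routed = needs_special_handling({"primary_domain": "coding", "confidence": "high"})
print(flagged, routed)
```

Flagged queries could then be sent to a human review queue or a more conservative provider instead of the default route.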

	## 🔒 Limitations

	- Language: Optimized for English queries only
	- Context Length: Limited to 4K tokens (a constraint of the Phi-3-mini-4k-instruct base model)
	- Domain Coverage: Fixed 15 domains; custom domains require retraining
	- Ambiguous Queries: May struggle with highly ambiguous or multi-domain queries
	- JSON Output: Expects structured JSON response; parsing may fail on malformed output
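The JSON parsing limitation can be softened with a defensive parser that tolerates extra text around the JSON object and falls back to a default when parsing fails. This is a minimal sketch: the fallback label and the "low" confidence value are assumptions, and the regular expression only handles flat (non-nested) objects like the one shown in the usage example.

```python
import json
import re

# Assumed fallback: treat unparseable output as an ambiguous, low-confidence result
FALLBACK = {"primary_domain": "ambiguous", "confidence": "low"}


def parse_classification(raw_text):
    """Extract the first flat {...} object from model output, or return the fallback."""
    match = re.search(r"\{.*?\}", raw_text, flags=re.DOTALL)
    if match is None:
        return dict(FALLBACK)
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if "primary_domain" not in parsed:
        return dict(FALLBACK)
    return parsed


print(parse_classification('Sure! {"primary_domain": "coding", "confidence": "high"}'))
print(parse_classification("I cannot classify this."))
```

In the usage example, `json.loads(response)` could be swapped for `parse_classification(response)` so that a malformed generation degrades to the fallback instead of raising an exception.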

	## ⚖️ Ethical Considerations

	- Bias: Model may inherit biases from training data
	- Sensitive Content: Has dedicated "sensitive" category but should not replace human review
	- Privacy: No personal data used in training; user queries not logged by model
	- Transparency: Classification decisions are explainable through domain labels

	## 📄 License

	MIT License - Free for commercial and non-commercial use

	## 🙏 Acknowledgments

	- Base model: Microsoft Phi-3 team
	- Fine-tuning: HuggingFace PEFT library
	- Training infrastructure: NVIDIA A40 GPU

	## 📚 Citation

	If you use this model in your research or application, please cite:
	```bibtex
	@misc{phi3-domain-classifier,
	author = {Your Name},
	title = {Phi-3 Domain Classifier for Intelligent API Routing},
	year = {2024},
	publisher = {HuggingFace},
	howpublished = {\url{https://huggingface.co/YOUR_USERNAME/phi3-domain-classifier}},
	}
	```

	## 📞 Contact

	For questions, issues, or collaboration:
	- HuggingFace: [@YOUR_USERNAME](https://huggingface.co/YOUR_USERNAME)
	- GitHub: [ovindumandith](https://github.com/ovindumandith)
	- Email: your.email@example.com

	## 🔄 Version History

	- v1.0 (2024-12-09): Initial release
	  - 96.5% accuracy on 15-domain classification
	  - Production-ready LoRA adapter
	  - Optimized for API routing use cases

	---

	Built using Phi-3 and PEFT