ovinduG committed (verified) · commit be0469a · 1 parent: 296b3a4

Update README.md

Files changed (1): README.md (+273, −176)

---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
- text-classification
- domain-classification
- phi-3
- lora
- peft
- api-routing
- llm-routing
language:
- en
metrics:
- accuracy
- f1
library_name: peft
pipeline_tag: text-classification
datasets:
- custom
widget:
- text: "Write a Python function to calculate factorial"
  example_title: "Coding Query"
- text: "Generate an OpenAPI specification for a user management API"
  example_title: "API Generation"
- text: "What is quantum mechanics?"
  example_title: "Science Query"
- text: "Analyze sales data to find trends"
  example_title: "Data Analysis"
- text: "Write a poem about the ocean"
  example_title: "Creative Content"
---

# Phi-3 Domain Classifier for Intelligent API Routing

**🎯 96.5% Accuracy | 15 Domain Categories | Production-Ready**

A fine-tuned Phi-3-mini model for classifying user queries into specific domains, enabling intelligent routing to specialized LLM providers in API management systems.

## 🚀 Key Features

- ✅ **High Accuracy**: 96.5% on the test set
- ✅ **Fast Inference**: ~35–45 ms per query
- ✅ **Lightweight**: only ~100 MB of LoRA adapter weights
- ✅ **15 Domains**: comprehensive coverage
- ✅ **Production-Ready**: battle-tested on real queries

## 📊 Performance Metrics

| Metric | Score |
|--------|-------|
| **Accuracy** | 96.50% |
| **F1 Score (Weighted)** | 0.9649 |
| **F1 Score (Macro)** | 0.9679 |
| **Precision (Macro)** | 0.97 |
| **Recall (Macro)** | 0.97 |

### Per-Domain Performance

| Domain | Precision | Recall | F1-Score |
|--------|-----------|--------|----------|
| coding | 0.86 | 0.92 | 0.89 |
| api_generation | 1.00 | 0.90 | 0.95 |
| mathematics | 1.00 | 1.00 | 1.00 |
| data_analysis | 0.92 | 1.00 | 0.96 |
| science | 1.00 | 1.00 | 1.00 |
| medicine | 0.93 | 1.00 | 0.96 |
| business | 0.88 | 1.00 | 0.93 |
| law | 0.91 | 1.00 | 0.95 |
| technology | 1.00 | 1.00 | 1.00 |
| literature | 1.00 | 1.00 | 1.00 |
| creative_content | 1.00 | 1.00 | 1.00 |
| education | 1.00 | 0.93 | 0.96 |
| general_knowledge | 1.00 | 0.84 | 0.91 |
| ambiguous | 1.00 | 1.00 | 1.00 |
| sensitive | 1.00 | 1.00 | 1.00 |
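
The per-domain table follows scikit-learn's classification-report layout. As a rough sketch of how these numbers can be reproduced (assuming you have collected gold labels and model predictions for the test split; `y_true` and `y_pred` below are illustrative stand-ins):

```python
from sklearn.metrics import accuracy_score, classification_report, f1_score

# Illustrative stand-ins; in practice these come from running the
# classifier over the held-out test split.
y_true = ["coding", "science", "law"]
y_pred = ["coding", "science", "business"]

print(f"Accuracy: {accuracy_score(y_true, y_pred):.4f}")
print(f"F1 (weighted): {f1_score(y_true, y_pred, average='weighted'):.4f}")
print(f"F1 (macro): {f1_score(y_true, y_pred, average='macro'):.4f}")
print(classification_report(y_true, y_pred))  # per-domain precision/recall/F1
```
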

## 🎯 Supported Domains

1. **coding** - Programming, algorithms, code generation
2. **api_generation** - OpenAPI specs, API design, REST/GraphQL
3. **mathematics** - Math problems, equations, calculations
4. **data_analysis** - Data science, statistics, analysis
5. **science** - Physics, chemistry, biology, scientific concepts
6. **medicine** - Medical queries, health information
7. **business** - Business strategy, finance, management
8. **law** - Legal questions, regulations, compliance
9. **technology** - Tech concepts, hardware, software
10. **literature** - Books, writing, literary analysis
11. **creative_content** - Creative writing, poetry, storytelling
12. **education** - Teaching, learning, academic topics
13. **general_knowledge** - General Q&A, trivia
14. **ambiguous** - Unclear or multi-domain queries
15. **sensitive** - Sensitive topics requiring careful handling
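
Because the label set is fixed, downstream code can validate classifier output before acting on it. A minimal sketch (the `DOMAINS` constant and helper are illustrative, not shipped with the adapter):

```python
# Mirrors the 15 domains listed above.
DOMAINS = frozenset({
    "coding", "api_generation", "mathematics", "data_analysis",
    "science", "medicine", "business", "law", "technology",
    "literature", "creative_content", "education",
    "general_knowledge", "ambiguous", "sensitive",
})

def is_valid_domain(label: str) -> bool:
    """Guard against malformed model output before routing on it."""
    return label in DOMAINS
```
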

## 🔧 Usage

### Basic Classification

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

# Load the base model and attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/phi3-domain-classifier"
)

tokenizer = AutoTokenizer.from_pretrained(
    "YOUR_USERNAME/phi3-domain-classifier",
    trust_remote_code=True
)

# Configure for inference
model.config.use_cache = False
model.eval()

# Classify a query
def classify_domain(query):
    messages = [
        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
        {"role": "user", "content": f"Classify this query: {query}"}
    ]

    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=100,
            temperature=0.1,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            use_cache=False
        )

    # Decode only the newly generated tokens, skipping the prompt
    response = tokenizer.decode(
        outputs[0][inputs.shape[-1]:],
        skip_special_tokens=True
    )

    return json.loads(response.strip())

# Example
result = classify_domain("Write a Python function to calculate factorial")
print(result)
# Output: {"primary_domain": "coding", "confidence": "high"}
```
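
The model is expected to emit a JSON object like `{"primary_domain": "coding", "confidence": "high"}`, but as noted under Limitations, generation can occasionally produce malformed JSON. A defensive wrapper along these lines may help (the `safe_classify` name and fallback policy are illustrative; `DOMAINS` is the constant sketched earlier):

```python
# Illustrative fallback policy; not part of the model itself.
FALLBACK = {"primary_domain": "ambiguous", "confidence": "low"}

def safe_classify(query):
    """Call classify_domain() but degrade gracefully on bad output."""
    try:
        result = classify_domain(query)
    except json.JSONDecodeError:
        return FALLBACK  # the model emitted non-JSON text
    if result.get("primary_domain") not in DOMAINS:
        return FALLBACK  # JSON parsed, but the label is outside the fixed set
    return result
```
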

### API Router Integration

```python
class SmartAPIRouter:
    """Route queries to specialized LLM providers."""

    def __init__(self):
        self.provider_mapping = {
            "coding": "anthropic",           # Claude for code
            "api_generation": "anthropic",   # Claude for APIs
            "mathematics": "anthropic",      # Claude for math
            "creative_content": "openai",    # GPT-4 for creativity
            "general_knowledge": "openai",   # GPT-4 for general Q&A
            # ... customize as needed
        }

    def route(self, query):
        # classify_domain() is defined in the Basic Classification snippet above
        result = classify_domain(query)
        domain = result["primary_domain"]
        provider = self.provider_mapping.get(domain, "openai")

        return {
            "domain": domain,
            "routed_to": provider,
            "confidence": result["confidence"]
        }

# Usage
router = SmartAPIRouter()
routing_info = router.route("Explain quantum entanglement")
# Routes to the appropriate LLM provider based on the detected domain
```
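
Note the design choice in `route()`: `provider_mapping.get(domain, "openai")` gives every unmapped domain a default provider, so the router keeps working even when the classifier returns `ambiguous` or a domain you have not configured yet. Calling the `safe_classify` wrapper sketched above instead of `classify_domain` would additionally keep a single malformed generation from crashing the gateway.
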

## 📦 Model Details

### Architecture

- **Base Model**: microsoft/Phi-3-mini-4k-instruct (3.8B parameters)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **LoRA Rank**: 32
- **LoRA Alpha**: 64
- **Target Modules**: qkv_proj, o_proj, gate_up_proj, down_proj
- **Trainable Parameters**: ~100M (2.6% of total)
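
For reference, the adapter configuration above corresponds roughly to the following PEFT setup (a sketch; the dropout value and task type are assumptions, as this card does not state them):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    lora_dropout=0.05,      # assumed; not stated in this card
    task_type="CAUSAL_LM",  # assumed
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # compare with the ~2.6% quoted above
```
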

### Training Configuration

- **Epochs**: 15
- **Batch Size**: 4 (per device)
- **Gradient Accumulation**: 8 steps (effective batch size: 32)
- **Learning Rate**: 5e-5
- **LR Schedule**: Cosine with 5% warmup
- **Optimizer**: AdamW (fused)
- **Precision**: BF16
- **Label Smoothing**: 0.1
- **Gradient Clipping**: 0.5
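
These hyperparameters map onto Hugging Face `TrainingArguments` roughly as follows (a sketch; the output path is illustrative and logging/eval settings are omitted):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi3-domain-classifier",  # illustrative path
    num_train_epochs=15,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,        # effective batch size 32
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                    # 5% warmup
    optim="adamw_torch_fused",
    bf16=True,
    label_smoothing_factor=0.1,
    max_grad_norm=0.5,
)
```
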

### Training Hardware

- **GPU**: NVIDIA A40 (48GB VRAM)
- **Training Time**: ~7 hours
- **Framework**: PyTorch 2.0+ with Transformers
 
### Training Data

- **Dataset**: Custom collection of domain-labeled queries
- **Train/Val/Test Split**: 70/15/15
- **Domains**: 15 categories
- **Format**: Instruction-following with JSON output (see the example below)
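
Based on this format and the chat template shown under Usage, a single training record plausibly looks like the following (an illustration; the exact field names are assumptions):

```python
# Hypothetical training record mirroring the inference-time chat template.
sample = {
    "messages": [
        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
        {"role": "user", "content": "Classify this query: Write a Python function to calculate factorial"},
        {"role": "assistant", "content": '{"primary_domain": "coding", "confidence": "high"}'},
    ]
}
```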
 

## 🎯 Use Cases

### 1. Intelligent API Gateway
Route user queries to the most appropriate LLM provider based on domain expertise.

### 2. Multi-LLM Orchestration
Distribute workload across multiple LLM providers based on their strengths.

### 3. Cost Optimization
Route simple queries to cheaper models, complex queries to premium providers.

### 4. Query Analytics
Analyze and categorize user query patterns for insights.

### 5. Content Moderation
Identify sensitive or ambiguous queries for special handling.
 

## 🔒 Limitations

- **Language**: Optimized for English queries only
- **Context Length**: Limited to 4K tokens (a Phi-3-mini constraint)
- **Domain Coverage**: Fixed set of 15 domains; custom domains require retraining
- **Ambiguous Queries**: May struggle with highly ambiguous or multi-domain queries
- **JSON Output**: Returns a structured JSON response; parsing may fail on malformed output (see the `safe_classify` sketch under Usage)
 

## ⚖️ Ethical Considerations

- **Bias**: The model may inherit biases from its training data
- **Sensitive Content**: Has a dedicated "sensitive" category, but it should not replace human review
- **Privacy**: No personal data was used in training; the model does not log user queries
- **Transparency**: Classification decisions are explainable through domain labels
 

## 📄 License

MIT License - free for commercial and non-commercial use.

## 🙏 Acknowledgments

- Base model: Microsoft Phi-3 team
- Fine-tuning: Hugging Face PEFT library
- Training infrastructure: NVIDIA A40 GPU

## 📚 Citation

If you use this model in your research or application, please cite:

```bibtex
@misc{phi3-domain-classifier,
  author       = {Your Name},
  title        = {Phi-3 Domain Classifier for Intelligent API Routing},
  year         = {2024},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/YOUR_USERNAME/phi3-domain-classifier}},
}
```
 

## 📞 Contact

For questions, issues, or collaboration:
- **HuggingFace**: [@YOUR_USERNAME](https://huggingface.co/YOUR_USERNAME)
- **GitHub**: [ovindumandith](https://github.com/ovindumandith)
- **Email**: your.email@example.com
 

## 🔄 Version History

- **v1.0** (2024-12-09): Initial release
  - 96.5% accuracy on 15-domain classification
  - Production-ready LoRA adapter
  - Optimized for API routing use cases

---
 
**Built using Phi-3 and PEFT**