scionoftech committed on
Commit a57226e · verified · 1 Parent(s): 00825fb

Upload 10 files

Files changed (1): README.md (+342 −173)

README.md CHANGED

---
language:
- en
license: gemma
library_name: transformers
tags:
- function-calling
- agent-routing
- multi-agent
- lora
- peft
- gemma
- functiongemma
- customer-support
- e-commerce
base_model: google/functiongemma-270m-it
datasets:
- scionoftech/functiongemma-e-commerce-dataset
model-index:
- name: functiongemma-270m-ecommerce-router
  results:
  - task:
      type: text-classification
      name: Agent Routing
    dataset:
      name: E-commerce Customer Support Routing
      type: scionoftech/ecommerce-agent-routing
    metrics:
    - type: accuracy
      value: 89.4
      name: Routing Accuracy
    - type: f1
      value: 89.0
      name: Macro F1 Score
---

# FunctionGemma 270M - E-Commerce Multi-Agent Router

A fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) that routes customer queries across 7 specialized agents in e-commerce customer support systems.

## Model Description

This model demonstrates how FunctionGemma can be adapted beyond mobile actions for **multi-agent orchestration** in enterprise systems. It routes natural-language customer queries to the appropriate specialized agent with **89.4% accuracy**.

**Key Achievement:** Replacing brittle rule-based routing (52-58% accuracy) with a learned router that trains only 1.47M parameters (0.55% of the model).

### Architecture

- **Base Model:** google/functiongemma-270m-it (270M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Trainable Parameters:** 1,474,560 (0.55%)
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Target Modules:** q_proj, k_proj, v_proj, o_proj

### Training Details

- **Training Data:** 12,550 synthetic customer queries (balanced across 7 agents)
- **Training Time:** 45 minutes on a Google Colab T4 GPU
- **Framework:** Hugging Face Transformers + PEFT + TRL
- **Quantization:** 4-bit NF4 during training
- **Optimizer:** paged_adamw_8bit
- **Learning Rate:** 2e-4
- **Epochs:** 3
- **Batch Size:** 4 (effective 16 with gradient accumulation)

## Intended Use

### Primary Use Case

**Multi-agent customer support routing** for e-commerce platforms:

- Route queries to order management, product search, product details, returns, account, payment, and technical support agents
- Maintain conversation context across multi-turn interactions
- Enable intelligent task switching

### Supported Agents

The model routes queries to 7 specialized agents:

1. **Order Management** (`route_to_order_agent`) - Track orders, update delivery, cancel orders
2. **Product Search** (`route_to_search_agent`) - Search catalog, check availability, recommendations
3. **Product Details** (`route_to_details_agent`) - Specifications, reviews, comparisons
4. **Returns & Refunds** (`route_to_returns_agent`) - Initiate returns, process refunds, exchanges
5. **Account Management** (`route_to_account_agent`) - Update profile, manage addresses, security
6. **Payment Support** (`route_to_payment_agent`) - Resolve payment issues, update methods, billing
7. **Technical Support** (`route_to_technical_agent`) - Fix app/website issues, login problems

### Out-of-Scope Use

- ❌ General-purpose chatbot (use base Gemma models instead)
- ❌ Direct dialogue generation (this is a routing model)
- ❌ More than 20 agents (context window limitations)
- ❌ Non-customer-support domains without fine-tuning

## Performance

### Test Set Results

```
Overall Accuracy: 89.40% (1,684/1,883 correct)

Per-Agent Performance:
order_management     92.3%  (251/272)
product_search       91.1%  (257/282)
product_details      94.7%  (233/246)
returns_refunds      88.2%  (238/270)
account_management   85.1%  (229/269)
payment_support      89.5%  (241/269)
technical_support    87.0%  (234/269)
```

### Comparison to Baselines

| Approach | Accuracy | Latency | Memory |
|----------|----------|---------|--------|
| Keyword matching | 52-58% | 5ms | Negligible |
| Rule-based (100 rules) | 65-70% | 8ms | Negligible |
| BERT classifier (300M) | 82-85% | 45ms | 400 MB |
| **This model (LoRA)** | **89.4%** | **127ms** | **2.1 GB** |
| GPT-4 API (zero-shot) | 85-90% | 2500ms | Cloud |

### Latency Breakdown (T4 GPU)

- **Routing Decision:** 127ms average
- **Agent Execution:** ~52ms average
- **Total End-to-End:** ~179ms average

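To reproduce these measurements on your own hardware, a minimal timing probe can wrap the generate call. This sketch assumes `model`, `tokenizer`, and `prompt` are set up as in the Quick Start below; `routing_latency_ms` is illustrative, not part of the released code:

```python
import statistics
import time

import torch

def routing_latency_ms(prompt: str, runs: int = 20) -> float:
    """Average wall-clock latency (ms) of one routing decision."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        with torch.no_grad():
            model.generate(
                **inputs,
                max_new_tokens=30,
                do_sample=False,
                pad_token_id=tokenizer.eos_token_id,
            )
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.mean(samples)
```
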
## How to Use

### Installation

```bash
pip install transformers peft torch accelerate bitsandbytes
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Load LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Define available agents
agent_declarations = """<start_function_declaration>
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
<end_function_declaration>"""

# Route a query
query = "Where is my order?"

prompt = f"""<start_of_turn>user
{agent_declarations}

User query: {query}<end_of_turn>
<start_of_turn>model
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
# Output: <function_call>route_to_order_agent</function_call>
```

### Production Deployment (4-bit Quantization)

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization config
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load with quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    quantization_config=quant_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

# Result: 180 MB model, 132ms latency, 89.1% accuracy
```

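For serving without a runtime PEFT dependency, the adapters can also be folded into the base weights with PEFT's `merge_and_unload()`. A sketch, merging against a full-precision base; the output path is arbitrary:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base in full precision, attach the adapters,
# fold them into the base weights, and save a standalone checkpoint.
base = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    torch_dtype=torch.bfloat16
)
merged = PeftModel.from_pretrained(
    base,
    "scionoftech/functiongemma-270m-ecommerce-router"
).merge_and_unload()
merged.save_pretrained("./functiongemma-ecommerce-router-merged")
```
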
### Parsing Function Calls

```python
import re

def extract_agent_function(response: str) -> str:
    """Extract the function name from FunctionGemma output."""
    match = re.search(r'<function_call>([a-zA-Z_]+)</function_call>', response)
    return match.group(1) if match else "unknown"

# Usage
agent = extract_agent_function(response)
print(f"Route to: {agent}")
# Output: Route to: route_to_order_agent
```

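Acting on the routing decision is then a dictionary lookup. A minimal dispatch sketch; the `handle_*` functions are hypothetical placeholders for your own agent implementations:

```python
# Hypothetical agent handlers; replace with real implementations.
def handle_order(query): ...
def handle_search(query): ...
def handle_details(query): ...
def handle_returns(query): ...
def handle_account(query): ...
def handle_payment(query): ...
def handle_technical(query): ...

AGENT_HANDLERS = {
    "route_to_order_agent": handle_order,
    "route_to_search_agent": handle_search,
    "route_to_details_agent": handle_details,
    "route_to_returns_agent": handle_returns,
    "route_to_account_agent": handle_account,
    "route_to_payment_agent": handle_payment,
    "route_to_technical_agent": handle_technical,
}

def dispatch(response: str, query: str):
    """Map the model's function call to a handler; fall back when unparseable."""
    handler = AGENT_HANDLERS.get(extract_agent_function(response))
    if handler is None:
        return "Could you tell me a bit more about what you need help with?"
    return handler(query)
```
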
237
+ ## Training Procedure
238
+
239
+ ### Dataset Preparation
240
+
241
+ Generated 12,550 synthetic examples with linguistic variations:
242
+
243
+ ```python
244
+ # Example training format
245
+ {
246
+ "query": "Please track my package ASAP",
247
+ "function": "route_to_order_agent",
248
+ "agent": "order_management"
249
+ }
250
+ ```
251
+
252
+ Variations included:
253
+ - Polite forms: "Please", "Could you", "Can you"
254
+ - Casual starters: "Hey", "Hi", "Um"
255
+ - Urgency markers: "ASAP", "urgently", "immediately"
256
+ - Edge cases and ambiguous queries
257
+
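As a rough illustration of that scheme (the phrase lists and `expand` helper below are illustrative, not the actual generation script):

```python
import random

POLITE = ["Please", "Could you", "Can you"]
CASUAL = ["Hey", "Hi", "Um"]
URGENT = ["ASAP", "urgently", "immediately"]

def expand(base_query: str, function: str, agent: str, n: int = 4) -> list[dict]:
    """Spin linguistic variations off one seed query, keeping the label fixed."""
    examples = [{"query": base_query, "function": function, "agent": agent}]
    for _ in range(n):
        opener = random.choice(POLITE + CASUAL)
        marker = random.choice(URGENT)
        examples.append({
            "query": f"{opener}, {base_query[0].lower()}{base_query[1:]} {marker}",
            "function": function,
            "agent": agent,
        })
    return examples

print(expand("Track my package", "route_to_order_agent", "order_management", n=2))
```
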
### Training Configuration

```python
from transformers import TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig

# LoRA config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Training args
training_args = TrainingArguments(
    output_dir="./functiongemma-ecommerce-router",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    weight_decay=0.01,
    bf16=True,
    optim="paged_adamw_8bit",
    logging_steps=20,
    eval_strategy="epoch",
    save_strategy="epoch"
)
```

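These configs plug into TRL's `SFTTrainer` roughly as follows. Here `model` stands for the 4-bit base loaded as in the deployment example and `train_ds`/`eval_ds` for the formatted train/validation splits (both assumptions); depending on your TRL version, `args` may need to be an `SFTConfig` rather than `TrainingArguments`:

```python
# Tie the pieces together and run the fine-tune.
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    peft_config=lora_config,
)
trainer.train()
trainer.save_model("./functiongemma-ecommerce-router")
```
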
### Training Results

- **Final Training Loss:** 0.0182
- **Final Validation Loss:** 0.0198
- **Training Time:** 45 minutes (T4 GPU)
- **Peak Memory:** 11.2 GB

## Limitations and Biases

### Known Limitations

1. **Ambiguous Queries:** The 10.6% error rate is concentrated in genuinely ambiguous queries
   - Example: "I need help" (could belong to any agent)
   - Mitigation: Implement confidence-based clarification (confidence < 0.7; see the sketch after this list)

2. **Context Dependency:** Requires conversation state management for multi-turn interactions
   - Solution: Use durable workflow orchestrators (Temporal, Cadence)

3. **Agent Confusion:** Most common misclassifications:
   - Returns ↔ Order Management (12 cases)
   - Account ↔ Payment (8 cases)
   - Technical ↔ Product Details (6 cases)

4. **Language:** Trained only on English queries
   - For multilingual support, fine-tune on translated datasets
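
A sketch of the confidence gate from item 1, approximating confidence as the mean probability the model assigns to its own generated tokens. `route_with_confidence` is illustrative (not part of the released code), assumes `model` and `tokenizer` are loaded as in the Quick Start, and reuses `extract_agent_function` from the parsing section above:

```python
import torch

def route_with_confidence(prompt: str, threshold: float = 0.7):
    """Return (agent, confidence); flag for clarification when unsure."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
        output_scores=True,
        return_dict_in_generate=True,
    )
    gen_ids = out.sequences[0, inputs["input_ids"].shape[1]:]
    # Probability of each generated token under the model's own distribution.
    probs = [
        torch.softmax(step_logits[0], dim=-1)[tok].item()
        for step_logits, tok in zip(out.scores, gen_ids)
    ]
    confidence = sum(probs) / len(probs)
    if confidence < threshold:
        return "ask_clarifying_question", confidence
    response = tokenizer.decode(gen_ids, skip_special_tokens=False)
    return extract_agent_function(response), confidence
```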
 
### Biases

- **Domain-Specific:** Trained exclusively on e-commerce customer support
- **Synthetic Data:** Generated examples may not capture all real-world variations
- **Agent Distribution:** The balanced training distribution may not reflect real query distributions

## Ethical Considerations

- **Misrouting Impact:** Incorrect routing may frustrate customers or delay issue resolution
- **Recommendation:** Implement a fallback to human agents for low-confidence predictions
- **Privacy:** The model doesn't store user data; conversation state is managed externally
- **Fairness:** Ensure equal routing performance across user demographics

## Citation

If you use this model in your research or production systems, please cite:

```bibtex
@misc{functiongemma-ecommerce-router,
  author       = {Sai Kumar Yava},
  title        = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}},
}

@article{functiongemma2025,
  title  = {FunctionGemma: Bringing bespoke function calling to the edge},
  author = {Google DeepMind},
  year   = {2025},
  url    = {https://blog.google/technology/developers/functiongemma/}
}
```

## Acknowledgments

- Google DeepMind for the FunctionGemma base model
- Hugging Face for the PEFT and Transformers libraries
- The open-source AI community

## License

This model inherits the Gemma license from the base model. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

**Commercial Use:** Permitted under the Gemma license terms.

## Related Resources

- **Training Notebook:** [Google Colab](https://colab.research.google.com/github/scionoftech/functiongemma-finetuning-e-commerce/blob/main/FunctionGemma_fine_tuning.ipynb)
- **GitHub Repository:** [Complete code](https://github.com/scionoftech/functiongemma-finetuning-e-commerce)
- **Dataset:** [Training data](https://huggingface.co/datasets/scionoftech/functiongemma-e-commerce-dataset)
- **Base Model:** [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)

## Updates

- **2025-12-25:** Initial release - 89.4% routing accuracy on e-commerce customer support

---

**Questions?** Open an issue on [GitHub](https://github.com/scionoftech/functiongemma-finetuning-e-commerce/issues)