---
language:
- en
license: gemma
library_name: transformers
tags:
- function-calling
- agent-routing
- multi-agent
- lora
- peft
- gemma
- functiongemma
- customer-support
- e-commerce
base_model: google/functiongemma-270m-it
datasets:
- scionoftech/functiongemma-e-commerce-dataset
model-index:
- name: functiongemma-270m-ecommerce-router
results:
- task:
type: text-classification
name: Agent Routing
dataset:
name: E-commerce Customer Support Routing
type: scionoftech/ecommerce-agent-routing
metrics:
- type: accuracy
value: 89.4
name: Routing Accuracy
- type: f1
value: 89.0
name: Macro F1 Score
---
# FunctionGemma 270M - E-Commerce Multi-Agent Router
Fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for intelligent routing of customer queries across 7 specialized agents in e-commerce customer support systems.
## Model Description
This model demonstrates how FunctionGemma can be adapted beyond mobile actions for **multi-agent orchestration** in enterprise systems. It intelligently routes natural language customer queries to the appropriate specialized agent with **89.4% accuracy**.
**Key Achievement:** Replaces brittle rule-based routing (52-58% accuracy) with a learned router that needs only 1.47M trainable parameters (0.55% of the model).
### Architecture
- **Base Model:** google/functiongemma-270m-it (270M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Trainable Parameters:** 1,474,560 (0.55%)
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Target Modules:** q_proj, k_proj, v_proj, o_proj
### Training Details
- **Training Data:** 12,550 synthetic customer queries (balanced across 7 agents)
- **Training Time:** 45 minutes on a Google Colab T4 GPU
- **Framework:** Hugging Face Transformers + PEFT + TRL
- **Quantization:** 4-bit NF4 during training
- **Optimizer:** paged_adamw_8bit
- **Learning Rate:** 2e-4
- **Epochs:** 3
- **Batch Size:** 4 (effective 16 with gradient accumulation)
## Intended Use
### Primary Use Case
**Multi-agent customer support routing** for e-commerce platforms:
- Route queries to order management, product search, product details, returns, payment, account, and technical support agents
- Maintain conversation context across multi-turn interactions
- Enable intelligent task switching
### Supported Agents
The model routes queries to 7 specialized agents:
1. **Order Management** (`route_to_order_agent`) - Track orders, update delivery, cancel orders
2. **Product Search** (`route_to_search_agent`) - Search catalog, check availability, recommendations
3. **Product Details** (`route_to_details_agent`) - Specifications, reviews, comparisons
4. **Returns & Refunds** (`route_to_returns_agent`) - Initiate returns, process refunds, exchanges
5. **Account Management** (`route_to_account_agent`) - Update profile, manage addresses, security
6. **Payment Support** (`route_to_payment_agent`) - Resolve payment issues, update methods, billing
7. **Technical Support** (`route_to_technical_agent`) - Fix app/website issues, login problems
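Downstream of the router, each emitted function name maps one-to-one onto an agent handler. A minimal dispatch sketch (the handler stubs and `dispatch` helper are illustrative, not part of the released code):
```python
# Hypothetical dispatch table: the router emits a function name, the
# orchestrator looks up the matching handler. Stubs shown for illustration.
def _stub(agent_name):
    return lambda query: f"[{agent_name}] handling: {query}"

AGENT_HANDLERS = {
    "route_to_order_agent": _stub("order_management"),
    "route_to_search_agent": _stub("product_search"),
    "route_to_details_agent": _stub("product_details"),
    "route_to_returns_agent": _stub("returns_refunds"),
    "route_to_account_agent": _stub("account_management"),
    "route_to_payment_agent": _stub("payment_support"),
    "route_to_technical_agent": _stub("technical_support"),
}

def dispatch(function_name: str, query: str) -> str:
    """Run the routed agent, escalating when the function name is unknown."""
    handler = AGENT_HANDLERS.get(function_name)
    return handler(query) if handler else "escalated to human agent"
```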
### Out-of-Scope Use
- ❌ General-purpose chatbot (use base Gemma models instead)
- ❌ Direct dialogue generation (this is a routing model)
- ❌ More than 20 agents (context window limitations)
- ❌ Non-customer-support domains without fine-tuning
## Performance
### Test Set Results
```
Overall Accuracy: 89.40% (1,684/1,883 correct)
Per-Agent Performance:
order_management 92.3% (251/272)
product_search 91.1% (257/282)
product_details 94.7% (233/246)
returns_refunds 88.2% (238/270)
account_management 85.1% (229/269)
payment_support 89.5% (241/269)
technical_support 87.0% (234/269)
```
### Comparison to Baselines
| Approach | Accuracy | Latency | Memory |
|----------|----------|---------|--------|
| Keyword Matching | 52-58% | 5ms | Negligible |
| Rule-based (100 rules) | 65-70% | 8ms | Negligible |
| BERT Classifier (300M) | 82-85% | 45ms | 400 MB |
| **This Model (LoRA)** | **89.4%** | **127ms** | **2.1 GB** |
| GPT-4 API (zero-shot) | 85-90% | 2500ms | Cloud |
### Latency Breakdown (T4 GPU)
- **Routing Decision:** 127ms average
- **Agent Execution:** ~52ms average
- **Total End-to-End:** ~179ms average
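To reproduce the routing-latency figure on your own hardware, here is a minimal timing sketch; it reuses the `model` and `inputs` objects from the Quick Start section below and assumes a CUDA device:
```python
import time
import torch

# Warm-up run so one-time kernel/initialization cost doesn't skew the number
_ = model.generate(**inputs, max_new_tokens=30, do_sample=False)

torch.cuda.synchronize()
start = time.perf_counter()
_ = model.generate(**inputs, max_new_tokens=30, do_sample=False)
torch.cuda.synchronize()
print(f"Routing latency: {(time.perf_counter() - start) * 1000:.0f} ms")
```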
## How to Use
### Installation
```bash
pip install transformers peft torch accelerate bitsandbytes
```
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
"google/functiongemma-270m-it",
device_map="auto",
torch_dtype=torch.bfloat16
)
# Load LoRA adapters
model = PeftModel.from_pretrained(
base_model,
"scionoftech/functiongemma-270m-ecommerce-router"
)
tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")
# Define available agents
agent_declarations = """<start_function_declaration>
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
<end_function_declaration>"""
# Route a query
query = "Where is my order?"
prompt = f"""<start_of_turn>user
{agent_declarations}
User query: {query}<end_of_turn>
<start_of_turn>model
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
outputs = model.generate(
**inputs,
max_new_tokens=30,
do_sample=False,
pad_token_id=tokenizer.eos_token_id
)
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
# Output: <function_call>route_to_order_agent</function_call>
```
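The prompt above is single-turn. For the multi-turn task switching mentioned under Intended Use, one plausible approach is to fold a short summary of the prior exchange into the user turn. Note that this layout is an assumption, not the documented training format:
```python
# Hypothetical multi-turn prompt: prior context is summarized inline so the
# router can handle task switching. This layout is an assumption.
history = "Previous turn: customer asked to track an order (routed to order agent)."
followup = "Actually, I'd like to return it instead"
prompt = f"""<start_of_turn>user
{agent_declarations}
{history}
User query: {followup}<end_of_turn>
<start_of_turn>model
"""
```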
### Production Deployment (4-bit Quantization)
```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch
# 4-bit quantization config
quant_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16
)
# Load with quantization
base_model = AutoModelForCausalLM.from_pretrained(
"google/functiongemma-270m-it",
quantization_config=quant_config,
device_map="auto"
)
model = PeftModel.from_pretrained(
base_model,
"scionoftech/functiongemma-270m-ecommerce-router"
)
# Result: 180 MB model, 132ms latency, 89.1% accuracy
```
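For serving without the PEFT wrapper, the adapters can also be merged into the base weights using PEFT's standard `merge_and_unload()`. A sketch (merge from the bf16 model loaded in Quick Start, not the 4-bit one, since merging into quantized weights is lossy):
```python
# Merge LoRA adapters into the base model for standalone deployment.
# Use the bf16-loaded model from Quick Start; merging into 4-bit weights is lossy.
merged = model.merge_and_unload()  # returns a plain transformers model
merged.save_pretrained("./functiongemma-ecommerce-router-merged")
tokenizer.save_pretrained("./functiongemma-ecommerce-router-merged")
```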
### Parsing Function Calls
```python
import re
def extract_agent_function(response: str) -> str:
"""Extract function name from FunctionGemma output."""
match = re.search(r'<function_call>([a-zA-Z_]+)</function_call>', response)
return match.group(1) if match else "unknown"
# Usage
agent = extract_agent_function(response)
print(f"Route to: {agent}")
# Output: Route to: route_to_order_agent
```
## Training Procedure
### Dataset Preparation
Generated 12,550 synthetic examples with linguistic variations:
```python
# Example training format
{
"query": "Please track my package ASAP",
"function": "route_to_order_agent",
"agent": "order_management"
}
```
Variations included:
- Polite forms: "Please", "Could you", "Can you"
- Casual starters: "Hey", "Hi", "Um"
- Urgency markers: "ASAP", "urgently", "immediately"
- Edge cases and ambiguous queries
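A minimal sketch of how such variations can be composed (the templates, base queries, and counts here are illustrative; the actual generation script lives in the GitHub repository):
```python
import itertools
import json

# Illustrative template composition; the real script covers all 7 agents
# and many more base queries. Grammar of combined templates is approximate.
POLITE = ["", "Please ", "Could you ", "Can you "]
CASUAL = ["", "Hey, ", "Hi, ", "Um, "]
URGENCY = ["", " ASAP", " urgently"]
BASE = [
    ("track my package", "route_to_order_agent", "order_management"),
    ("start a return for my headphones", "route_to_returns_agent", "returns_refunds"),
]

examples = [
    {"query": f"{casual}{polite}{text}{urgent}".strip(),
     "function": fn, "agent": agent}
    for (text, fn, agent), polite, casual, urgent
    in itertools.product(BASE, POLITE, CASUAL, URGENCY)
]
print(json.dumps(examples[0], indent=2))
```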
### Training Configuration
```python
from transformers import TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig
# LoRA config
lora_config = LoraConfig(
r=16,
lora_alpha=32,
target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
lora_dropout=0.05,
bias="none",
task_type="CAUSAL_LM"
)
# Training args
training_args = TrainingArguments(
output_dir="./functiongemma-ecommerce-router",
num_train_epochs=3,
per_device_train_batch_size=4,
gradient_accumulation_steps=4,
learning_rate=2e-4,
lr_scheduler_type="cosine",
warmup_ratio=0.1,
weight_decay=0.01,
bf16=True,
optim="paged_adamw_8bit",
logging_steps=20,
eval_strategy="epoch",
save_strategy="epoch"
)
```
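Wiring these configs into TRL's `SFTTrainer` then looks roughly like this; the `train_ds`/`eval_ds` variables and the formatting function are assumptions based on the dataset format shown above, and exact invocation details vary across TRL versions:
```python
# Sketch of the trainer wiring; train_ds / eval_ds are assumed to be
# datasets with "query" and "function" columns as in the example above.
def format_example(example):
    return (
        f"<start_of_turn>user\n{agent_declarations}\n"
        f"User query: {example['query']}<end_of_turn>\n"
        f"<start_of_turn>model\n"
        f"<function_call>{example['function']}</function_call>"
    )

trainer = SFTTrainer(
    model=model,                 # base model loaded in 4-bit (see above)
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    peft_config=lora_config,     # adapters injected by the trainer
    formatting_func=format_example,
)
trainer.train()
```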
### Training Results
- **Final Training Loss:** 0.0182
- **Final Validation Loss:** 0.0198
- **Training Time:** 45 minutes (T4 GPU)
- **Peak Memory:** 11.2 GB
## Limitations and Biases
### Known Limitations
1. **Ambiguous Queries:** 10.6% error rate concentrated in genuinely ambiguous queries
- Example: "I need help" (could be any agent)
- Mitigation: Implement confidence-based clarification (confidence < 0.7; see the sketch after this list)
2. **Context Dependency:** Requires conversation state management for multi-turn interactions
- Solution: Use durable workflow orchestrators (Temporal, Cadence)
3. **Agent Confusion:** Most common misclassifications:
- Returns β Order Management (12 cases)
- Account β Payment (8 cases)
- Technical β Product Details (6 cases)
4. **Language:** Trained only on English queries
- For multilingual support, fine-tune on translated datasets
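The confidence threshold in item 1 is not produced by the model itself; one common approximation is the mean per-token probability of the generated routing call. A hypothetical sketch (the 0.7 cutoff and the `route_to_human` fallback name are illustrative), reusing `extract_agent_function` from above:
```python
import torch

def route_with_confidence(model, tokenizer, prompt, threshold=0.7):
    """Route a query, falling back to a human when the router is unsure."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        output_scores=True,
        return_dict_in_generate=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Probability of the greedily chosen token at each step, then averaged
    step_probs = [torch.softmax(s[0], dim=-1).max().item() for s in out.scores]
    confidence = sum(step_probs) / len(step_probs)
    text = tokenizer.decode(
        out.sequences[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=False,
    )
    agent = extract_agent_function(text)
    if confidence < threshold:
        return "route_to_human", confidence  # illustrative fallback name
    return agent, confidence
```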
### Biases
- **Domain-Specific:** Trained exclusively on e-commerce customer support
- **Synthetic Data:** Generated examples may not capture all real-world variations
- **Agent Distribution:** Balanced training may not reflect real query distributions
## Ethical Considerations
- **Misrouting Impact:** Incorrect routing may frustrate customers or delay issue resolution
- **Recommendation:** Implement fallback to human agents for low-confidence predictions
- **Privacy:** Model doesn't store user data; conversation state managed externally
- **Fairness:** Ensure equal routing performance across user demographics
## Citation
If you use this model in your research or production systems, please cite:
```bibtex
@misc{functiongemma-ecommerce-router,
author = {Sai Kumar Yava},
title = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}},
}
@misc{functiongemma2025,
title={FunctionGemma: Bringing bespoke function calling to the edge},
author={Google DeepMind},
year={2025},
url={https://blog.google/technology/developers/functiongemma/}
}
```
## Acknowledgments
- Google DeepMind for FunctionGemma base model
- Hugging Face for PEFT and Transformers libraries
- The open-source AI community
## License
This model inherits the Gemma license from the base model. See [Gemma Terms of Use](https://ai.google.dev/gemma/terms).
**Commercial Use:** Permitted under Gemma license terms.
## Related Resources
- **Training Notebook:** [Google Colab](https://colab.research.google.com/github/scionoftech/functiongemma-finetuning-e-commerce/blob/main/FunctionGemma_fine_tuning.ipynb)
- **GitHub Repository:** [Complete code](https://github.com/scionoftech/functiongemma-finetuning-e-commerce)
- **Dataset:** [Training data](https://huggingface.co/datasets/scionoftech/functiongemma-e-commerce-dataset)
- **Base Model:** [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)
## Updates
- **2025-12-25:** Initial release - 89.4% routing accuracy on e-commerce customer support
---
**Questions?** Open an issue on [GitHub](https://github.com/scionoftech/functiongemma-finetuning-e-commerce/issues)