---
language:
- en
license: gemma
library_name: transformers
tags:
- function-calling
- agent-routing
- multi-agent
- lora
- peft
- gemma
- functiongemma
- customer-support
- e-commerce
base_model: google/functiongemma-270m-it
datasets:
- scionoftech/functiongemma-e-commerce-dataset
model-index:
- name: functiongemma-270m-ecommerce-router
  results:
  - task:
      type: text-classification
      name: Agent Routing
    dataset:
      name: E-commerce Customer Support Routing
      type: scionoftech/ecommerce-agent-routing
    metrics:
    - type: accuracy
      value: 89.4
      name: Routing Accuracy
    - type: f1
      value: 89.0
      name: Macro F1 Score
---
# FunctionGemma 270M - E-Commerce Multi-Agent Router
Fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for intelligent routing of customer queries across 7 specialized agents in e-commerce customer support systems.
## Model Description
This model demonstrates how FunctionGemma can be adapted beyond mobile actions for **multi-agent orchestration** in enterprise systems. It intelligently routes natural language customer queries to the appropriate specialized agent with **89.4% accuracy**.
**Key Achievement:** Replaces brittle keyword matching (52-58% accuracy) and rule-based routing (65-70% accuracy) with a learned router that trains only 1.47M parameters (0.55% of the model).
### Architecture
- **Base Model:** google/functiongemma-270m-it (270M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Trainable Parameters:** 1,474,560 (0.55%)
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Target Modules:** q_proj, k_proj, v_proj, o_proj
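The trainable-parameter count follows from the LoRA rank and the shapes of the four adapted projections. The sketch below assumes the attention dimensions of the 270M Gemma architecture (hidden size 640, 4 query heads, 1 key/value head, head dimension 256, 18 layers). These values are not stated in this card, so treat them as assumptions and check them against the base model's `config.json`.

```python
# Assumed base-model dimensions (not stated in this card; verify in config.json).
HIDDEN_SIZE = 640
NUM_LAYERS = 18
NUM_Q_HEADS = 4
NUM_KV_HEADS = 1
HEAD_DIM = 256
LORA_RANK = 16  # from the Architecture list above

# (in_features, out_features) of each adapted projection.
projections = {
    "q_proj": (HIDDEN_SIZE, NUM_Q_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM),
    "o_proj": (NUM_Q_HEADS * HEAD_DIM, HIDDEN_SIZE),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank) for each adapted weight."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, LORA_RANK) for i, o in projections.values())
total = per_layer * NUM_LAYERS
print(per_layer, total)
```

Under these assumptions the total comes to 1,474,560, which matches the trainable-parameter figure listed above.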
### Training Details
- **Training Data:** 12,550 synthetic customer queries (balanced across 7 agents)
- **Training Time:** 45 minutes on Google Colab T4 GPU
- **Framework:** Hugging Face Transformers + PEFT + TRL
- **Quantization:** 4-bit NF4 during training
- **Optimizer:** paged_adamw_8bit
- **Learning Rate:** 2e-4
- **Epochs:** 3
- **Batch Size:** 4 (effective 16 with gradient accumulation)
## Intended Use
### Primary Use Case
**Multi-agent customer support routing** for e-commerce platforms:
- Route queries to the order management, product search, product details, returns, account, payment, and technical support agents
- Maintain conversation context across multi-turn interactions
- Enable intelligent task switching
### Supported Agents
The model routes queries to 7 specialized agents:
1. **Order Management** (`route_to_order_agent`) - Track orders, update delivery, cancel orders
2. **Product Search** (`route_to_search_agent`) - Search catalog, check availability, recommendations
3. **Product Details** (`route_to_details_agent`) - Specifications, reviews, comparisons
4. **Returns & Refunds** (`route_to_returns_agent`) - Initiate returns, process refunds, exchanges
5. **Account Management** (`route_to_account_agent`) - Update profile, manage addresses, security
6. **Payment Support** (`route_to_payment_agent`) - Resolve payment issues, update methods, billing
7. **Technical Support** (`route_to_technical_agent`) - Fix app/website issues, login problems
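The list above can be captured as a lookup table from routing function to agent label. This is a minimal sketch: the function names and agent labels are the ones used in this card, while `FUNCTION_TO_AGENT` and `resolve_agent` are hypothetical names introduced for illustration.

```python
from typing import Optional

# Routing functions and agent labels as they appear in this card.
FUNCTION_TO_AGENT = {
    "route_to_order_agent": "order_management",
    "route_to_search_agent": "product_search",
    "route_to_details_agent": "product_details",
    "route_to_returns_agent": "returns_refunds",
    "route_to_account_agent": "account_management",
    "route_to_payment_agent": "payment_support",
    "route_to_technical_agent": "technical_support",
}


def resolve_agent(function_name: str) -> Optional[str]:
    """Return the agent label for a routing function, or None if unrecognized."""
    return FUNCTION_TO_AGENT.get(function_name)


print(resolve_agent("route_to_returns_agent"))
print(resolve_agent("route_to_billing_agent"))  # not a known function
```

Returning `None` for unrecognized names lets the caller decide how to handle a malformed or unexpected model output instead of silently picking an agent.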
### Out-of-Scope Use
- ❌ General-purpose chatbot (use base Gemma models instead)
- ❌ Direct dialogue generation (this is a routing model)
- ❌ More than 20 agents (context window limitations)
- ❌ Non-customer-support domains without fine-tuning
## Performance
### Test Set Results
```
Overall Accuracy: 89.40% (1,684/1,883 correct)

Per-Agent Performance:
  order_management     92.3%  (251/272)
  product_search       91.1%  (257/282)
  product_details      94.7%  (233/246)
  returns_refunds      88.2%  (238/270)
  account_management   85.1%  (229/269)
  payment_support      89.5%  (241/269)
  technical_support    87.0%  (234/269)
```
### Comparison to Baselines
| Approach | Accuracy | Latency | Memory |
|----------|----------|---------|--------|
| Keyword Matching | 52-58% | 5ms | Negligible |
| Rule-based (100 rules) | 65-70% | 8ms | Negligible |
| BERT Classifier (300M) | 82-85% | 45ms | 400 MB |
| **This Model (LoRA)** | **89.4%** | **127ms** | **2.1 GB** |
| GPT-4 API (zero-shot) | 85-90% | 2500ms | Cloud |
### Latency Breakdown (T4 GPU)
- **Routing Decision:** 127ms average
- **Agent Execution:** ~52ms average
- **Total End-to-End:** ~179ms average
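Latency figures like these depend on hardware and measurement method, so it can help to re-measure on your own setup. The helper below is one possible way to do that, not the procedure used for the numbers above; `measure_latency_ms` is a hypothetical name, and the lambda is a stand-in workload.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency_ms(
    fn: Callable[[], object], warmup: int = 3, runs: int = 20
) -> Dict[str, float]:
    """Time repeated calls to fn; return mean and median latency in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the statistics
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
    }


# Stand-in workload; replace with a callable that runs one routing request.
stats = measure_latency_ms(lambda: sum(range(10_000)))
print(stats)
```

When timing GPU inference, call `torch.cuda.synchronize()` inside the timed callable so that asynchronously launched kernels are included in the measurement.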
## How to Use
### Installation
```bash
pip install transformers peft torch accelerate bitsandbytes
```
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)
# Load LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)
tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")
# Define available agents
agent_declarations = """<start_function_declaration>
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
<end_function_declaration>"""
# Route a query
query = "Where is my order?"
prompt = f"""<start_of_turn>user
{agent_declarations}
User query: {query}<end_of_turn>
<start_of_turn>model
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )
# Decode only the newly generated tokens, keeping the function-call markers
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
# Output: <function_call>route_to_order_agent</function_call>
```
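For routing more than one query, the prompt from the Quick Start can be factored into a small helper. The sketch below reproduces the layout shown in the example above; `AGENT_DECLARATIONS` and `build_routing_prompt` are hypothetical names, and the exact whitespace the model saw during training is an assumption based on that example.

```python
from typing import Dict

# Declarations copied from the Quick Start example.
AGENT_DECLARATIONS: Dict[str, str] = {
    "route_to_order_agent": "Track, update, or cancel customer orders",
    "route_to_search_agent": "Search products, check availability",
    "route_to_details_agent": "Get product specifications and reviews",
    "route_to_returns_agent": "Handle returns, refunds, exchanges",
    "route_to_account_agent": "Manage user profile and settings",
    "route_to_payment_agent": "Resolve payment and billing issues",
    "route_to_technical_agent": "Fix app, website, login issues",
}


def build_routing_prompt(query: str, agents: Dict[str, str] = AGENT_DECLARATIONS) -> str:
    """Render the declaration block and user query in the Quick Start layout."""
    declarations = "\n".join(f"{name}(): {desc}" for name, desc in agents.items())
    return (
        "<start_of_turn>user\n"
        "<start_function_declaration>\n"
        f"{declarations}\n"
        "<end_function_declaration>\n"
        f"User query: {query}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


print(build_routing_prompt("Where is my order?"))
```

Keeping the declarations in a dictionary makes it easier to add or remove agents without editing a multi-line string by hand.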
### Production Deployment (4-bit Quantization)
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
# 4-bit quantization config
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
# Load with quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    quantization_config=quant_config,
    device_map="auto"
)
# Attach the LoRA adapters to the quantized base model
model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)
# Result: 180 MB model, 132ms latency, 89.1% accuracy
```
### Parsing Function Calls
```python
import re


def extract_agent_function(response: str) -> str:
    """Extract function name from FunctionGemma output."""
    match = re.search(r'<function_call>([a-zA-Z_]+)</function_call>', response)
    return match.group(1) if match else "unknown"


# Usage
agent = extract_agent_function(response)
print(f"Route to: {agent}")
# Output: Route to: route_to_order_agent
```
## Training Procedure
### Dataset Preparation
Generated 12,550 synthetic examples with linguistic variations:
```python
# Example training format
{
    "query": "Please track my package ASAP",
    "function": "route_to_order_agent",
    "agent": "order_management"
}
```
Variations included:
- Polite forms: "Please", "Could you", "Can you"
- Casual starters: "Hey", "Hi", "Um"
- Urgency markers: "ASAP", "urgently", "immediately"
- Edge cases and ambiguous queries
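To illustrate how such variations can be composed, here is a minimal sketch that combines the listed phrases with a base query and emits records in the training format shown above. It is not the generator used to build the dataset; `make_variations` and the 50% sampling probabilities are assumptions made for illustration.

```python
import random
from typing import Dict, List

# Phrases taken from the variation list above.
POLITE_FORMS = ["Please", "Could you", "Can you"]
CASUAL_STARTERS = ["Hey", "Hi", "Um"]
URGENCY_MARKERS = ["ASAP", "urgently", "immediately"]


def make_variations(
    base: str, function: str, agent: str, n: int, seed: int = 0
) -> List[Dict[str, str]]:
    """Build n query variants by optionally adding a starter, polite form, and urgency marker."""
    rng = random.Random(seed)  # local RNG so results are reproducible per seed
    examples = []
    for _ in range(n):
        words: List[str] = []
        if rng.random() < 0.5:
            words.append(rng.choice(CASUAL_STARTERS) + ",")
        if rng.random() < 0.5:
            polite = rng.choice(POLITE_FORMS)
            # Lowercase the polite form when it follows a casual starter.
            words.append(polite.lower() if words else polite)
        words.append(base)
        if rng.random() < 0.5:
            words.append(rng.choice(URGENCY_MARKERS))
        query = " ".join(words)
        query = query[0].upper() + query[1:]  # capitalize the first character
        examples.append({"query": query, "function": function, "agent": agent})
    return examples


examples = make_variations(
    "track my package", "route_to_order_agent", "order_management", n=5
)
for example in examples:
    print(example["query"])
```

Edge cases and ambiguous queries are not covered by this kind of template expansion and would need to be written or sampled separately.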
### Training Configuration
```python
from transformers import TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig
# LoRA config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)
# Training args
training_args = TrainingArguments(
    output_dir="./functiongemma-ecommerce-router",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    weight_decay=0.01,
    bf16=True,
    optim="paged_adamw_8bit",
    logging_steps=20,
    eval_strategy="epoch",
    save_strategy="epoch"
)
```
### Training Results
- **Final Training Loss:** 0.0182
- **Final Validation Loss:** 0.0198
- **Training Time:** 45 minutes (T4 GPU)
- **Peak Memory:** 11.2 GB
## Limitations and Biases
### Known Limitations
1. **Ambiguous Queries:** The 10.6% error rate is concentrated in genuinely ambiguous queries
   - Example: "I need help" (could be any agent)
   - Mitigation: Ask the customer a clarifying question when routing confidence falls below 0.7
2. **Context Dependency:** Requires conversation state management for multi-turn interactions
   - Solution: Use durable workflow orchestrators (Temporal, Cadence)
3. **Agent Confusion:** Most common misclassifications:
   - Returns ↔ Order Management (12 cases)
   - Account ↔ Payment (8 cases)
   - Technical ↔ Product Details (6 cases)
4. **Language:** Trained only on English queries
   - For multilingual support, fine-tune on translated datasets
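The confidence-based mitigation in item 1 can be sketched as a small gating function. This card does not define how confidence is computed, so the sketch assumes it is the probability of the generated sequence (the exponential of the summed token log-probabilities); `sequence_confidence` and `route_or_clarify` are hypothetical names.

```python
import math
from typing import Dict, Sequence

CONFIDENCE_THRESHOLD = 0.7  # threshold suggested in the limitations list


def sequence_confidence(token_logprobs: Sequence[float]) -> float:
    """Assumed confidence measure: exp of the summed token log-probabilities."""
    return math.exp(sum(token_logprobs))


def route_or_clarify(
    function_name: str,
    token_logprobs: Sequence[float],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Dict[str, object]:
    """Route when the parse succeeded and confidence is high enough; otherwise ask."""
    confidence = sequence_confidence(token_logprobs)
    if function_name == "unknown" or confidence < threshold:
        return {"action": "clarify", "confidence": confidence}
    return {"action": "route", "function": function_name, "confidence": confidence}


# Illustrative log-probabilities, not measured values.
print(route_or_clarify("route_to_order_agent", [-0.01, -0.02, -0.01]))
print(route_or_clarify("route_to_order_agent", [-0.3, -0.4]))
```

With Transformers, per-token log-probabilities can be obtained by calling `generate` with `output_scores=True` and `return_dict_in_generate=True`, then passing the result to `compute_transition_scores` with `normalize_logits=True`. The 0.7 threshold should be tuned on held-out data rather than taken as given.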
### Biases
- **Domain-Specific:** Trained exclusively on e-commerce customer support
- **Synthetic Data:** Generated examples may not capture all real-world variations
- **Agent Distribution:** Balanced training may not reflect real query distributions
## Ethical Considerations
- **Misrouting Impact:** Incorrect routing may frustrate customers or delay issue resolution
- **Recommendation:** Implement fallback to human agents for low-confidence predictions
- **Privacy:** Model doesn't store user data; conversation state managed externally
- **Fairness:** Verify that routing performance is comparable across user demographics before deployment
## Citation
If you use this model in your research or production systems, please cite:
```bibtex
@misc{functiongemma-ecommerce-router,
author = {Sai Kumar Yava},
title = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}},
}
@article{functiongemma2025,
title={FunctionGemma: Bringing bespoke function calling to the edge},
author={Google DeepMind},
year={2025},
url={https://blog.google/technology/developers/functiongemma/}
}
```
## Acknowledgments
- Google DeepMind for FunctionGemma base model
- Hugging Face for PEFT and Transformers libraries
- The open-source AI community
## License
This model inherits the Gemma license from the base model. See [Gemma Terms of Use](https://ai.google.dev/gemma/terms).
**Commercial Use:** Permitted under Gemma license terms.
## Related Resources
- **Training Notebook:** [Google Colab](https://colab.research.google.com/github/scionoftech/functiongemma-finetuning-e-commerce/blob/main/FunctionGemma_fine_tuning.ipynb)
- **GitHub Repository:** [Complete code](https://github.com/scionoftech/functiongemma-finetuning-e-commerce)
- **Dataset:** [Training data](https://huggingface.co/datasets/scionoftech/functiongemma-e-commerce-dataset)
- **Base Model:** [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)
## Updates
- **2025-12-25:** Initial release - 89.4% routing accuracy on e-commerce customer support
---
**Questions?** Open an issue on [GitHub](https://github.com/scionoftech/functiongemma-finetuning-e-commerce/issues)