---
language:
- en
license: gemma
library_name: transformers
tags:
- function-calling
- agent-routing
- multi-agent
- lora
- peft
- gemma
- functiongemma
- customer-support
- e-commerce
base_model: google/functiongemma-270m-it
datasets:
- scionoftech/functiongemma-e-commerce-dataset
model-index:
- name: functiongemma-270m-ecommerce-router
  results:
  - task:
      type: text-classification
      name: Agent Routing
    dataset:
      name: E-commerce Customer Support Routing
      type: scionoftech/ecommerce-agent-routing
    metrics:
    - type: accuracy
      value: 89.4
      name: Routing Accuracy
    - type: f1
      value: 89.0
      name: Macro F1 Score
---

# FunctionGemma 270M - E-Commerce Multi-Agent Router

A fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for intelligent routing of customer queries across 7 specialized agents in e-commerce customer support systems.

## Model Description

This model demonstrates how FunctionGemma can be adapted beyond mobile actions for **multi-agent orchestration** in enterprise systems. It routes natural-language customer queries to the appropriate specialized agent with **89.4% accuracy**.

**Key Achievement:** Replaces brittle rule-based routing (52-58% accuracy) with a learned router that uses only 1.47M trainable parameters (0.55% of the model).

### Architecture

- **Base Model:** google/functiongemma-270m-it (270M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Trainable Parameters:** 1,474,560 (0.55%)
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Target Modules:** q_proj, k_proj, v_proj, o_proj

### Training Details

- **Training Data:** 12,550 synthetic customer queries (balanced across 7 agents)
- **Training Time:** 45 minutes on a Google Colab T4 GPU
- **Framework:** Hugging Face Transformers + PEFT + TRL
- **Quantization:** 4-bit NF4 during training
- **Optimizer:** paged_adamw_8bit
- **Learning Rate:** 2e-4
- **Epochs:** 3
- **Batch Size:** 4 (effective 16 with gradient accumulation)

## Intended Use

### Primary Use Case

**Multi-agent customer support routing** for e-commerce platforms:

- Route queries to order management, product search, product details, returns, payment, account, and technical support agents
- Maintain conversation context across multi-turn interactions
- Enable intelligent task switching

### Supported Agents

The model routes queries to 7 specialized agents:

1. **Order Management** (`route_to_order_agent`) - Track orders, update delivery, cancel orders
2. **Product Search** (`route_to_search_agent`) - Search catalog, check availability, recommendations
3. **Product Details** (`route_to_details_agent`) - Specifications, reviews, comparisons
4. **Returns & Refunds** (`route_to_returns_agent`) - Initiate returns, process refunds, exchanges
5. **Account Management** (`route_to_account_agent`) - Update profile, manage addresses, security
6. **Payment Support** (`route_to_payment_agent`) - Resolve payment issues, update methods, billing
7. **Technical Support** (`route_to_technical_agent`) - Fix app/website issues, login problems
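
In an application, each of these function names maps onto a concrete handler. A minimal dispatch sketch (the `AGENT_HANDLERS` table and its placeholder handlers are illustrative, not part of this repository):

```python
# Map each routing function name to a handler. The lambdas below are
# hypothetical placeholders -- swap in your real agent implementations.
AGENT_HANDLERS = {
    "route_to_order_agent": lambda q: f"[order agent] {q}",
    "route_to_search_agent": lambda q: f"[search agent] {q}",
    "route_to_details_agent": lambda q: f"[details agent] {q}",
    "route_to_returns_agent": lambda q: f"[returns agent] {q}",
    "route_to_account_agent": lambda q: f"[account agent] {q}",
    "route_to_payment_agent": lambda q: f"[payment agent] {q}",
    "route_to_technical_agent": lambda q: f"[technical agent] {q}",
}

def dispatch(function_name: str, query: str) -> str:
    """Send the query to the agent chosen by the router, with a safe fallback."""
    handler = AGENT_HANDLERS.get(function_name)
    if handler is None:
        return f"[human fallback] {query}"  # unknown route -> escalate
    return handler(query)
```

Keeping the fallback branch explicit makes it easy to escalate unrecognized routes to a human agent.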

### Out-of-Scope Use

- ❌ General-purpose chatbot (use base Gemma models instead)
- ❌ Direct dialogue generation (this is a routing model)
- ❌ More than 20 agents (context window limitations)
- ❌ Non-customer-support domains without fine-tuning

## Performance

### Test Set Results

```
Overall Accuracy: 89.40% (1,684/1,883 correct)

Per-Agent Performance:
order_management     92.3% (251/272)
product_search       91.1% (257/282)
product_details      94.7% (233/246)
returns_refunds      88.2% (238/270)
account_management   85.1% (229/269)
payment_support      89.5% (241/269)
technical_support    87.0% (234/269)
```

### Comparison to Baselines

| Approach | Accuracy | Latency | Memory |
|----------|----------|---------|--------|
| Keyword Matching | 52-58% | 5ms | Negligible |
| Rule-based (100 rules) | 65-70% | 8ms | Negligible |
| BERT Classifier (300M) | 82-85% | 45ms | 400 MB |
| **This Model (LoRA)** | **89.4%** | **127ms** | **2.1 GB** |
| GPT-4 API (zero-shot) | 85-90% | 2500ms | Cloud |

### Latency Breakdown (T4 GPU)

- **Routing Decision:** 127ms average
- **Agent Execution:** ~52ms average
- **Total End-to-End:** ~179ms average
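
These figures were measured on a T4. As a sanity check, a simple timing loop can reproduce the routing-decision number on your own hardware (a sketch that assumes the `model`, `tokenizer`, and `prompt` objects from the Quick Start below):

```python
import time
import torch

def measure_routing_latency(model, tokenizer, prompt, runs=50):
    """Average wall-clock latency of one routing decision, in milliseconds."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Warm up once so lazy CUDA initialization doesn't skew the numbers.
    with torch.no_grad():
        model.generate(**inputs, max_new_tokens=30, do_sample=False,
                       pad_token_id=tokenizer.eos_token_id)
    start = time.perf_counter()
    for _ in range(runs):
        with torch.no_grad():
            model.generate(**inputs, max_new_tokens=30, do_sample=False,
                           pad_token_id=tokenizer.eos_token_id)
    return (time.perf_counter() - start) / runs * 1000
```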

## How to Use

### Installation

```bash
pip install transformers peft torch accelerate bitsandbytes
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Load LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Define available agents
agent_declarations = """<start_function_declaration>
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
<end_function_declaration>"""

# Route a query
query = "Where is my order?"

prompt = f"""<start_of_turn>user
{agent_declarations}

User query: {query}<end_of_turn>
<start_of_turn>model
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens (the routing decision)
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
# Output: <function_call>route_to_order_agent</function_call>
```

### Production Deployment (4-bit Quantization)

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization config
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load with quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    quantization_config=quant_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

# Result: 180 MB model, 132ms latency, 89.1% accuracy
```
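
If you don't need to swap adapters at runtime, PEFT's `merge_and_unload()` can fold the LoRA weights into the base model so inference runs without the adapter indirection. A sketch (merging is done from a bf16 base here, since merging into a 4-bit-quantized base is not reliably supported):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model in a regular dtype for merging.
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(
    base_model, "scionoftech/functiongemma-270m-ecommerce-router"
)

# Fold the LoRA deltas into the base weights and drop the PEFT wrapper.
merged = model.merge_and_unload()
merged.save_pretrained("./functiongemma-ecommerce-router-merged")
```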

### Parsing Function Calls

```python
import re

def extract_agent_function(response: str) -> str:
    """Extract the function name from FunctionGemma output."""
    match = re.search(r'<function_call>([a-zA-Z_]+)</function_call>', response)
    return match.group(1) if match else "unknown"

# Usage
agent = extract_agent_function(response)
print(f"Route to: {agent}")
# Output: Route to: route_to_order_agent
```

## Training Procedure

### Dataset Preparation

Generated 12,550 synthetic examples with linguistic variations:

```python
# Example training format
{
    "query": "Please track my package ASAP",
    "function": "route_to_order_agent",
    "agent": "order_management"
}
```

Variations included (a generation sketch follows this list):

- Polite forms: "Please", "Could you", "Can you"
- Casual starters: "Hey", "Hi", "Um"
- Urgency markers: "ASAP", "urgently", "immediately"
- Edge cases and ambiguous queries
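
A minimal sketch of how such variations can be generated from seed queries; the templates and probabilities below are illustrative, not the exact ones used to build the published dataset:

```python
import random

POLITE = ["Please", "Could you", "Can you"]
CASUAL = ["Hey,", "Hi,", "Um,"]
URGENT = ["ASAP", "urgently", "immediately"]

def vary(seed_query: str, function: str, agent: str) -> dict:
    """Wrap a seed query with random politeness/casualness/urgency markers."""
    query = seed_query
    if random.random() < 0.4:
        query = f"{random.choice(POLITE)} {query[0].lower()}{query[1:]}"
    if random.random() < 0.3:
        query = f"{random.choice(CASUAL)} {query}"
    if random.random() < 0.2:
        query = f"{query} {random.choice(URGENT)}"
    return {"query": query, "function": function, "agent": agent}

example = vary("track my package", "route_to_order_agent", "order_management")
```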

### Training Configuration

```python
from transformers import TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig

# LoRA config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Training args
training_args = TrainingArguments(
    output_dir="./functiongemma-ecommerce-router",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    weight_decay=0.01,
    bf16=True,
    optim="paged_adamw_8bit",
    logging_steps=20,
    eval_strategy="epoch",
    save_strategy="epoch"
)
```
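
These two configs are tied together by the `SFTTrainer` imported above. A wiring sketch, with dataset preparation omitted; exact `SFTTrainer` keyword arguments vary between TRL versions:

```python
# Sketch only: `train_dataset`/`eval_dataset` are assumed to be your formatted
# routing splits; argument names differ across TRL versions.
trainer = SFTTrainer(
    model=base_model,             # the 4-bit quantized base model
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    peft_config=lora_config,      # LoRA adapters are attached automatically
)
trainer.train()
trainer.save_model("./functiongemma-ecommerce-router")
```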

### Training Results

- **Final Training Loss:** 0.0182
- **Final Validation Loss:** 0.0198
- **Training Time:** 45 minutes (T4 GPU)
- **Peak Memory:** 11.2 GB

## Limitations and Biases

### Known Limitations

1. **Ambiguous Queries:** The 10.6% error rate is concentrated in genuinely ambiguous queries
   - Example: "I need help" (could be any agent)
   - Mitigation: Ask a clarifying question when routing confidence falls below 0.7 (see the sketch after this list)

2. **Context Dependency:** Requires conversation state management for multi-turn interactions
   - Solution: Use durable workflow orchestrators (Temporal, Cadence)

3. **Agent Confusion:** Most common misclassifications:
   - Returns ↔ Order Management (12 cases)
   - Account ↔ Payment (8 cases)
   - Technical ↔ Product Details (6 cases)

4. **Language:** Trained only on English queries
   - For multilingual support, fine-tune on translated datasets
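
For the confidence-based clarification in limitation 1, one practical option is to derive a routing confidence from the scores `generate()` can return, and fall back whenever it drops below the suggested 0.7 threshold (which should be tuned on a validation set). A sketch, reusing `extract_agent_function` from "Parsing Function Calls" above:

```python
import torch

def route_with_confidence(model, tokenizer, prompt, threshold=0.7):
    """Return (agent_function, confidence); ask for clarification below threshold."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=30,
            do_sample=False,
            output_scores=True,
            return_dict_in_generate=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Mean probability of each greedily chosen token, as a rough confidence.
    input_len = inputs["input_ids"].shape[1]
    probs = []
    for step, scores in enumerate(out.scores):
        token_id = out.sequences[0, input_len + step]
        probs.append(torch.softmax(scores[0], dim=-1)[token_id].item())
    confidence = sum(probs) / len(probs)
    text = tokenizer.decode(out.sequences[0][input_len:], skip_special_tokens=False)
    agent = extract_agent_function(text)
    if confidence < threshold:
        return "ask_clarifying_question", confidence
    return agent, confidence
```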

### Biases

- **Domain-Specific:** Trained exclusively on e-commerce customer support
- **Synthetic Data:** Generated examples may not capture all real-world variations
- **Agent Distribution:** Balanced training data may not reflect real query distributions

## Ethical Considerations

- **Misrouting Impact:** Incorrect routing may frustrate customers or delay issue resolution
- **Recommendation:** Implement fallback to human agents for low-confidence predictions
- **Privacy:** The model does not store user data; conversation state is managed externally
- **Fairness:** Ensure equal routing performance across user demographics

## Citation

If you use this model in your research or production systems, please cite:

```bibtex
@misc{functiongemma-ecommerce-router,
  author = {Sai Kumar Yava},
  title = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}}
}

@article{functiongemma2025,
  title = {FunctionGemma: Bringing bespoke function calling to the edge},
  author = {Google DeepMind},
  year = {2025},
  url = {https://blog.google/technology/developers/functiongemma/}
}
```

## Acknowledgments

- Google DeepMind for the FunctionGemma base model
- Hugging Face for the PEFT and Transformers libraries
- The open-source AI community

## License

This model inherits the Gemma license from the base model. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

**Commercial Use:** Permitted under the Gemma license terms.

## Related Resources

- **Training Notebook:** [Google Colab](https://colab.research.google.com/github/scionoftech/functiongemma-finetuning-e-commerce/blob/main/FunctionGemma_fine_tuning.ipynb)
- **GitHub Repository:** [Complete code](https://github.com/scionoftech/functiongemma-finetuning-e-commerce)
- **Dataset:** [Training data](https://huggingface.co/datasets/scionoftech/functiongemma-e-commerce-dataset)
- **Base Model:** [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)

## Updates

- **2025-12-25:** Initial release - 89.4% routing accuracy on e-commerce customer support

---

**Questions?** Open an issue on [GitHub](https://github.com/scionoftech/functiongemma-finetuning-e-commerce/issues).