---
language:
- en
license: gemma
library_name: transformers
tags:
- function-calling
- agent-routing
- multi-agent
- lora
- peft
- gemma
- functiongemma
- customer-support
- e-commerce
base_model: google/functiongemma-270m-it
datasets:
- scionoftech/functiongemma-e-commerce-dataset
model-index:
- name: functiongemma-270m-ecommerce-router
  results:
  - task:
      type: text-classification
      name: Agent Routing
    dataset:
      name: E-commerce Customer Support Routing
      type: scionoftech/ecommerce-agent-routing
    metrics:
    - type: accuracy
      value: 89.4
      name: Routing Accuracy
    - type: f1
      value: 89.0
      name: Macro F1 Score
---

# FunctionGemma 270M - E-Commerce Multi-Agent Router

Fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for intelligent routing of customer queries across 7 specialized agents in e-commerce customer support systems.

## Model Description

This model demonstrates how FunctionGemma can be adapted beyond mobile actions for **multi-agent orchestration** in enterprise systems. It intelligently routes natural language customer queries to the appropriate specialized agent with **89.4% accuracy**.

**Key Achievement:** It replaces brittle rule-based routing (52-58% accuracy) with a learned router, using only 1.47M trainable parameters (0.55% of the model).

### Architecture

- **Base Model:** google/functiongemma-270m-it (270M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Trainable Parameters:** 1,474,560 (0.55%)
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Target Modules:** q_proj, k_proj, v_proj, o_proj

### Training Details

- **Training Data:** 12,550 synthetic customer queries (balanced across 7 agents)
- **Training Time:** 45 minutes on Google Colab T4 GPU
- **Framework:** Hugging Face Transformers + PEFT + TRL
- **Quantization:** 4-bit NF4 during training
- **Optimizer:** paged_adamw_8bit
- **Learning Rate:** 2e-4
- **Epochs:** 3
- **Batch Size:** 4 (effective 16 with gradient accumulation)

## Intended Use

### Primary Use Case
**Multi-agent customer support routing** for e-commerce platforms:
- Route queries to order management, product search, product details, returns, payments, account, and technical support agents
- Maintain conversation context across multi-turn interactions
- Enable intelligent task switching

### Supported Agents

The model routes queries to 7 specialized agents:

1. **Order Management** (`route_to_order_agent`) - Track orders, update delivery, cancel orders
2. **Product Search** (`route_to_search_agent`) - Search catalog, check availability, recommendations
3. **Product Details** (`route_to_details_agent`) - Specifications, reviews, comparisons
4. **Returns & Refunds** (`route_to_returns_agent`) - Initiate returns, process refunds, exchanges
5. **Account Management** (`route_to_account_agent`) - Update profile, manage addresses, security
6. **Payment Support** (`route_to_payment_agent`) - Resolve payment issues, update methods, billing
7. **Technical Support** (`route_to_technical_agent`) - Fix app/website issues, login problems

### Out-of-Scope Use

- ❌ General-purpose chatbot (use base Gemma models instead)
- ❌ Direct dialogue generation (this is a routing model)
- ❌ More than 20 agents (context window limitations)
- ❌ Non-customer-support domains without fine-tuning

## Performance

### Test Set Results

```
Overall Accuracy: 89.40% (1,684/1,883 correct)

Per-Agent Performance:
  order_management      92.3%  (251/272)
  product_search        91.1%  (257/282)
  product_details       94.7%  (233/246)
  returns_refunds       88.1%  (238/270)
  account_management    85.1%  (229/269)
  payment_support       89.6%  (241/269)
  technical_support     87.0%  (234/269)
```

### Comparison to Baselines

| Approach | Accuracy | Latency | Memory |
|----------|----------|---------|--------|
| Keyword Matching | 52-58% | 5ms | Negligible |
| Rule-based (100 rules) | 65-70% | 8ms | Negligible |
| BERT Classifier (300M) | 82-85% | 45ms | 400 MB |
| **This Model (LoRA)** | **89.4%** | **127ms** | **2.1 GB** |
| GPT-4 API (zero-shot) | 85-90% | 2500ms | Cloud |

### Latency Breakdown (T4 GPU)

- **Routing Decision:** 127ms average
- **Agent Execution:** ~52ms average
- **Total End-to-End:** ~179ms average

## How to Use

### Installation

```bash
pip install transformers peft torch accelerate bitsandbytes
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Load LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Define available agents
agent_declarations = """<start_function_declaration>
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
<end_function_declaration>"""

# Route a query
query = "Where is my order?"

prompt = f"""<start_of_turn>user
{agent_declarations}

User query: {query}<end_of_turn>
<start_of_turn>model
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
# Output: <function_call>route_to_order_agent</function_call>
```

### Production Deployment (4-bit Quantization)

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# 4-bit quantization config
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load with quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    quantization_config=quant_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

# Result: 180 MB model, 132ms latency, 89.1% accuracy
```

### Parsing Function Calls

```python
import re

def extract_agent_function(response: str) -> str:
    """Extract function name from FunctionGemma output."""
    match = re.search(r'<function_call>([a-zA-Z_]+)</function_call>', response)
    return match.group(1) if match else "unknown"

# Usage
agent = extract_agent_function(response)
print(f"Route to: {agent}")
# Output: Route to: route_to_order_agent
```
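
Once the function name is extracted, it can drive a dispatch table that maps routed names to downstream handlers. The sketch below is illustrative rather than part of this repository: `AGENT_HANDLERS` and the handler bodies are hypothetical placeholders, and the fallback path mirrors the human-escalation recommendation in the Ethical Considerations section. It assumes `extract_agent_function` from the block above.

```python
# Hypothetical dispatch table: routed function name -> agent handler.
# Handler bodies are placeholders; real handlers would call the
# corresponding backend service.
AGENT_HANDLERS = {
    "route_to_order_agent": lambda q: f"[order agent] handling: {q}",
    "route_to_returns_agent": lambda q: f"[returns agent] handling: {q}",
    # ...one entry per supported agent
}

def dispatch(query: str, response: str) -> str:
    """Route a model response to its handler, falling back to a human."""
    agent = extract_agent_function(response)
    handler = AGENT_HANDLERS.get(agent)
    if handler is None:
        return "[fallback] escalate to human agent"
    return handler(query)
```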

## Training Procedure

### Dataset Preparation

Generated 12,550 synthetic examples with linguistic variations:

```python
# Example training format
{
    "query": "Please track my package ASAP",
    "function": "route_to_order_agent",
    "agent": "order_management"
}
```

Variations included:
- Polite forms: "Please", "Could you", "Can you"
- Casual starters: "Hey", "Hi", "Um"
- Urgency markers: "ASAP", "urgently", "immediately"
- Edge cases and ambiguous queries
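
As an illustration of how such variations can be composed, the sketch below wraps base queries with optional markers. The marker lists, probabilities, and `vary` helper are hypothetical assumptions for this example, not the actual generation script:

```python
import random

# Hypothetical marker lists and sampling probabilities, for illustration only.
POLITE = ["Please ", "Could you ", "Can you "]
CASUAL = ["Hey, ", "Hi, ", "Um, "]
URGENT = [" ASAP", " urgently", " immediately"]

def vary(base_query: str, function: str) -> dict:
    """Wrap a base query with optional politeness, casual, and urgency markers."""
    query = base_query
    if random.random() < 0.4:
        query = random.choice(POLITE) + query
    if random.random() < 0.3:
        query = random.choice(CASUAL) + query
    if random.random() < 0.3:
        query += random.choice(URGENT)
    return {"query": query, "function": function}

print(vary("track my package", "route_to_order_agent"))
# e.g. {'query': 'Hey, Please track my package ASAP', 'function': 'route_to_order_agent'}
```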

### Training Configuration

```python
from transformers import TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig

# LoRA config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Training args
training_args = TrainingArguments(
    output_dir="./functiongemma-ecommerce-router",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    weight_decay=0.01,
    bf16=True,
    optim="paged_adamw_8bit",
    logging_steps=20,
    eval_strategy="epoch",
    save_strategy="epoch"
)
```
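
The configs above plug into `SFTTrainer` roughly as follows. This is a minimal sketch: `train_ds` and `eval_ds` stand for the prepared, prompt-formatted train/validation splits, and exact keyword arguments vary across `trl` versions (older releases take `tokenizer=`, newer ones `processing_class=`, and recent versions expect an `SFTConfig` in place of `TrainingArguments`).

```python
# Minimal sketch of wiring the configs into SFTTrainer.
# `model`, `train_ds`, and `eval_ds` are assumed to be loaded/prepared
# beforehand; argument names vary across trl versions.
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    peft_config=lora_config,
)

trainer.train()
trainer.save_model("./functiongemma-ecommerce-router")
```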

### Training Results

- **Final Training Loss:** 0.0182
- **Final Validation Loss:** 0.0198
- **Training Time:** 45 minutes (T4 GPU)
- **Peak Memory:** 11.2 GB

## Limitations and Biases

### Known Limitations

1. **Ambiguous Queries:** Most of the 10.6% overall error rate is concentrated in genuinely ambiguous queries
   - Example: "I need help" (could be any agent)
   - Mitigation: ask a clarifying question instead of routing when confidence falls below a threshold such as 0.7 (a minimal sketch follows this list)

2. **Context Dependency:** Requires conversation state management for multi-turn interactions
   - Solution: Use durable workflow orchestrators (Temporal, Cadence)

3. **Agent Confusion:** Most common misclassifications:
   - Returns ↔ Order Management (12 cases)
   - Account ↔ Payment (8 cases)
   - Technical ↔ Product Details (6 cases)

4. **Language:** Trained only on English queries
   - For multilingual support, fine-tune on translated datasets
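
Below is a minimal sketch of the confidence-based clarification fallback mentioned in limitation 1. It uses greedy decoding and takes the mean probability of the generated tokens as a heuristic, uncalibrated confidence score; `extract_agent_function` is the parser defined above, and the 0.7 threshold is a suggested starting point rather than a tuned value:

```python
import torch

def route_with_confidence(model, tokenizer, prompt: str, threshold: float = 0.7):
    """Greedy-decode a routing call; fall back to clarification when the
    mean generated-token probability drops below the threshold."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=30,
            do_sample=False,
            return_dict_in_generate=True,
            output_scores=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    gen_ids = out.sequences[0][inputs["input_ids"].shape[1]:]
    # Probability the model assigned to each token it actually generated
    probs = [torch.softmax(step[0], dim=-1)[tok].item()
             for step, tok in zip(out.scores, gen_ids)]
    confidence = sum(probs) / max(len(probs), 1)
    if confidence < threshold:
        return "clarify", confidence  # ask the user a follow-up question
    response = tokenizer.decode(gen_ids, skip_special_tokens=False)
    return extract_agent_function(response), confidence
```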

### Biases

- **Domain-Specific:** Trained exclusively on e-commerce customer support
- **Synthetic Data:** Generated examples may not capture all real-world variations
- **Agent Distribution:** Balanced training may not reflect real query distributions

## Ethical Considerations

- **Misrouting Impact:** Incorrect routing may frustrate customers or delay issue resolution
- **Recommendation:** Implement fallback to human agents for low-confidence predictions
- **Privacy:** Model doesn't store user data; conversation state managed externally
- **Fairness:** Ensure equal routing performance across user demographics

## Citation

If you use this model in your research or production systems, please cite:

```bibtex
@misc{functiongemma-ecommerce-router,
  author = {Sai Kumar Yava},
  title = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}},
}

@misc{functiongemma2025,
  title={FunctionGemma: Bringing bespoke function calling to the edge},
  author={{Google DeepMind}},
  year={2025},
  url={https://blog.google/technology/developers/functiongemma/}
}
```

## Acknowledgments

- Google DeepMind for FunctionGemma base model
- Hugging Face for PEFT and Transformers libraries
- The open-source AI community

## License

This model inherits the Gemma license from the base model. See [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

**Commercial Use:** Permitted under Gemma license terms.

## Related Resources

- **Training Notebook:** [Google Colab](https://colab.research.google.com/github/scionoftech/functiongemma-finetuning-e-commerce/blob/main/FunctionGemma_fine_tuning.ipynb)
- **GitHub Repository:** [Complete code](https://github.com/scionoftech/functiongemma-finetuning-e-commerce)
- **Dataset:** [Training data](https://huggingface.co/datasets/scionoftech/functiongemma-e-commerce-dataset)
- **Base Model:** [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)

## Updates

- **2025-12-25:** Initial release - 89.4% routing accuracy on e-commerce customer support

---

**Questions?** Open an issue on [GitHub](https://github.com/scionoftech/functiongemma-finetuning-e-commerce/issues)