---
language:
- en
license: gemma
library_name: transformers
tags:
- function-calling
- agent-routing
- multi-agent
- lora
- peft
- gemma
- functiongemma
- customer-support
- e-commerce
base_model: google/functiongemma-270m-it
datasets:
- scionoftech/functiongemma-e-commerce-dataset
model-index:
- name: functiongemma-270m-ecommerce-router
  results:
  - task:
      type: text-classification
      name: Agent Routing
    dataset:
      name: E-commerce Customer Support Routing
      type: scionoftech/ecommerce-agent-routing
    metrics:
    - type: accuracy
      value: 89.4
      name: Routing Accuracy
    - type: f1
      value: 89.0
      name: Macro F1 Score
---

# FunctionGemma 270M - E-Commerce Multi-Agent Router

Fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for intelligent routing of customer queries across 7 specialized agents in e-commerce customer support systems.

## Model Description

This model demonstrates how FunctionGemma can be adapted beyond mobile actions for **multi-agent orchestration** in enterprise systems. It routes natural-language customer queries to the appropriate specialized agent with **89.4% accuracy**.

**Key Achievement:** Replacing brittle rule-based routing (52-58% accuracy) with a learned router that uses only 1.47M trainable parameters (0.55% of the model).

### Architecture

- **Base Model:** google/functiongemma-270m-it (270M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Trainable Parameters:** 1,474,560 (0.55%)
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Target Modules:** q_proj, k_proj, v_proj, o_proj

### Training Details

- **Training Data:** 12,550 synthetic customer queries (balanced across 7 agents)
- **Training Time:** 45 minutes on a Google Colab T4 GPU
- **Framework:** Hugging Face Transformers + PEFT + TRL
- **Quantization:** 4-bit NF4 during training
- **Optimizer:** paged_adamw_8bit
- **Learning Rate:** 2e-4
- **Epochs:** 3
- **Batch Size:** 4 (effective 16 with gradient accumulation)

## Intended Use

### Primary Use Case

**Multi-agent customer support routing** for e-commerce platforms:

- Route queries to order management, product search, returns, payments, account, and technical support agents
- Maintain conversation context across multi-turn interactions
- Enable intelligent task switching

### Supported Agents

The model routes queries to 7 specialized agents (a dispatch sketch follows the list):

1. **Order Management** (`route_to_order_agent`) - Track orders, update delivery, cancel orders
2. **Product Search** (`route_to_search_agent`) - Search catalog, check availability, recommendations
3. **Product Details** (`route_to_details_agent`) - Specifications, reviews, comparisons
4. **Returns & Refunds** (`route_to_returns_agent`) - Initiate returns, process refunds, exchanges
5. **Account Management** (`route_to_account_agent`) - Update profile, manage addresses, security
6. **Payment Support** (`route_to_payment_agent`) - Resolve payment issues, update methods, billing
7. **Technical Support** (`route_to_technical_agent`) - Fix app/website issues, login problems
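To act on a routing decision, the predicted function name still has to be mapped to a concrete handler. Below is a minimal dispatch-table sketch; the handler functions and their behavior are hypothetical placeholders, not part of this repository:

```python
# Hypothetical handlers -- stand-ins for real agent backends.
def handle_order(query: str) -> str:
    return f"[order agent] {query}"

def handle_search(query: str) -> str:
    return f"[search agent] {query}"

# Map each function name the router can emit to its handler.
AGENT_HANDLERS = {
    "route_to_order_agent": handle_order,
    "route_to_search_agent": handle_search,
    # ...one entry per remaining agent...
}

def dispatch(function_name: str, query: str) -> str:
    handler = AGENT_HANDLERS.get(function_name)
    if handler is None:
        # Unknown or malformed routing output: escalate rather than guess.
        return "[fallback] escalating to a human agent"
    return handler(query)
```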
### Out-of-Scope Use

- ❌ General-purpose chatbot (use base Gemma models instead)
- ❌ Direct dialogue generation (this is a routing model)
- ❌ More than 20 agents (context window limitations)
- ❌ Non-customer-support domains without fine-tuning

## Performance

### Test Set Results

```
Overall Accuracy: 89.40% (1,684/1,883 correct)

Per-Agent Performance:
  order_management     92.3% (251/272)
  product_search       91.1% (257/282)
  product_details      94.7% (233/246)
  returns_refunds      88.2% (238/270)
  account_management   85.1% (229/269)
  payment_support      89.5% (241/269)
  technical_support    87.0% (234/269)
```

### Comparison to Baselines

| Approach | Accuracy | Latency | Memory |
|----------|----------|---------|--------|
| Keyword Matching | 52-58% | 5ms | Negligible |
| Rule-based (100 rules) | 65-70% | 8ms | Negligible |
| BERT Classifier (300M) | 82-85% | 45ms | 400 MB |
| **This Model (LoRA)** | **89.4%** | **127ms** | **2.1 GB** |
| GPT-4 API (zero-shot) | 85-90% | 2500ms | Cloud |

### Latency Breakdown (T4 GPU)

- **Routing Decision:** 127ms average
- **Agent Execution:** ~52ms average
- **Total End-to-End:** ~179ms average

## How to Use

### Installation

```bash
pip install transformers peft torch accelerate bitsandbytes
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Load LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)
tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Define available agents
agent_declarations = """
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
"""

# Route a query (Gemma turn format)
query = "Where is my order?"
prompt = f"""<start_of_turn>user
{agent_declarations}

User query: {query}<end_of_turn>
<start_of_turn>model
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(
    outputs[0][inputs['input_ids'].shape[1]:],
    skip_special_tokens=False
)
print(response)
# Output: route_to_order_agent
```
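If the deployment never needs to swap adapters at runtime, the LoRA weights can be folded into the bf16 base model with PEFT's `merge_and_unload()`, which removes the adapter computation from every forward pass. A minimal sketch continuing from the Quick Start snippet above; the `./merged-router` output path is an arbitrary example:

```python
# Fold the LoRA deltas into the base weights; the returned model is a
# plain transformers model with no PEFT wrapper.
merged_model = model.merge_and_unload()

# Optionally save the merged weights so they can be loaded standalone
# with AutoModelForCausalLM.from_pretrained later.
merged_model.save_pretrained("./merged-router")
tokenizer.save_pretrained("./merged-router")
```

Merging is simplest with the full-precision (bf16) load shown in Quick Start; merging into a quantized base model, as in the deployment path below, is more involved.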
prompt = f"""user {agent_declarations} User query: {query} model """ inputs = tokenizer(prompt, return_tensors="pt").to(model.device) with torch.no_grad(): outputs = model.generate( **inputs, max_new_tokens=30, do_sample=False, pad_token_id=tokenizer.eos_token_id ) response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False) print(response) # Output: route_to_order_agent ``` ### Production Deployment (4-bit Quantization) ```python from transformers import AutoModelForCausalLM, BitsAndBytesConfig from peft import PeftModel # 4-bit quantization config quant_config = BitsAndBytesConfig( load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16 ) # Load with quantization base_model = AutoModelForCausalLM.from_pretrained( "google/functiongemma-270m-it", quantization_config=quant_config, device_map="auto" ) model = PeftModel.from_pretrained( base_model, "scionoftech/functiongemma-270m-ecommerce-router" ) # Result: 180 MB model, 132ms latency, 89.1% accuracy ``` ### Parsing Function Calls ```python import re def extract_agent_function(response: str) -> str: """Extract function name from FunctionGemma output.""" match = re.search(r'([a-zA-Z_]+)', response) return match.group(1) if match else "unknown" # Usage agent = extract_agent_function(response) print(f"Route to: {agent}") # Output: Route to: route_to_order_agent ``` ## Training Procedure ### Dataset Preparation Generated 12,550 synthetic examples with linguistic variations: ```python # Example training format { "query": "Please track my package ASAP", "function": "route_to_order_agent", "agent": "order_management" } ``` Variations included: - Polite forms: "Please", "Could you", "Can you" - Casual starters: "Hey", "Hi", "Um" - Urgency markers: "ASAP", "urgently", "immediately" - Edge cases and ambiguous queries ### Training Configuration ```python from transformers import TrainingArguments from trl import SFTTrainer from peft import LoraConfig # LoRA config lora_config = LoraConfig( r=16, lora_alpha=32, target_modules=["q_proj", "k_proj", "v_proj", "o_proj"], lora_dropout=0.05, bias="none", task_type="CAUSAL_LM" ) # Training args training_args = TrainingArguments( output_dir="./functiongemma-ecommerce-router", num_train_epochs=3, per_device_train_batch_size=4, gradient_accumulation_steps=4, learning_rate=2e-4, lr_scheduler_type="cosine", warmup_ratio=0.1, weight_decay=0.01, bf16=True, optim="paged_adamw_8bit", logging_steps=20, eval_strategy="epoch", save_strategy="epoch" ) ``` ### Training Results - **Final Training Loss:** 0.0182 - **Final Validation Loss:** 0.0198 - **Training Time:** 45 minutes (T4 GPU) - **Peak Memory:** 11.2 GB ## Limitations and Biases ### Known Limitations 1. **Ambiguous Queries:** 10.6% error rate concentrated in genuinely ambiguous queries - Example: "I need help" (could be any agent) - Mitigation: Implement confidence-based clarification (confidence < 0.7) 2. **Context Dependency:** Requires conversation state management for multi-turn interactions - Solution: Use durable workflow orchestrators (Temporal, Cadence) 3. **Agent Confusion:** Most common misclassifications: - Returns ↔ Order Management (12 cases) - Account ↔ Payment (8 cases) - Technical ↔ Product Details (6 cases) 4. 
### Biases

- **Domain-Specific:** Trained exclusively on e-commerce customer support
- **Synthetic Data:** Generated examples may not capture all real-world variations
- **Agent Distribution:** Balanced training data may not reflect real query distributions

## Ethical Considerations

- **Misrouting Impact:** Incorrect routing may frustrate customers or delay issue resolution
- **Recommendation:** Implement fallback to human agents for low-confidence predictions
- **Privacy:** The model doesn't store user data; conversation state is managed externally
- **Fairness:** Ensure equal routing performance across user demographics

## Citation

If you use this model in your research or production systems, please cite:

```bibtex
@misc{functiongemma-ecommerce-router,
  author       = {Sai Kumar Yava},
  title        = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}},
}

@misc{functiongemma2025,
  title  = {FunctionGemma: Bringing bespoke function calling to the edge},
  author = {Google DeepMind},
  year   = {2025},
  url    = {https://blog.google/technology/developers/functiongemma/}
}
```

## Acknowledgments

- Google DeepMind for the FunctionGemma base model
- Hugging Face for the PEFT and Transformers libraries
- The open-source AI community

## License

This model inherits the Gemma license from the base model. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

**Commercial Use:** Permitted under the Gemma license terms.

## Related Resources

- **Training Notebook:** [Google Colab](https://colab.research.google.com/github/scionoftech/functiongemma-finetuning-e-commerce/blob/main/FunctionGemma_fine_tuning.ipynb)
- **GitHub Repository:** [Complete code](https://github.com/scionoftech/functiongemma-finetuning-e-commerce)
- **Dataset:** [Training data](https://huggingface.co/datasets/scionoftech/functiongemma-e-commerce-dataset)
- **Base Model:** [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)

## Updates

- **2025-12-25:** Initial release - 89.4% routing accuracy on e-commerce customer support

---

**Questions?** Open an issue on [GitHub](https://github.com/scionoftech/functiongemma-finetuning-e-commerce/issues).