---
base_model: google/functiongemma-270m-it
tags:
- function-calling
- mobile-actions
- gemma
library_name: transformers
datasets:
- google/mobile-actions
language:
- en
license: gemma
---
# FunctionGemma 270M for Mobile Actions
This model is a fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) specialized for mobile assistant actions. It has been trained on the [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions) dataset to perform structured function calling for common mobile device tasks.
## Model Description
**Base Model**: `google/functiongemma-270m-it` - A 270M parameter instruction-tuned model from Google's FunctionGemma family, designed for function calling tasks.
**Specialization**: Mobile assistant actions including:
- Calendar event management
- Email composition and sending
- Contact creation
- Flashlight control
- Wi-Fi settings navigation
- Map location display
**Training Objective**: The model learns to emit structured function calls in the format `call:<function_name>{arg1:value1,arg2:value2,...}` instead of natural language responses.
## Supported Functions
The model is optimized to call these mobile action functions:
1. **`turn_on_flashlight()`** - Turns the device flashlight on
2. **`turn_off_flashlight()`** - Turns the device flashlight off
3. **`create_contact(first_name, last_name, phone_number?, email?)`** - Creates a new contact
4. **`send_email(to, subject, body?)`** - Sends an email to a recipient
5. **`show_map(query)`** - Displays a location on the map by name, business, or address
6. **`open_wifi_settings()`** - Opens the Wi-Fi settings screen
7. **`create_calendar_event(title, datetime)`** - Creates a calendar event (datetime in ISO format: `YYYY-MM-DDTHH:MM:SS`)
## Training Details
### Training Data
- **Dataset**: [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions)
- **Format**: JSONL with prompt-completion pairs
- **Splits**:
- Training set: examples with `"metadata": "train"`
- Evaluation set: examples with `"metadata": "eval"`
- **Preprocessing**: Converted to TRL prompt-completion format with `completion_only_loss=True`
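As a minimal illustration of the split step (with hypothetical example records; the real dataset's contents differ), the JSONL lines can be partitioned by their `metadata` tag and reduced to the prompt-completion fields TRL consumes:

```python
import json

# Two illustrative JSONL records mirroring the structure described above
# (prompt/completion pairs tagged with a "metadata" split marker).
jsonl_lines = [
    '{"prompt": "Turn on the flashlight", "completion": "call:turn_on_flashlight{}", "metadata": "train"}',
    '{"prompt": "Open the Wi-Fi settings", "completion": "call:open_wifi_settings{}", "metadata": "eval"}',
]
records = [json.loads(line) for line in jsonl_lines]

# Split by the "metadata" tag, keeping only the prompt/completion fields
# (the format SFTTrainer consumes with completion_only_loss=True).
train_set = [{"prompt": r["prompt"], "completion": r["completion"]}
             for r in records if r["metadata"] == "train"]
eval_set = [{"prompt": r["prompt"], "completion": r["completion"]}
            for r in records if r["metadata"] == "eval"]
```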
### Training Procedure
Fine-tuned using Hugging Face [TRL (Transformer Reinforcement Learning)](https://huggingface.co/docs/trl) with the `SFTTrainer`.
**Training Configuration**:
- **Epochs**: 4
- **Batch size**: 8 per device
- **Gradient accumulation steps**: 4
- **Learning rate**: 5e-5
- **Scheduler**: Cosine
- **Max sequence length**: 997 tokens (based on longest example: 897 tokens)
- **Optimizer**: AdamW (fused)
- **Precision**: bfloat16
- **Gradient checkpointing**: Enabled
- **Completion only loss**: True (trains only on model outputs, not prompts)
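These hyperparameters map onto TRL roughly as follows. This is a sketch, not the notebook's exact code; the dataset loading and split filtering are assumed, and argument names follow recent TRL releases:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

model_id = "google/functiongemma-270m-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Assumed loading path; the notebook filters train/eval by the "metadata" tag.
dataset = load_dataset("google/mobile-actions", split="train")
train_ds = dataset.filter(lambda ex: ex["metadata"] == "train")
eval_ds = dataset.filter(lambda ex: ex["metadata"] == "eval")

config = SFTConfig(
    output_dir="functiongemma-mobile-actions",
    num_train_epochs=4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_length=997,
    optim="adamw_torch_fused",
    bf16=True,
    gradient_checkpointing=True,
    completion_only_loss=True,
)

trainer = SFTTrainer(
    model=model,
    args=config,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    processing_class=tokenizer,
)
trainer.train()
```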
**Training Infrastructure**:
- **Hardware**: Google Colab A100 GPU
- **Training time**: ~60 minutes for 4 epochs
- **Library versions**: transformers==4.57.1, trl==0.25.1, datasets==4.4.1
### Training Results
Final metrics after 4 epochs:
| Step | Training Loss | Validation Loss | Mean Token Accuracy |
|------|---------------|-----------------|---------------------|
| 500 | 0.008800 | 0.013452 | 0.996691 |
The model achieved 99.67% token-level accuracy on the validation set, a substantial improvement over the base model on these mobile actions.
## Intended Use
This model is designed for:
- **Mobile AI assistants** that need to execute device actions based on user requests
- **Voice-controlled mobile applications**
- **Conversational agents** that interact with mobile device features
- **On-device AI** applications (can be converted to `.litertlm` format for deployment)
## How to Use
### Basic Inference
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
# Load model and tokenizer
model_id = "jprtr/google_mobile_actions"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
device_map="auto",
attn_implementation="eager",
torch_dtype="auto",
)
# Create pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
# Define the tools (function schemas)
tools = [
{
"function": {
"name": "create_calendar_event",
"description": "Creates a new calendar event.",
"parameters": {
"type": "OBJECT",
"properties": {
"title": {"type": "STRING", "description": "The title of the event."},
"datetime": {"type": "STRING", "description": "The date and time in YYYY-MM-DDTHH:MM:SS format."},
},
"required": ["title", "datetime"],
},
}
},
{
"function": {
"name": "send_email",
"description": "Sends an email.",
"parameters": {
"type": "OBJECT",
"properties": {
"to": {"type": "STRING", "description": "The recipient email address."},
"subject": {"type": "STRING", "description": "The email subject."},
"body": {"type": "STRING", "description": "The email body."},
},
"required": ["to", "subject"],
},
}
},
# ... add other function definitions
]
# Create messages
messages = [
{
"role": "developer",
"content": (
"Current date and time given in YYYY-MM-DDTHH:MM:SS format: 2025-07-10T19:06:29\n"
"Day of week is Thursday\n"
"You are a model that can do function calling with the following functions\n"
),
},
{
"role": "user",
"content": 'Schedule a "team meeting" tomorrow at 4pm.',
},
]
# Apply chat template
prompt = tokenizer.apply_chat_template(
messages,
tools=tools,
tokenize=False,
add_generation_prompt=True,
)
# Generate
output = pipe(prompt, max_new_tokens=200)[0]["generated_text"][len(prompt):].strip()
print("Model output:", output)
# Example output: call:create_calendar_event{datetime:2025-07-11T16:00:00,title:team meeting}
```
### Parsing Function Calls
The model outputs function calls in a simple format:
```
call:<function_name>{arg1:value1,arg2:value2,...}
```
For multiple function calls, they appear sequentially:
```
call:create_calendar_event{datetime:2025-07-15T10:30:00,title:Dental Checkup}
call:send_email{to:user@example.com,subject:Appointment,body:See you there!}
```
You can parse these by:
1. Splitting on `call:` to identify individual function calls
2. Extracting the function name (text before `{`)
3. Parsing the arguments block (content within `{}`)
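A minimal parser along those lines (assuming argument values contain no commas or braces, which holds for most of the functions above but could break on free-form `body` text):

```python
import re

def parse_function_calls(output: str):
    """Parse 'call:<name>{k1:v1,k2:v2,...}' strings into (name, args) pairs."""
    calls = []
    # Each call is 'call:' + function name + one {...} argument block.
    for match in re.finditer(r"call:(\w+)\{([^}]*)\}", output):
        name, arg_block = match.group(1), match.group(2)
        args = {}
        if arg_block:
            for pair in arg_block.split(","):
                # Partition on the first ':' only, so values such as
                # ISO datetimes (which contain colons) survive intact.
                key, _, value = pair.partition(":")
                args[key.strip()] = value.strip()
        calls.append((name, args))
    return calls

calls = parse_function_calls(
    "call:create_calendar_event{datetime:2025-07-11T16:00:00,title:team meeting}"
)
# → [("create_calendar_event", {"datetime": "2025-07-11T16:00:00", "title": "team meeting"})]
```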
## Evaluation
The model was evaluated on the held-out test set from the mobile-actions dataset. Evaluation metrics compare exact string matching of the model's function call outputs against ground truth labels.
**Key Observations**:
- The base FunctionGemma 270M model often fails to call appropriate functions for mobile actions
- After fine-tuning, the model reliably generates correct function calls with proper argument formatting
- Token-level accuracy on the validation set: **99.67%**
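The exact-string-match comparison described above amounts to something like the following (a sketch; the notebook's actual evaluation harness may differ):

```python
def exact_match_accuracy(predictions, references):
    """Fraction of examples whose generated call string exactly matches the label."""
    matches = sum(pred.strip() == ref.strip()
                  for pred, ref in zip(predictions, references))
    return matches / len(references)

# Illustrative comparison: one correct call, one wrong function chosen.
score = exact_match_accuracy(
    ["call:turn_on_flashlight{}", "call:open_wifi_settings{}"],
    ["call:turn_on_flashlight{}", "call:show_map{query:cafe}"],
)
# → 0.5
```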
## Limitations
- The model is specialized for the 7 mobile action functions listed above and may not generalize well to other function calling tasks
- Date/time parsing relies on context provided in the developer message (current date/time must be specified)
- The model outputs may occasionally include variations in argument formatting that are semantically correct but don't exactly match the expected format
- This is a 270M parameter model, so while efficient for mobile deployment, it may have lower accuracy than larger models
## On-Device Deployment
The model can be converted to `.litertlm` format for on-device deployment using `ai-edge-torch`. See the [training notebook](https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/FunctionGemma/%5BFunctionGemma%5DFinetune_FunctionGemma_270M_for_Mobile_Actions_with_Hugging_Face.ipynb) for conversion instructions.
The converted model can be deployed on:
- Android devices via [Google AI Edge](https://ai.google.dev/edge)
- [AI Edge Gallery app](https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery)
## Training Notebook
For full training details, hyperparameter tuning, and evaluation, see the original Colab notebook:
[Finetune FunctionGemma 270M for Mobile Actions](https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/FunctionGemma/%5BFunctionGemma%5DFinetune_FunctionGemma_270M_for_Mobile_Actions_with_Hugging_Face.ipynb)
## Citation
If you use this model, please cite the original FunctionGemma paper and the Google Mobile Actions dataset:
```bibtex
@misc{functiongemma2024,
title={FunctionGemma: Function Calling for Gemma Models},
author={Google},
year={2024},
url={https://huggingface.co/google/functiongemma-270m-it}
}
```
## License
This model is released under the Gemma license. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms) for details.