---
license: gemma
library_name: transformers
tags:
- function-calling
- tool-use
- mobile
- gemma
- unsloth
- fine-tuned
base_model: google/gemma-3-1b-it
datasets:
- google/mobile-actions
pipeline_tag: text-generation
language:
- en
---
# FunctionGemma Mobile Actions v5
A fine-tuned version of [FunctionGemma 270M](https://huggingface.co/google/gemma-3-1b-it) optimized for mobile device function calling. This model excels at understanding natural language commands and mapping them to structured function calls for common mobile actions.
## Model Description
- **Base Model:** google/gemma-3-1b-it (270M parameters)
- **Fine-tuning Method:** LoRA (r=128, alpha=128)
- **Training Data:** [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions) + synthetic augmentation
- **Optimized For:** Mobile assistant function calling
## Supported Functions
| Function | Description | Example Input |
|----------|-------------|---------------|
| `set_alarm` | Set alarms | "Wake me up at 7am" |
| `create_reminder` | Create reminders | "Remind me to buy milk" |
| `set_timer` | Set countdown timers | "Timer for 10 minutes" |
| `make_call` | Make phone calls | "Call Mom" |
| `send_message` | Send text messages | "Text John I'm running late" |
| `create_calendar_event` | Schedule events | "Schedule meeting at 3pm" |
| `play_music` | Play music | "Play some jazz" |
| `get_weather` | Get weather info | "What's the weather like?" |
| `open_app` | Open applications | "Open the camera" |
| `navigate` | Get directions | "Navigate to the airport" |
| `set_volume` | Adjust volume | "Turn the volume up" |
| `calculator` | Math calculations | "What's 15 times 23?" |
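In an application, parsed calls for these functions can be routed through a simple dispatch table keyed on function name. The handlers below are hypothetical placeholders for illustration; they are not part of this model or the dataset:

```python
# Hypothetical handlers -- names and return values are illustrative only.
def set_alarm(datetime: str, title: str = "") -> str:
    return f"Alarm set for {datetime}" + (f" ({title})" if title else "")

def set_timer(duration: str) -> str:
    return f"Timer started for {duration}"

# Dispatch table mapping model function names to app handlers.
HANDLERS = {
    "set_alarm": set_alarm,
    "set_timer": set_timer,
    # ... register the remaining functions the same way
}

def dispatch(name: str, arguments: dict) -> str:
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Unknown function: {name}")
    return handler(**arguments)

print(dispatch("set_alarm", {"datetime": "7am"}))  # Alarm set for 7am
```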
## Usage with vLLM
### Installation
```bash
pip install vllm
```
### Basic Inference
```python
from vllm import LLM, SamplingParams
from datetime import datetime
# Load model
llm = LLM(
    model="essobi/functiongemma-mobile-actions-v5-16bit",
    trust_remote_code=True,
    max_model_len=4096,
)

# Define available tools
tools = [
    {
        "function": {
            "name": "set_alarm",
            "description": "Sets an alarm for a specific time.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "datetime": {"type": "STRING", "description": "The time for the alarm."},
                    "title": {"type": "STRING", "description": "Optional label for the alarm."},
                },
                "required": ["datetime"],
            },
        }
    },
    {
        "function": {
            "name": "create_reminder",
            "description": "Creates a reminder with text and optional time.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "body": {"type": "STRING", "description": "The reminder text."},
                    "datetime": {"type": "STRING", "description": "When to remind."},
                },
                "required": ["body"],
            },
        }
    },
    {
        "function": {
            "name": "send_message",
            "description": "Sends a text message to a contact.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "to": {"type": "STRING", "description": "Contact name or phone number."},
                    "body": {"type": "STRING", "description": "Message content."},
                },
                "required": ["to", "body"],
            },
        }
    },
    # Add more tools as needed...
]

# Build the prompt using the training format
def build_prompt(user_input: str, tools: list) -> str:
    now = datetime.now()
    dt_str = now.strftime("%Y-%m-%dT%H:%M:%S")
    day = now.strftime("%A")

    # Build the function declarations block
    func_decls = ""
    for tool in tools:
        func = tool["function"]
        props = func["parameters"].get("properties", {})
        required = func["parameters"].get("required", [])
        props_str = ""
        for pname, pinfo in props.items():
            desc = pinfo.get("description", "")
            ptype = pinfo.get("type", "STRING")
            props_str += f"{pname}:{{description:<escape>{desc}<escape>,type:<escape>{ptype}<escape>}},"
        props_str = props_str.rstrip(",")
        req_str = ",".join([f"<escape>{r}<escape>" for r in required])
        func_decls += (
            f"<start_function_declaration>declaration:{func['name']}"
            f"{{description:<escape>{func['description']}<escape>,"
            f"parameters:{{properties:{{{props_str}}},required:[{req_str}],type:<escape>OBJECT<escape>}}}}"
            "<end_function_declaration>"
        )

    return f"""<start_of_turn>developer
Current date and time given in YYYY-MM-DDTHH:MM:SS format: {dt_str}
Day of week is {day}
You are a model that can do function calling with the following functions{func_decls}<end_of_turn>
<start_of_turn>user
{user_input}<end_of_turn>
<start_of_turn>model
"""
# Generate
prompt = build_prompt("Set an alarm for 7am tomorrow", tools)
sampling_params = SamplingParams(temperature=0.1, max_tokens=150)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
# Output: <start_function_call>call:set_alarm{datetime:<escape>7am tomorrow<escape>}<end_function_call>
```
### vLLM OpenAI-Compatible Server
```bash
# Start the server
python -m vllm.entrypoints.openai.api_server \
    --model essobi/functiongemma-mobile-actions-v5-16bit \
    --port 8000
```
```python
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
response = client.chat.completions.create(
    model="essobi/functiongemma-mobile-actions-v5-16bit",
    messages=[
        {"role": "user", "content": "Remind me to call the dentist tomorrow"}
    ],
    max_tokens=150,
    temperature=0.1,
)
print(response.choices[0].message.content)
```
## Usage with Transformers
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "essobi/functiongemma-mobile-actions-v5-16bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Use the same prompt building function as above
prompt = build_prompt("What's the weather like?", tools)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, temperature=0.1, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(response)
```
## Output Format
The model outputs function calls in this format:
```
<start_function_call>call:function_name{param1:<escape>value1<escape>,param2:<escape>value2<escape>}<end_function_call>
```
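The `<escape>` markers delimit string values. When building few-shot examples or test fixtures, the same format can be produced with a small helper. This is a sketch derived from the format shown above, not an official utility:

```python
def format_function_call(name: str, arguments: dict) -> str:
    """Serialize a call into the model's function-call output format."""
    args_str = ",".join(
        f"{key}:<escape>{value}<escape>" for key, value in arguments.items()
    )
    return f"<start_function_call>call:{name}{{{args_str}}}<end_function_call>"

print(format_function_call("set_alarm", {"datetime": "7am"}))
# <start_function_call>call:set_alarm{datetime:<escape>7am<escape>}<end_function_call>
```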
### Parsing Function Calls
```python
import re
def parse_function_call(text: str) -> dict | None:
    """Parse a function call from model output."""
    match = re.search(
        r'<start_function_call>call:(\w+)\{([^}]*)\}<end_function_call>',
        text
    )
    if not match:
        return None
    func_name = match.group(1)
    args_str = match.group(2)

    # Parse arguments
    args = {}
    for param_match in re.finditer(r'(\w+):<escape>([^<]*)<escape>', args_str):
        args[param_match.group(1)] = param_match.group(2)
    return {"name": func_name, "arguments": args}
# Example
output = "<start_function_call>call:set_alarm{datetime:<escape>7am<escape>,title:<escape>Wake up<escape>}<end_function_call>"
parsed = parse_function_call(output)
print(parsed)
# {'name': 'set_alarm', 'arguments': {'datetime': '7am', 'title': 'Wake up'}}
```
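Before executing a parsed call, it can be checked against the declared tool schemas. A minimal sketch, assuming the tool-declaration structure from the vLLM example above; the error strings are illustrative:

```python
def validate_call(call: dict, tools: list) -> list:
    """Return a list of validation errors; an empty list means the call looks valid."""
    schemas = {t["function"]["name"]: t["function"]["parameters"] for t in tools}
    schema = schemas.get(call["name"])
    if schema is None:
        return [f"unknown function: {call['name']}"]
    errors = []
    # Every required parameter must be present...
    for name in schema.get("required", []):
        if name not in call["arguments"]:
            errors.append(f"missing required parameter: {name}")
    # ...and every supplied parameter must be declared.
    for name in call["arguments"]:
        if name not in schema.get("properties", {}):
            errors.append(f"unexpected parameter: {name}")
    return errors

# Tiny inline schema so the example is self-contained
tools = [{"function": {"name": "set_alarm", "parameters": {
    "properties": {"datetime": {}, "title": {}},
    "required": ["datetime"],
}}}]

print(validate_call({"name": "set_alarm", "arguments": {"datetime": "7am"}}, tools))  # []
```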
## Training Details
- **Hardware:** 8x Tesla V100-SXM2-32GB
- **Training Time:** ~48 minutes
- **Epochs:** 3
- **Batch Size:** 64 effective (4 per device × 2 grad accum × 8 GPUs)
- **Learning Rate:** 1e-5 with linear schedule
- **Gradient Clipping:** max_grad_norm=1.0
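As a rough sketch, the LoRA hyperparameters above would correspond to a PEFT adapter configuration like the following. This is a hypothetical reconstruction, not the author's training script; in particular, `target_modules` is an assumption, since the card does not state which layers were adapted:

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter config from the table above.
# r and lora_alpha come from the card; target_modules is an assumption.
lora_config = LoraConfig(
    r=128,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```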
## Limitations
- Optimized for English only
- Best for single-turn function calling (not multi-turn conversations)
- May struggle with highly ambiguous requests
- Calendar vs Reminder distinction can be tricky for edge cases
## License
This model is released under the [Gemma License](https://ai.google.dev/gemma/terms).
## Acknowledgments
- Google for the [Gemma](https://ai.google.dev/gemma) model family and [mobile-actions](https://huggingface.co/datasets/google/mobile-actions) dataset
- [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning tools