---
license: gemma
library_name: transformers
tags:
  - function-calling
  - tool-use
  - mobile
  - gemma
  - unsloth
  - fine-tuned
base_model: google/gemma-3-1b-it
datasets:
  - google/mobile-actions
pipeline_tag: text-generation
language:
  - en
---

# FunctionGemma Mobile Actions v5

A fine-tuned version of [FunctionGemma 270M](https://huggingface.co/google/gemma-3-1b-it) optimized for function calling on mobile devices. The model maps natural-language commands to structured function calls for common mobile actions.

## Model Description

- **Base Model:** google/gemma-3-1b-it (270M parameters)
- **Fine-tuning Method:** LoRA (r=128, alpha=128)
- **Training Data:** [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions) + synthetic augmentation
- **Optimized For:** Mobile assistant function calling

## Supported Functions

| Function | Description | Example Input |
|----------|-------------|---------------|
| `set_alarm` | Set alarms | "Wake me up at 7am" |
| `create_reminder` | Create reminders | "Remind me to buy milk" |
| `set_timer` | Set countdown timers | "Timer for 10 minutes" |
| `make_call` | Make phone calls | "Call Mom" |
| `send_message` | Send text messages | "Text John I'm running late" |
| `create_calendar_event` | Schedule events | "Schedule meeting at 3pm" |
| `play_music` | Play music | "Play some jazz" |
| `get_weather` | Get weather info | "What's the weather like?" |
| `open_app` | Open applications | "Open the camera" |
| `navigate` | Get directions | "Navigate to the airport" |
| `set_volume` | Adjust volume | "Turn the volume up" |
| `calculator` | Math calculations | "What's 15 times 23?" |

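For example, the `send_message` entry above corresponds to an exchange like the following (the model's output follows the format described in the Output Format section below):

```
User:  Text John I'm running late
Model: <start_function_call>call:send_message{to:<escape>John<escape>,body:<escape>I'm running late<escape>}<end_function_call>
```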
## Usage with vLLM

### Installation

```bash
pip install vllm
```

### Basic Inference

```python
from vllm import LLM, SamplingParams
from datetime import datetime

# Load model
llm = LLM(
    model="essobi/functiongemma-mobile-actions-v5-16bit",
    trust_remote_code=True,
    max_model_len=4096,
)

# Define available tools
tools = [
    {
        "function": {
            "name": "set_alarm",
            "description": "Sets an alarm for a specific time.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "datetime": {"type": "STRING", "description": "The time for the alarm."},
                    "title": {"type": "STRING", "description": "Optional label for the alarm."},
                },
                "required": ["datetime"]
            }
        }
    },
    {
        "function": {
            "name": "create_reminder",
            "description": "Creates a reminder with text and optional time.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "body": {"type": "STRING", "description": "The reminder text."},
                    "datetime": {"type": "STRING", "description": "When to remind."},
                },
                "required": ["body"]
            }
        }
    },
    {
        "function": {
            "name": "send_message",
            "description": "Sends a text message to a contact.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "to": {"type": "STRING", "description": "Contact name or phone number."},
                    "body": {"type": "STRING", "description": "Message content."},
                },
                "required": ["to", "body"]
            }
        }
    },
    # Add more tools as needed...
]

# Build prompt using the training format
def build_prompt(user_input: str, tools: list) -> str:
    now = datetime.now()
    dt_str = now.strftime("%Y-%m-%dT%H:%M:%S")
    day = now.strftime("%A")
    
    # Build function declarations
    func_decls = ""
    for tool in tools:
        func = tool["function"]
        props = func["parameters"].get("properties", {})
        required = func["parameters"].get("required", [])
        
        props_str = ""
        for pname, pinfo in props.items():
            desc = pinfo.get("description", "")
            ptype = pinfo.get("type", "STRING")
            props_str += f"{pname}:{{description:<escape>{desc}<escape>,type:<escape>{ptype}<escape>}},"
        props_str = props_str.rstrip(",")
        
        req_str = ",".join([f"<escape>{r}<escape>" for r in required])
        
        func_decls += f"<start_function_declaration>declaration:{func['name']}{{description:<escape>{func['description']}<escape>,parameters:{{properties:{{{props_str}}},required:[{req_str}],type:<escape>OBJECT<escape>}}}}<end_function_declaration>"
    
    return f"""<start_of_turn>developer
Current date and time given in YYYY-MM-DDTHH:MM:SS format: {dt_str}
Day of week is {day}
You are a model that can do function calling with the following functions{func_decls}<end_of_turn>
<start_of_turn>user
{user_input}<end_of_turn>
<start_of_turn>model
"""

# Generate
prompt = build_prompt("Set an alarm for 7am tomorrow", tools)
sampling_params = SamplingParams(temperature=0.1, max_tokens=150)
outputs = llm.generate([prompt], sampling_params)

print(outputs[0].outputs[0].text)
# Output: <start_function_call>call:set_alarm{datetime:<escape>7am tomorrow<escape>}<end_function_call>
```
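For reference, running `build_prompt` renders each tool into a single declaration string. For the `set_alarm` tool above, the declaration embedded in the developer turn looks like this (shown on one line):

```
<start_function_declaration>declaration:set_alarm{description:<escape>Sets an alarm for a specific time.<escape>,parameters:{properties:{datetime:{description:<escape>The time for the alarm.<escape>,type:<escape>STRING<escape>},title:{description:<escape>Optional label for the alarm.<escape>,type:<escape>STRING<escape>}},required:[<escape>datetime<escape>],type:<escape>OBJECT<escape>}}<end_function_declaration>
```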

### vLLM OpenAI-Compatible Server

```bash
# Start the server
python -m vllm.entrypoints.openai.api_server \
    --model essobi/functiongemma-mobile-actions-v5-16bit \
    --port 8000
```

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

response = client.chat.completions.create(
    model="essobi/functiongemma-mobile-actions-v5-16bit",
    messages=[
        {"role": "user", "content": "Remind me to call the dentist tomorrow"}
    ],
    max_tokens=150,
    temperature=0.1,
)

print(response.choices[0].message.content)
```

## Usage with Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "essobi/functiongemma-mobile-actions-v5-16bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Use the same prompt building function as above
prompt = build_prompt("What's the weather like?", tools)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, temperature=0.1, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=False)

print(response)
```

## Output Format

The model outputs function calls in this format:

```
<start_function_call>call:function_name{param1:<escape>value1<escape>,param2:<escape>value2<escape>}<end_function_call>
```

### Parsing Function Calls

```python
import re

def parse_function_call(text: str) -> dict | None:
    """Parse function call from model output."""
    match = re.search(
        r'<start_function_call>call:(\w+)\{([^}]*)\}<end_function_call>', 
        text
    )
    if not match:
        return None
    
    func_name = match.group(1)
    args_str = match.group(2)
    
    # Parse arguments
    args = {}
    for param_match in re.finditer(r'(\w+):<escape>([^<]*)<escape>', args_str):
        args[param_match.group(1)] = param_match.group(2)
    
    return {"name": func_name, "arguments": args}

# Example
output = "<start_function_call>call:set_alarm{datetime:<escape>7am<escape>,title:<escape>Wake up<escape>}<end_function_call>"
parsed = parse_function_call(output)
print(parsed)
# {'name': 'set_alarm', 'arguments': {'datetime': '7am', 'title': 'Wake up'}}
```
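Once a call is parsed, it can be routed to application code. The sketch below is a minimal, hypothetical dispatch layer; the handler names and signatures are illustrative and not part of the model or dataset:

```python
# Hypothetical app-side handlers; in a real mobile app these would call
# platform APIs (AlarmManager, SMS intents, etc.).
def set_alarm(datetime: str, title: str = "") -> str:
    return f"Alarm set for {datetime}" + (f" ({title})" if title else "")

def send_message(to: str, body: str) -> str:
    return f"Message to {to}: {body}"

HANDLERS = {"set_alarm": set_alarm, "send_message": send_message}

def dispatch(call: dict) -> str:
    """Route a parsed function call ({"name": ..., "arguments": ...}) to its handler."""
    handler = HANDLERS.get(call["name"])
    if handler is None:
        raise ValueError(f"Unknown function: {call['name']}")
    return handler(**call["arguments"])

result = dispatch({"name": "set_alarm",
                   "arguments": {"datetime": "7am", "title": "Wake up"}})
print(result)  # Alarm set for 7am (Wake up)
```

Feeding the output of `parse_function_call` directly into `dispatch` closes the loop from user text to an executed action.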

## Training Details

- **Hardware:** 8x Tesla V100-SXM2-32GB
- **Training Time:** ~48 minutes
- **Epochs:** 3
- **Batch Size:** 64 effective (4 per device × 2 grad accum × 8 GPUs)
- **Learning Rate:** 1e-5 with linear schedule
- **Gradient Clipping:** max_grad_norm=1.0
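As a rough sketch, the hyperparameters above map onto a `transformers` `TrainingArguments` configuration along these lines. This is an illustrative reconstruction, not the actual training script:

```python
from transformers import TrainingArguments

# Illustrative config matching the listed hyperparameters.
args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=4,   # x 2 grad accum x 8 GPUs = 64 effective
    gradient_accumulation_steps=2,
    num_train_epochs=3,
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    max_grad_norm=1.0,
)
```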

## Limitations

- Optimized for English only
- Best for single-turn function calling (not multi-turn conversations)
- May struggle with highly ambiguous requests
- Calendar vs Reminder distinction can be tricky for edge cases

## License

This model is released under the [Gemma License](https://ai.google.dev/gemma/terms).

## Acknowledgments

- Google for the [Gemma](https://ai.google.dev/gemma) model family and [mobile-actions](https://huggingface.co/datasets/google/mobile-actions) dataset
- [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning tools