---
license: gemma
library_name: transformers
tags:
- function-calling
- tool-use
- mobile
- gemma
- unsloth
- fine-tuned
base_model: google/gemma-3-1b-it
datasets:
- google/mobile-actions
pipeline_tag: text-generation
language:
- en
---

# FunctionGemma Mobile Actions v5

A fine-tuned version of [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it) optimized for mobile-device function calling. The model maps natural-language commands to structured function calls for common mobile actions.

## Model Description

- **Base Model:** [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it)
- **Fine-tuning Method:** LoRA (r=128, alpha=128)
- **Training Data:** [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions) plus synthetic augmentation
- **Optimized For:** Mobile-assistant function calling

## Supported Functions

| Function | Description | Example Input |
|----------|-------------|---------------|
| `set_alarm` | Set alarms | "Wake me up at 7am" |
| `create_reminder` | Create reminders | "Remind me to buy milk" |
| `set_timer` | Set countdown timers | "Timer for 10 minutes" |
| `make_call` | Make phone calls | "Call Mom" |
| `send_message` | Send text messages | "Text John I'm running late" |
| `create_calendar_event` | Schedule events | "Schedule meeting at 3pm" |
| `play_music` | Play music | "Play some jazz" |
| `get_weather` | Get weather info | "What's the weather like?" |
| `open_app` | Open applications | "Open the camera" |
| `navigate` | Get directions | "Navigate to the airport" |
| `set_volume` | Adjust volume | "Turn the volume up" |
| `calculator` | Math calculations | "What's 15 times 23?" |

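Each function above is exposed to the model as a JSON-schema-style tool declaration (the exact shape appears in the usage examples below). A hypothetical entry for `set_timer`; the `duration` parameter name is illustrative, not taken from the dataset:

```python
set_timer_tool = {
    "function": {
        "name": "set_timer",
        "description": "Starts a countdown timer.",
        "parameters": {
            "type": "OBJECT",
            "properties": {
                # Parameter name is a guess, for illustration only
                "duration": {"type": "STRING", "description": "Timer length, e.g. '10 minutes'."},
            },
            "required": ["duration"],
        },
    }
}
```

Declarations for the other functions in the table follow the same pattern.
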
## Usage with vLLM

### Installation

```bash
pip install vllm
```

### Basic Inference

```python
from vllm import LLM, SamplingParams
from datetime import datetime

# Load the model
llm = LLM(
    model="essobi/functiongemma-mobile-actions-v5-16bit",
    trust_remote_code=True,
    max_model_len=4096,
)

# Define available tools
tools = [
    {
        "function": {
            "name": "set_alarm",
            "description": "Sets an alarm for a specific time.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "datetime": {"type": "STRING", "description": "The time for the alarm."},
                    "title": {"type": "STRING", "description": "Optional label for the alarm."},
                },
                "required": ["datetime"],
            },
        }
    },
    {
        "function": {
            "name": "create_reminder",
            "description": "Creates a reminder with text and optional time.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "body": {"type": "STRING", "description": "The reminder text."},
                    "datetime": {"type": "STRING", "description": "When to remind."},
                },
                "required": ["body"],
            },
        }
    },
    {
        "function": {
            "name": "send_message",
            "description": "Sends a text message to a contact.",
            "parameters": {
                "type": "OBJECT",
                "properties": {
                    "to": {"type": "STRING", "description": "Contact name or phone number."},
                    "body": {"type": "STRING", "description": "Message content."},
                },
                "required": ["to", "body"],
            },
        }
    },
    # Add more tools as needed...
]

# Build a prompt in the training format
def build_prompt(user_input: str, tools: list) -> str:
    now = datetime.now()
    dt_str = now.strftime("%Y-%m-%dT%H:%M:%S")
    day = now.strftime("%A")

    # Serialize each tool into a function declaration
    func_decls = ""
    for tool in tools:
        func = tool["function"]
        props = func["parameters"].get("properties", {})
        required = func["parameters"].get("required", [])

        props_str = ""
        for pname, pinfo in props.items():
            desc = pinfo.get("description", "")
            ptype = pinfo.get("type", "STRING")
            props_str += f"{pname}:{{description:<escape>{desc}<escape>,type:<escape>{ptype}<escape>}},"
        props_str = props_str.rstrip(",")

        req_str = ",".join(f"<escape>{r}<escape>" for r in required)

        func_decls += f"<start_function_declaration>declaration:{func['name']}{{description:<escape>{func['description']}<escape>,parameters:{{properties:{{{props_str}}},required:[{req_str}],type:<escape>OBJECT<escape>}}}}<end_function_declaration>"

    return f"""<start_of_turn>developer
Current date and time given in YYYY-MM-DDTHH:MM:SS format: {dt_str}
Day of week is {day}
You are a model that can do function calling with the following functions{func_decls}<end_of_turn>
<start_of_turn>user
{user_input}<end_of_turn>
<start_of_turn>model
"""

# Generate
prompt = build_prompt("Set an alarm for 7am tomorrow", tools)
sampling_params = SamplingParams(temperature=0.1, max_tokens=150)
outputs = llm.generate([prompt], sampling_params)

print(outputs[0].outputs[0].text)
# Output: <start_function_call>call:set_alarm{datetime:<escape>7am tomorrow<escape>}<end_function_call>
```

### vLLM OpenAI-Compatible Server

```bash
# Start the server
python -m vllm.entrypoints.openai.api_server \
    --model essobi/functiongemma-mobile-actions-v5-16bit \
    --port 8000
```

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

response = client.chat.completions.create(
    model="essobi/functiongemma-mobile-actions-v5-16bit",
    messages=[
        {"role": "user", "content": "Remind me to call the dentist tomorrow"}
    ],
    max_tokens=150,
    temperature=0.1,
)

print(response.choices[0].message.content)
```

## Usage with Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "essobi/functiongemma-mobile-actions-v5-16bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Reuse the build_prompt function and tools from the vLLM example above
prompt = build_prompt("What's the weather like?", tools)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, temperature=0.1, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=False)

print(response)
```

## Output Format

The model emits function calls in the following format:

```
<start_function_call>call:function_name{param1:<escape>value1<escape>,param2:<escape>value2<escape>}<end_function_call>
```

### Parsing Function Calls

```python
import re

def parse_function_call(text: str) -> dict | None:
    """Parse a function call from model output."""
    match = re.search(
        r'<start_function_call>call:(\w+)\{([^}]*)\}<end_function_call>',
        text,
    )
    if not match:
        return None

    func_name = match.group(1)
    args_str = match.group(2)

    # Parse arguments
    args = {}
    for param_match in re.finditer(r'(\w+):<escape>([^<]*)<escape>', args_str):
        args[param_match.group(1)] = param_match.group(2)

    return {"name": func_name, "arguments": args}

# Example
output = "<start_function_call>call:set_alarm{datetime:<escape>7am<escape>,title:<escape>Wake up<escape>}<end_function_call>"
parsed = parse_function_call(output)
print(parsed)
# {'name': 'set_alarm', 'arguments': {'datetime': '7am', 'title': 'Wake up'}}
```

## Training Details

- **Hardware:** 8x Tesla V100-SXM2-32GB
- **Training Time:** ~48 minutes
- **Epochs:** 3
- **Batch Size:** 64 effective (4 per device × 2 gradient-accumulation steps × 8 GPUs)
- **Learning Rate:** 1e-5 with a linear schedule
- **Gradient Clipping:** max_grad_norm=1.0

## Limitations

- Optimized for English only
- Best suited to single-turn function calling, not multi-turn conversations
- May struggle with highly ambiguous requests
- The calendar-event vs. reminder distinction can be unreliable in edge cases

## License

This model is released under the [Gemma License](https://ai.google.dev/gemma/terms).

## Acknowledgments

- Google for the [Gemma](https://ai.google.dev/gemma) model family and the [mobile-actions](https://huggingface.co/datasets/google/mobile-actions) dataset
- [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning tooling
|