Jay Poirtier committed on
Commit
c525043
·
verified ·
1 Parent(s): 0c8040c

Update README.md

  1. README.md +235 -1
README.md CHANGED
@@ -4,5 +4,239 @@ tags:
4
  - function-calling
5
  - mobile-actions
6
  - gemma
7
  ---
8
- A fine-tuned model based on `google/functiongemma-270m-it`.
4
  - function-calling
5
  - mobile-actions
6
  - gemma
7
+ library_name: transformers
8
+ datasets:
9
+ - google/mobile-actions
10
+ language:
11
+ - en
12
+ license: gemma
13
  ---
14
+
15
+ # FunctionGemma 270M for Mobile Actions
16
+
17
+ This model is a fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) specialized for mobile assistant actions. It has been trained on the [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions) dataset to perform structured function calling for common mobile device tasks.
18
+
19
+ ## Model Description
20
+
21
+ **Base Model**: `google/functiongemma-270m-it` - A 270M parameter instruction-tuned model from Google's FunctionGemma family, designed for function calling tasks.
22
+
23
+ **Specialization**: Mobile assistant actions including:
24
+ - Calendar event management
25
+ - Email composition and sending
26
+ - Contact creation
27
+ - Flashlight control
28
+ - Wi-Fi settings navigation
29
+ - Map location display
30
+
31
+ **Training Objective**: The model learns to emit structured function calls in the format `call:<function_name>{arg1:value1,arg2:value2,...}` instead of natural language responses.
32
+
33
+ ## Supported Functions
34
+
35
+ The model is optimized to call these mobile action functions:
36
+
37
+ 1. **`turn_on_flashlight()`** - Turns the device flashlight on
38
+ 2. **`turn_off_flashlight()`** - Turns the device flashlight off
39
+ 3. **`create_contact(first_name, last_name, phone_number?, email?)`** - Creates a new contact
40
+ 4. **`send_email(to, subject, body?)`** - Sends an email to a recipient
41
+ 5. **`show_map(query)`** - Displays a location on the map by name, business, or address
42
+ 6. **`open_wifi_settings()`** - Opens the Wi-Fi settings screen
43
+ 7. **`create_calendar_event(title, datetime)`** - Creates a calendar event (datetime in ISO format: `YYYY-MM-DDTHH:MM:SS`)
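The signatures above can be turned into a small check that runs before a call is dispatched to the device. The following is a minimal sketch, not part of the model or its training code: the `SUPPORTED_FUNCTIONS` table and `validate_call` helper are hypothetical names, and the required/optional split is transcribed from the list above (a trailing `?` is read as "optional").

```python
from datetime import datetime

# Required and optional argument names, transcribed from the list above.
# (Hypothetical table for illustration; it is not shipped with the model.)
SUPPORTED_FUNCTIONS = {
    "turn_on_flashlight": ((), ()),
    "turn_off_flashlight": ((), ()),
    "create_contact": (("first_name", "last_name"), ("phone_number", "email")),
    "send_email": (("to", "subject"), ("body",)),
    "show_map": (("query",), ()),
    "open_wifi_settings": ((), ()),
    "create_calendar_event": (("title", "datetime"), ()),
}


def validate_call(name, arguments):
    """Return a list of problems; an empty list means the call looks valid."""
    if name not in SUPPORTED_FUNCTIONS:
        return [f"unknown function: {name}"]
    required, optional = SUPPORTED_FUNCTIONS[name]
    problems = [f"missing argument: {arg}" for arg in required if arg not in arguments]
    allowed = set(required) | set(optional)
    problems += [f"unexpected argument: {arg}" for arg in arguments if arg not in allowed]
    # The calendar datetime is documented as YYYY-MM-DDTHH:MM:SS.
    if name == "create_calendar_event" and "datetime" in arguments:
        try:
            datetime.strptime(arguments["datetime"], "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            problems.append(f"datetime is not YYYY-MM-DDTHH:MM:SS: {arguments['datetime']}")
    return problems
```

Rejecting unknown names and malformed datetimes at this stage keeps a wrong generation from reaching a real device action.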
44
+
45
+ ## Training Details
46
+
47
+ ### Training Data
48
+
49
+ - **Dataset**: [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions)
50
+ - **Format**: JSONL with prompt-completion pairs
51
+ - **Splits**:
52
+   - Training set: examples with `"metadata": "train"`
52
+   - Evaluation set: examples with `"metadata": "eval"`
54
+ - **Preprocessing**: Converted to TRL prompt-completion format with `completion_only_loss=True`
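As a rough illustration of the split described above, the sketch below partitions JSONL records by their `"metadata"` field. It assumes each line is a JSON object carrying that field at the top level; `split_by_metadata` is a hypothetical helper, and the conversion to TRL prompt-completion pairs is not shown.

```python
import json


def split_by_metadata(jsonl_lines):
    """Partition JSONL records into (train, eval) lists by their "metadata" field."""
    train, evaluation = [], []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get("metadata") == "train":
            train.append(record)
        elif record.get("metadata") == "eval":
            evaluation.append(record)
    return train, evaluation
```

Because the function accepts any iterable of strings, it can be fed an open file handle directly.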
55
+
56
+ ### Training Procedure
57
+
58
+ Fine-tuned using Hugging Face [TRL (Transformer Reinforcement Learning)](https://huggingface.co/docs/trl) with the `SFTTrainer`.
59
+
60
+ **Training Configuration**:
61
+ - **Epochs**: 2
62
+ - **Batch size**: 4 per device
63
+ - **Gradient accumulation steps**: 8
64
+ - **Learning rate**: 1e-5
65
+ - **Scheduler**: Cosine
66
+ - **Max sequence length**: 997 tokens (the longest example is 897 tokens, which leaves a 100-token margin)
67
+ - **Optimizer**: AdamW (fused)
68
+ - **Precision**: bfloat16
69
+ - **Gradient checkpointing**: Enabled
70
+ - **Completion only loss**: True (trains only on model outputs, not prompts)
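For orientation, the configuration above could be written as keyword arguments for TRL's `SFTConfig`. This is a sketch under assumptions: the argument names follow recent TRL and Transformers releases (`max_length`, `completion_only_loss`, `optim="adamw_torch_fused"`) and may differ in other versions, and the dictionary is kept free of imports so it runs on its own. The helper `effective_batch_size` is hypothetical and only makes the batch arithmetic explicit.

```python
# Keyword arguments mirroring the configuration listed above.
# Assumption: these would be passed as trl.SFTConfig(output_dir=..., **sft_kwargs).
sft_kwargs = {
    "num_train_epochs": 2,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 8,
    "learning_rate": 1e-5,
    "lr_scheduler_type": "cosine",
    "max_length": 997,
    "optim": "adamw_torch_fused",
    "bf16": True,
    "gradient_checkpointing": True,
    "completion_only_loss": True,
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples contributing to each optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(sft_kwargs))  # 4 * 8 * 1 = 32
```

With a per-device batch of 4 and 8 accumulation steps on one GPU, each optimizer update sees 32 examples.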
71
+
72
+ **Training Infrastructure**:
73
+ - **Hardware**: Google Colab A100 GPU
74
+ - **Training time**: ~20 minutes for 2 epochs
75
+ - **Library versions**: transformers==4.57.1, trl==0.25.1, datasets==4.4.1
76
+
77
+ ### Training Results
78
+
79
+ Final metrics after 2 epochs:
80
+
81
+ | Step | Training Loss | Validation Loss | Mean Token Accuracy |
82
+ |------|---------------|-----------------|---------------------|
83
+ | 500 | 0.008800 | 0.013452 | 0.996691 |
84
+
85
+ The model achieved 99.67% token-level accuracy on the validation set, showing significant improvement over the base model's mobile action capabilities.
86
+
87
+ ## Intended Use
88
+
89
+ This model is designed for:
90
+ - **Mobile AI assistants** that need to execute device actions based on user requests
91
+ - **Voice-controlled mobile applications**
92
+ - **Conversational agents** that interact with mobile device features
93
+ - **On-device AI** applications (can be converted to `.litertlm` format for deployment)
94
+
95
+ ## How to Use
96
+
97
+ ### Basic Inference
98
+
99
+ ```python
100
+ from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
101
+ import json
102
+
103
+ # Load model and tokenizer
104
+ model_id = "Jaypoirtier/functiongemma-270m-it-mobile-actions_jprtr"
105
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
106
+ model = AutoModelForCausalLM.from_pretrained(
107
+     model_id,
107
+     device_map="auto",
108
+     attn_implementation="eager",
109
+     torch_dtype="auto",
111
+ )
112
+
113
+ # Create pipeline
114
+ pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
115
+
116
+ # Define the tools (function schemas)
117
+ tools = [
118
+     {
119
+         "function": {
120
+             "name": "create_calendar_event",
121
+             "description": "Creates a new calendar event.",
122
+             "parameters": {
123
+                 "type": "OBJECT",
124
+                 "properties": {
125
+                     "title": {"type": "STRING", "description": "The title of the event."},
126
+                     "datetime": {"type": "STRING", "description": "The date and time in YYYY-MM-DDTHH:MM:SS format."},
127
+                 },
128
+                 "required": ["title", "datetime"],
129
+             },
130
+         }
131
+     },
132
+     {
133
+         "function": {
134
+             "name": "send_email",
135
+             "description": "Sends an email.",
136
+             "parameters": {
137
+                 "type": "OBJECT",
138
+                 "properties": {
139
+                     "to": {"type": "STRING", "description": "The recipient email address."},
140
+                     "subject": {"type": "STRING", "description": "The email subject."},
141
+                     "body": {"type": "STRING", "description": "The email body."},
142
+                 },
143
+                 "required": ["to", "subject"],
144
+             },
145
+         }
146
+     },
147
+     # ... add other function definitions
148
+ ]
149
+
150
+ # Create messages
151
+ messages = [
152
+     {
153
+         "role": "developer",
154
+         "content": (
155
+             "Current date and time given in YYYY-MM-DDTHH:MM:SS format: 2025-07-10T19:06:29\n"
156
+             "Day of week is Thursday\n"
157
+             "You are a model that can do function calling with the following functions\n"
158
+         ),
159
+     },
160
+     {
161
+         "role": "user",
162
+         "content": 'Schedule a "team meeting" tomorrow at 4pm.',
163
+     },
164
+ ]
165
+
166
+ # Apply chat template
167
+ prompt = tokenizer.apply_chat_template(
168
+     messages,
169
+     tools=tools,
170
+     tokenize=False,
171
+     add_generation_prompt=True,
172
+ )
173
+
174
+ # Generate
175
+ output = pipe(prompt, max_new_tokens=200)[0]["generated_text"][len(prompt):].strip()
176
+ print("Model output:", output)
177
+ # Example output: call:create_calendar_event{datetime:2025-07-11T16:00:00,title:team meeting}
178
+ ```
179
+
180
+ ### Parsing Function Calls
181
+
182
+ The model outputs function calls in a simple format:
183
+ ```
184
+ call:<function_name>{arg1:value1,arg2:value2,...}
185
+ ```
186
+
187
+ For multiple function calls, they appear sequentially:
188
+ ```
189
+ call:create_calendar_event{datetime:2025-07-15T10:30:00,title:Dental Checkup}
190
+ call:send_email{to:user@example.com,subject:Appointment,body:See you there!}
191
+ ```
192
+
193
+ You can parse these by:
194
+ 1. Splitting on `call:` to identify individual function calls
195
+ 2. Extracting the function name (text before `{`)
196
+ 3. Parsing the arguments block (content within `{}`)
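The three steps above can be sketched with two regular expressions. This is a minimal illustration for the simplified format shown in this README, not an official parser: `parse_calls` is a hypothetical helper, all values are returned as strings, and it assumes values contain neither `}` nor a comma immediately followed by `name:`. Keys are split on the first colon only, since datetime values contain colons themselves.

```python
import re

# Step 1 and 2: find each "call:<name>{...}" and capture the name and the argument block.
CALL_RE = re.compile(r"call:(\w+)\{(.*?)\}", re.DOTALL)
# Step 3: split arguments only at commas that are followed by "<key>:".
ARG_SPLIT_RE = re.compile(r",(?=\w+:)")


def parse_calls(text):
    """Parse 'call:<name>{k:v,...}' sequences into a list of dicts."""
    calls = []
    for name, arg_block in CALL_RE.findall(text):
        arguments = {}
        if arg_block.strip():
            for pair in ARG_SPLIT_RE.split(arg_block):
                key, _, value = pair.partition(":")  # first colon separates key from value
                arguments[key.strip()] = value.strip()
        calls.append({"name": name, "arguments": arguments})
    return calls


output = (
    "call:create_calendar_event{datetime:2025-07-15T10:30:00,title:Dental Checkup}\n"
    "call:send_email{to:user@example.com,subject:Appointment,body:See you there!}"
)
for call in parse_calls(output):
    print(call["name"], call["arguments"])
```

Splitting on `,` only when a key follows keeps a comma inside a free-text value (such as an email body) from being treated as an argument boundary in most cases, though a body containing text like `,note:` would still be split incorrectly.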
197
+
198
+ ## Evaluation
199
+
200
+ The model was evaluated on the held-out evaluation split of the mobile-actions dataset (examples with `"metadata": "eval"`). The metric is an exact string match between the model's function call output and the ground-truth label.
201
+
202
+ **Key Observations**:
203
+ - The base FunctionGemma 270M model often fails to call appropriate functions for mobile actions
204
+ - After fine-tuning, the model reliably generates correct function calls with proper argument formatting
205
+ - Token-level accuracy on the validation set: **99.67%**
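The exact-string-match comparison described in this section can be sketched as follows. This is an illustrative reimplementation, not the notebook's evaluation code: `exact_match_accuracy` is a hypothetical helper, and stripping surrounding whitespace before comparing is an assumption.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal their reference string exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return matches / len(references)


predictions = [
    "call:turn_on_flashlight{}",
    "call:show_map{query:Eiffel Tower}",
]
references = [
    "call:turn_on_flashlight{}",
    "call:show_map{query:the Eiffel Tower}",
]
print(exact_match_accuracy(predictions, references))  # 0.5
```

Exact match is stricter than token-level accuracy: a single differing token makes the whole call count as wrong.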
206
+
207
+ ## Limitations
208
+
209
+ - The model is specialized for the 7 mobile action functions listed above and may not generalize well to other function calling tasks
210
+ - Date/time parsing relies on context provided in the developer message (current date/time must be specified)
211
+ - The model outputs may occasionally include variations in argument formatting that are semantically correct but don't exactly match the expected format
212
+ - This is a 270M parameter model, so while efficient for mobile deployment, it may have lower accuracy than larger models
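Because relative dates such as "tomorrow" can only be resolved from the developer message, that message is best generated at request time. The sketch below builds it in the same wording as the inference example in this README; `build_developer_message` is a hypothetical helper, and the weekday names are hard-coded so the result does not depend on the system locale.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def build_developer_message(now=None):
    """Build the developer message with the current date, time, and weekday."""
    now = now or datetime.now()
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%S")
    return {
        "role": "developer",
        "content": (
            f"Current date and time given in YYYY-MM-DDTHH:MM:SS format: {timestamp}\n"
            f"Day of week is {WEEKDAYS[now.weekday()]}\n"
            "You are a model that can do function calling with the following functions\n"
        ),
    }


message = build_developer_message(datetime(2025, 7, 10, 19, 6, 29))
print(message["content"])
```

Passing an explicit `datetime` keeps the helper deterministic in tests, while calling it with no argument uses the device clock.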
213
+
214
+ ## On-Device Deployment
215
+
216
+ The model can be converted to `.litertlm` format for on-device deployment using `ai-edge-torch`. See the [training notebook](https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/FunctionGemma/%5BFunctionGemma%5DFinetune_FunctionGemma_270M_for_Mobile_Actions_with_Hugging_Face.ipynb) for conversion instructions.
217
+
218
+ The converted model can be deployed on:
219
+ - Android devices via [Google AI Edge](https://ai.google.dev/edge)
220
+ - [AI Edge Gallery app](https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery)
221
+
222
+ ## Training Notebook
223
+
224
+ For full training details, hyperparameter tuning, and evaluation, see the original Colab notebook:
225
+ [Finetune FunctionGemma 270M for Mobile Actions](https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/FunctionGemma/%5BFunctionGemma%5DFinetune_FunctionGemma_270M_for_Mobile_Actions_with_Hugging_Face.ipynb)
226
+
227
+ ## Citation
228
+
229
+ If you use this model, please cite FunctionGemma and the Google Mobile Actions dataset:
230
+
231
+ ```bibtex
232
+ @misc{functiongemma2024,
233
+ title={FunctionGemma: Function Calling for Gemma Models},
234
+ author={Google},
235
+ year={2024},
236
+ url={https://huggingface.co/google/functiongemma-270m-it}
237
+ }
238
+ ```
239
+
240
+ ## License
241
+
242
+ This model is released under the Gemma license. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms) for details.