sarvkk committed · verified · Commit c8f7ff7 · Parent: 483ec8d

Update README.md (README.md: +213 −148)
- sft
- transformers
- trl
- function-calling
- sports
- event-parser
- gemma
language:
- en
license: apache-2.0
---

# FunctionGemma 270M — Sports Event Parser

A lightweight LoRA fine-tune of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) that converts natural language sports event requests into structured `create_sports_event` function calls with proper ISO 8601 timestamps and timezone handling.

> *"Soccer this Friday 4pm @ Central Park"* → `{"sport": "Soccer", "venue_name": "Central Park", "start_time": "2026-02-13T16:00:00-05:00", ...}`

## Model Details

### Model Description

This adapter teaches a 270M-parameter Gemma model to act as a **function-calling layer** for sports event creation. Given a user's natural language request plus their timezone, the model outputs a JSON function call with the correct sport, venue, date (resolved from relative references like "tomorrow", "this Friday", "next Monday"), time in ISO 8601 with timezone offset, participant count, and event type.

- **Developed by:** [sarvkk](https://huggingface.co/sarvkk)
- **Model type:** Causal Language Model (LoRA adapter)
- **Language(s):** English
- **License:** Apache 2.0
- **Base model:** [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) (270M parameters)
- **Adapter size:** ~0.5% of base model parameters

### Key Features

- **Relative date resolution** — understands "tomorrow", "this Friday", "next Monday", "Saturday", etc.
- **Multi-timezone support** — outputs correct UTC offsets for America/New_York, America/Los_Angeles, America/Chicago, Europe/London, Asia/Tokyo, Australia/Sydney, and more
- **ISO 8601 timestamps** — e.g. `2026-02-07T16:00:00-05:00`
- **Structured output** — consistently produces valid JSON function calls
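
The two date-handling features above come down to simple calendar arithmetic on the caller's side. Here is a hedged stdlib-only sketch of how a host application might compute the "next Friday" date and the UTC offset string it hands to the model; the helper names are illustrative, not part of this repo:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def next_weekday(ref: datetime, weekday: int) -> datetime:
    """Next upcoming date with the given weekday (Mon=0 .. Sun=6),
    at least one day after `ref` — mirrors "this Friday"-style resolution."""
    days_ahead = (weekday - ref.weekday()) % 7
    if days_ahead == 0:
        days_ahead = 7
    return ref + timedelta(days=days_ahead)

def utc_offset(tz_name: str, ref: datetime) -> str:
    """Format a zone's UTC offset at `ref` as e.g. "-05:00" for ISO 8601."""
    offset = ref.replace(tzinfo=ZoneInfo(tz_name)).strftime("%z")
    return f"{offset[:3]}:{offset[3:]}"

ref = datetime(2026, 2, 7, 12, 0)   # a Saturday
friday = next_weekday(ref, 4)       # Friday = weekday 4
print(friday.date(), utc_offset("America/New_York", ref))
# → 2026-02-13 -05:00
```

The model only resolves relative dates correctly when this context is supplied in the prompt, as shown in the usage code below.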

## How to Get Started

> **Important:** Must use `bfloat16` — Gemma's RMSNorm produces NaN in fp16.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

BASE_MODEL = "google/functiongemma-270m-it"
ADAPTER_REPO = "sarvkk/funcgemma-event-parser-v2"

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load base + LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    device_map={"": device},
    dtype=torch.bfloat16,
    attn_implementation="eager",
    low_cpu_mem_usage=True,
)
model = PeftModel.from_pretrained(base_model, ADAPTER_REPO, device_map={"": device})
tokenizer = AutoTokenizer.from_pretrained(ADAPTER_REPO)
model.eval()

# Function schema
FUNCTION_SCHEMA = {
    "name": "create_sports_event",
    "description": "Create a new sports event from natural language description",
    "parameters": {
        "type": "object",
        "properties": {
            "sport": {"type": "string", "description": "Type of sport"},
            "venue_name": {"type": "string", "description": "Name of the venue"},
            "start_time": {"type": "string", "description": "ISO 8601 with timezone"},
            "max_participants": {"type": "integer", "default": 2},
            "event_type": {
                "type": "string",
                "enum": ["Casual", "Light Training", "Looking to Improve", "Competitive Game"],
            },
        },
        "required": ["sport", "venue_name", "start_time"],
    },
}

# Build prompt with date context
now = datetime.now()
today_str = now.strftime("%Y-%m-%d")
today_day = now.strftime("%A")
current_time = now.strftime("%H:%M")
tomorrow_str = (now + timedelta(days=1)).strftime("%Y-%m-%d")
user_timezone = "America/New_York"
tz = ZoneInfo(user_timezone)
offset = now.replace(hour=12, tzinfo=tz).strftime("%z")
tz_offset = f"{offset[:3]}:{offset[3:]}"  # "-0500" -> "-05:00"

user_query = "Soccer this Friday 4pm @ Central Park"

prompt = f"""<start_of_turn>user
Current date and time: {today_str} ({today_day}) at {current_time}
User timezone: {user_timezone} (UTC{tz_offset})

User request: {user_query}

Available functions:
{json.dumps([FUNCTION_SCHEMA], indent=2)}

Important:
- Calculate dates relative to {today_str}
- "tomorrow" = {tomorrow_str}
- "Friday" = the next upcoming Friday from {today_str}
- All times should be in ISO 8601 format with timezone offset
- Example: "{tomorrow_str}T16:00:00{tz_offset}"
<end_of_turn>
<start_of_turn>model
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=300,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Extract the JSON between <function_call> tags (guard against a missing tag)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
start = result.find("<function_call>")
if start == -1:
    raise ValueError("no <function_call> tag found in model output")
start += len("<function_call>")
end = result.find("</function_call>", start)
if end == -1:
    end = len(result)
parsed = json.loads(result[start:end].strip())
print(json.dumps(parsed, indent=2))
```

### Example Output

```json
{
  "name": "create_sports_event",
  "arguments": {
    "sport": "Soccer",
    "venue_name": "Central Park",
    "start_time": "2026-02-13T16:00:00-05:00",
    "max_participants": 22,
    "event_type": "Casual"
  }
}
```
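
Downstream code should not trust a generation blindly. A minimal stdlib-only validator for the parsed call might look like this sketch (the function and constant names are illustrative, not part of this repo):

```python
from datetime import datetime

# Mirrors the "required" list and event_type enum from FUNCTION_SCHEMA above
REQUIRED = {"sport", "venue_name", "start_time"}
EVENT_TYPES = {"Casual", "Light Training", "Looking to Improve", "Competitive Game"}

def validate_call(call: dict) -> list:
    """Return a list of problems with a parsed create_sports_event call (empty = valid)."""
    errors = []
    if call.get("name") != "create_sports_event":
        errors.append("wrong function name")
    args = call.get("arguments", {})
    missing = REQUIRED - args.keys()
    if missing:
        errors.append(f"missing required fields: {sorted(missing)}")
    if "start_time" in args:
        try:
            if datetime.fromisoformat(args["start_time"]).tzinfo is None:
                errors.append("start_time lacks a timezone offset")
        except ValueError:
            errors.append("start_time is not valid ISO 8601")
    if "event_type" in args and args["event_type"] not in EVENT_TYPES:
        errors.append("event_type not in schema enum")
    return errors

example = {
    "name": "create_sports_event",
    "arguments": {
        "sport": "Soccer",
        "venue_name": "Central Park",
        "start_time": "2026-02-13T16:00:00-05:00",
        "max_participants": 22,
        "event_type": "Casual",
    },
}
print(validate_call(example))  # → []
```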

## Training Details

### Training Data

A synthetically generated dataset of ~600 examples covering:

- **25 query templates** per reference date — varied sports (Soccer, Basketball, Tennis, Volleyball, Running, Swimming, Cycling, Golf, Hockey, Badminton, Yoga), venues, times, and phrasing styles
- **6 reference dates** across February 2026 (covering different days of the week)
- **4 user timezones**: America/New_York, America/Los_Angeles, America/Chicago, Europe/London
- **Additional timezone examples**: Asia/Tokyo, Australia/Sydney embedded in the training set
- **Event types**: Casual, Light Training, Looking to Improve, Competitive Game

Each example includes the current date context in the prompt so the model learns to resolve relative dates ("tomorrow", "this Friday", "next Monday") dynamically.

### Training Procedure

Fine-tuned using [TRL](https://github.com/huggingface/trl)'s `SFTTrainer` with LoRA adapters via [PEFT](https://github.com/huggingface/peft).

#### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj` |
| Bias | none |
| Task type | CAUSAL_LM |
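
The table above corresponds to a PEFT `LoraConfig` along these lines. This is a sketch reconstructed from the table, not the actual training script:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```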

#### Training Hyperparameters

| Parameter | Value |
|---|---|
| **Training regime** | bf16 (bfloat16 non-mixed precision) |
| Epochs | 2 |
| Batch size | 1 (per device) |
| Gradient accumulation steps | 8 (effective batch = 8) |
| Learning rate | 2e-4 |
| Warmup steps | 20 |
| Optimizer | AdamW (torch) |
| Max sequence length | 1024 |
| Save strategy | per epoch |
201
+ | Save strategy | per epoch |
202
+
203
+ #### Speeds, Sizes, Times
204
+
205
+ | Metric | Value |
206
+ |---|---|
207
+ | Training steps | 144 |
208
+ | Training time | ~8 minutes |
209
+ | Adapter size | ~1.7 MB |
210
+ | Base model parameters | 270M |
211
+ | Trainable parameters | ~0.5% of base |
212
 

## Evaluation

### Testing Data

6 held-out queries with unseen venue names across 5 timezones, testing relative date resolution and diverse sports.

### Metrics

- **Parse success rate**: whether the model output is valid JSON with the correct function name
- **Field accuracy**: correct sport, venue, date, time, and timezone offset
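
A minimal stdlib sketch of how these two metrics can be computed from a raw generation; the function name and example values are illustrative, since the actual evaluation harness is not published:

```python
import json

def score(prediction_json: str, expected_args: dict):
    """Return (parse_success, field_accuracy) for one model generation."""
    try:
        call = json.loads(prediction_json)
    except json.JSONDecodeError:
        return False, 0.0
    parse_ok = call.get("name") == "create_sports_event"
    if not parse_ok:
        return False, 0.0
    args = call.get("arguments", {})
    matched = sum(args.get(k) == v for k, v in expected_args.items())
    return parse_ok, matched / len(expected_args)

pred = (
    '{"name": "create_sports_event", "arguments": {"sport": "Soccer", '
    '"venue_name": "Washington Square Park", '
    '"start_time": "2026-02-13T16:00:00-05:00"}}'
)
gold = {
    "sport": "Soccer",
    "venue_name": "Washington Square Park",
    "start_time": "2026-02-13T16:00:00-05:00",
}
print(score(pred, gold))  # → (True, 1.0)
```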
 

### Results

**6/6 queries parsed successfully (100% parse rate)**

| Query | Timezone | Sport | Venue | Time | ✓ |
|---|---|---|---|---|---|
| Soccer this Friday 4pm @ Washington Square Park | America/New_York | Soccer | Washington Square Park | 2026-02-13T16:00:00-05:00 | ✓ |
| Basketball tomorrow 6pm at Barclays Center | America/New_York | Basketball | Barclays Center | 2026-02-07T18:00:00-05:00 | ✓ |
| Beach volleyball Saturday 2pm at Santa Monica Beach | America/Los_Angeles | Beach Volleyball | Santa Monica Beach | 2026-02-07T14:00:00-08:00 | ✓ |
| Pickup basketball Friday 6pm at Wrigley Field | America/Chicago | Basketball | Wrigley Field | 2026-02-07T18:00:00-06:00 | ✓ |
| Football Saturday 3pm at Hampstead Heath | Europe/London | Football | Hampstead Heath | 2026-02-13T15:00:00+00:00 | ✓ |
| Tennis next Monday 10am at Central Park Tennis Courts | America/New_York | Tennis | Central Park Tennis Courts | 2026-02-07T10:00:00-05:00 | ✓ |

### Benchmark vs Gemma-2 2B

Compared against a [Gemma-2 2B LoRA adapter](https://huggingface.co/sarvkk/gemma-event-parser-v2) trained on the same task, on a Tesla T4 GPU:

| Metric | FuncGemma 270M | Gemma-2 2B | Winner |
|---|---|---|---|
| Model load time | 5.14s | 8.36s | **270M** (1.6× faster) |
| Avg inference time | 5.584s | 7.182s | **270M** (1.3× faster) |
| Tokens/sec | 20.1 | 15.6 | **270M** |
| Parse success | 6/6 | 6/6 | Tie |

The 270M model matches the 2B model on accuracy while being **1.3× faster at inference** and **1.6× faster to load**, with ~7× fewer parameters.

## Technical Specifications

### Model Architecture and Objective

- **Architecture:** Gemma 270M (decoder-only transformer) with LoRA adapters on attention projection layers
- **Objective:** Causal language modeling (next-token prediction) fine-tuned for structured function-call generation

### Compute Infrastructure

#### Hardware

- **GPU:** NVIDIA Tesla T4 (15.8 GB VRAM)
- **Platform:** Lightning AI Studio

#### Software

- **Transformers:** latest
- **PEFT:** 0.18.1
- **TRL:** latest
- **PyTorch:** 2.x with CUDA
- **Python:** 3.12

## Framework Versions

- PEFT 0.18.1