---
base_model: google/gemma-2-2b-it
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- function-calling
- sports
- event-parsing
- natural-language-processing
license: gemma
language:
- en
---

# Gemma 2B Event Parser - Sports Event Function Calling

A LoRA adapter fine-tuned on Gemma 2 2B (instruction-tuned) that converts natural language descriptions into structured JSON for creating sports events.

## Model Description

This model takes casual text like **"I want to play soccer this week Friday 4 PM @ Central Park"** and converts it into a properly formatted `CreateEventRequest` JSON object for backend API consumption.

**Base Model:** `google/gemma-2-2b-it`  
**Fine-tuning Method:** LoRA (Low-Rank Adaptation)  
**Training Framework:** Transformers + PEFT  
**Primary Use Case:** Natural language to structured API requests for sports event creation
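
The target structure can be pictured as a small Python dataclass. This is a minimal sketch, assuming the `CreateEventRequest` fields mirror the function schema in the Quick Start; the real backend type is not shown in this card, so the class and its `from_arguments` and `to_json` helpers are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


# Hypothetical mirror of the backend's CreateEventRequest. Field names and
# defaults follow the function schema in the Quick Start section.
@dataclass
class CreateEventRequest:
    sport: str
    venue_name: str
    start_time: str              # ISO 8601 string, e.g. "2026-02-07T16:00:00Z"
    max_participants: int = 2    # schema default
    event_type: str = "Casual"   # schema default

    @classmethod
    def from_arguments(cls, arguments: dict) -> "CreateEventRequest":
        """Build a request from the model's `arguments` dict.

        Optional fields fall back to the schema defaults; a missing required
        field or an unexpected key raises TypeError.
        """
        return cls(**arguments)

    def to_json(self) -> str:
        return json.dumps(asdict(self))


request = CreateEventRequest.from_arguments(
    {"sport": "Soccer", "venue_name": "Central Park", "start_time": "2026-02-07T16:00:00Z"}
)
print(request.to_json())
```

Routing the model output through a typed object like this makes a missing required field fail loudly before anything reaches the API.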

## Usage

### Installation
```bash
pip install transformers peft torch accelerate  # accelerate is required for device_map="auto"
```

### Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    device_map="auto",
    dtype=torch.float16
)

# Load fine-tuned adapter
model = PeftModel.from_pretrained(base_model, "YOUR_USERNAME/gemma-event-parser")
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/gemma-event-parser")

# Define function schema
function_schema = {
    "name": "create_sports_event",
    "description": "Create a new sports event from natural language description",
    "parameters": {
        "type": "object",
        "properties": {
            "sport": {"type": "string", "description": "Sport type (e.g., Soccer, Basketball, Tennis)"},
            "venue_name": {"type": "string", "description": "Venue name"},
            "start_time": {"type": "string", "description": "ISO 8601 format (e.g., 2026-02-07T16:00:00Z)"},
            "max_participants": {"type": "integer", "default": 2},
            "event_type": {
                "type": "string",
                "enum": ["Casual", "Light Training", "Looking to Improve", "Competitive Game"],
                "default": "Casual"
            }
        },
        "required": ["sport", "venue_name", "start_time"]
    }
}

# Parse natural language
def parse_event(user_query):
    prompt = f"""<start_of_turn>user
{user_query}

Available functions:
{json.dumps([function_schema], indent=2)}<end_of_turn>
<start_of_turn>model
"""
    
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.1,
        do_sample=True,
        top_p=0.95
    )
    
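    # Decode only the newly generated tokens, not the echoed prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    result = tokenizer.decode(new_tokens, skip_special_tokens=True)
    
    # Extract the JSON between the <function_call> tags
    start = result.find("<function_call>")
    end = result.find("</function_call>")
    if start == -1 or end == -1:
        raise ValueError(f"No function call found in model output: {result!r}")
    function_call = json.loads(result[start + len("<function_call>"):end].strip())
    
    return function_call["arguments"]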

# Example
query = "I want to play soccer this week Friday 4 PM @ Central Park"
event_json = parse_event(query)
print(json.dumps(event_json, indent=2))
```

**Output:**
```json
{
  "sport": "Soccer",
  "venue_name": "Central Park",
  "start_time": "2026-02-07T16:00:00Z",
  "max_participants": 22,
  "event_type": "Casual"
}
```

## Examples

| Input | Output |
|-------|--------|
| "Basketball game tomorrow 6pm at Riverside Courts, competitive" | `{"sport": "Basketball", "venue_name": "Riverside Courts", "start_time": "2026-02-07T18:00:00Z", "max_participants": 10, "event_type": "Competitive Game"}` |
| "Tennis match Wednesday 10 AM Ashburn Park, looking to improve" | `{"sport": "Tennis", "venue_name": "Ashburn Park", "start_time": "2026-02-12T10:00:00Z", "max_participants": 2, "event_type": "Looking to Improve"}` |
| "Casual volleyball Saturday 2pm Beach Courts" | `{"sport": "Volleyball", "venue_name": "Beach Courts", "start_time": "2026-02-08T14:00:00Z", "max_participants": 12, "event_type": "Casual"}` |

## Training Details

### Training Data

Fine-tuned on synthetic examples covering:
- Multiple sports (Soccer, Basketball, Tennis, Volleyball, Badminton, etc.)
- Various time formats (relative dates, specific times)
- All event types (Casual, Light Training, Looking to Improve, Competitive Game)
- Different venue patterns

**Training Size:** ~10-20 high-quality examples (LoRA trains only a small set of adapter weights, so a narrow task like this can be learned from little data)
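
The exact training format is not given in this card. As a rough illustration, one synthetic example could be assembled as below, assuming the training text reuses the prompt layout and `<function_call>` tags from the Quick Start; `build_training_example` is a hypothetical helper and the trimmed schema is for illustration only.

```python
import json


def build_training_example(user_query: str, arguments: dict, function_schema: dict) -> str:
    """Format one (query, expected arguments) pair as a single training string.

    Assumption: the layout mirrors the inference prompt in the Quick Start,
    followed by the expected function call and an end-of-turn marker.
    """
    call = {"name": function_schema["name"], "arguments": arguments}
    return (
        "<start_of_turn>user\n"
        f"{user_query}\n\n"
        "Available functions:\n"
        f"{json.dumps([function_schema], indent=2)}<end_of_turn>\n"
        "<start_of_turn>model\n"
        f"<function_call>{json.dumps(call)}</function_call><end_of_turn>\n"
    )


# Trimmed schema, for illustration only
schema = {"name": "create_sports_event", "parameters": {"type": "object"}}
example = build_training_example(
    "Casual volleyball Saturday 2pm Beach Courts",
    {
        "sport": "Volleyball",
        "venue_name": "Beach Courts",
        "start_time": "2026-02-08T14:00:00Z",
        "max_participants": 12,
        "event_type": "Casual",
    },
    schema,
)
print(example)
```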

### Training Hyperparameters

- **LoRA Rank (r):** 16
- **LoRA Alpha:** 32
- **Target Modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`
- **Learning Rate:** 2e-4
- **Epochs:** 20
- **Batch Size:** 2 (gradient accumulation steps: 4, for an effective batch size of 8)
- **Optimizer:** AdamW
- **Scheduler:** Cosine with warmup
- **Precision:** FP16
- **Training Time:** ~1-2 minutes on free Colab
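
The settings above can be written down as plain keyword arguments. This is a sketch, not the original training script: the keys follow the parameter names of `peft.LoraConfig` and `transformers.TrainingArguments`, while `task_type`, `optim`, and `output_dir` are assumptions that this card does not state. The helper functions are hypothetical and only give a rough step count.

```python
import math

# Keyword arguments for peft.LoraConfig
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",  # assumption: not stated in this card
}

# Keyword arguments for transformers.TrainingArguments
training_kwargs = {
    "output_dir": "gemma-event-parser",  # hypothetical path
    "learning_rate": 2e-4,
    "num_train_epochs": 20,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_torch",         # assumption: the AdamW variant is not stated
    "lr_scheduler_type": "cosine",  # warmup length is not stated, so it is omitted
    "fp16": True,
}


def effective_batch_size(cfg: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return cfg["per_device_train_batch_size"] * cfg["gradient_accumulation_steps"] * num_devices


def approx_optimizer_steps(num_examples: int, cfg: dict) -> int:
    """Rough optimizer step count; the trainer's own rounding may differ slightly."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size(cfg))
    return steps_per_epoch * cfg["num_train_epochs"]


print(effective_batch_size(training_kwargs))        # 8
print(approx_optimizer_steps(20, training_kwargs))  # 60
```

With `peft` and `transformers` installed, these would be passed as `LoraConfig(**lora_kwargs)` and `TrainingArguments(**training_kwargs)`. At roughly 60 optimizer steps for 20 examples, a training time of a minute or two is plausible.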

### Framework Versions

- **Transformers:** 4.x
- **PEFT:** 0.18.1
- **PyTorch:** 2.x
- **Python:** 3.10+

## Limitations

- **Date Parsing:** Handles relative dates ("Friday", "tomorrow") but assumes they fall within the current week
- **Time Zones:** Defaults to UTC (Z suffix)
- **Sports Coverage:** Best performance on common sports; may need examples for niche sports
- **Language:** English only
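
One way to work around the date and time-zone limitations is to resolve the weekday and the UTC offset in application code rather than relying on the model. A minimal sketch using only the standard library; `resolve_weekday` and `local_to_utc_iso` are hypothetical helpers, not part of this adapter, and the fixed UTC-5 offset is an assumption for illustration.

```python
from datetime import date, datetime, timedelta, timezone

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def resolve_weekday(reference: date, weekday_name: str) -> date:
    """Return the first date on or after `reference` that falls on `weekday_name`."""
    target = WEEKDAYS.index(weekday_name.lower())
    days_ahead = (target - reference.weekday()) % 7
    return reference + timedelta(days=days_ahead)


def local_to_utc_iso(day: date, hour: int, minute: int, utc_offset_hours: float) -> str:
    """Interpret a wall-clock time at a fixed UTC offset and format it as UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime(day.year, day.month, day.day, hour, minute, tzinfo=local_tz)
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


today = date(2026, 3, 2)  # a Monday, used as the reference date
friday = resolve_weekday(today, "Friday")
print(friday)                                  # 2026-03-06
print(local_to_utc_iso(friday, 16, 0, -5))     # 2026-03-06T21:00:00Z
```

A fixed offset ignores daylight saving time; an application with real users would more likely take the zone from the user's profile.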

## Intended Use

✅ **Good for:**
- Converting casual user input to structured API requests
- Sports event management applications
- Voice-to-API integrations
- Chatbot backends for sports booking

❌ **Not suitable for:**
- Mission-critical systems without validation
- Non-English languages
- Complex multi-event scheduling
- Historical date parsing
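
For the validation step mentioned above, the checks might look like the following. This is a sketch under stated assumptions: `validate_event` is a hypothetical helper, the allowed values are copied from the function schema in the Quick Start, and rejecting past start times is an illustrative policy rather than a documented backend rule.

```python
from datetime import datetime, timezone

EVENT_TYPES = ("Casual", "Light Training", "Looking to Improve", "Competitive Game")
REQUIRED = ("sport", "venue_name", "start_time")


def validate_event(args: dict, now: datetime) -> list[str]:
    """Return a list of problems; an empty list means the arguments look usable.

    `now` must be timezone-aware so it can be compared with the parsed start time.
    """
    errors = [f"missing required field: {key}" for key in REQUIRED if not args.get(key)]

    if args.get("event_type", "Casual") not in EVENT_TYPES:
        errors.append(f"unknown event_type: {args['event_type']!r}")

    participants = args.get("max_participants", 2)
    if isinstance(participants, bool) or not isinstance(participants, int) or participants < 1:
        errors.append(f"max_participants must be a positive integer: {participants!r}")

    start = args.get("start_time")
    if isinstance(start, str) and start:
        try:
            # fromisoformat() rejects a trailing "Z" before Python 3.11
            parsed = datetime.fromisoformat(start.replace("Z", "+00:00"))
        except ValueError:
            errors.append(f"start_time is not ISO 8601: {start!r}")
        else:
            if parsed.tzinfo is None:
                parsed = parsed.replace(tzinfo=timezone.utc)  # treat naive times as UTC
            if parsed < now:
                errors.append("start_time is in the past")
    elif start:
        errors.append(f"start_time must be a string: {start!r}")
    return errors


now = datetime(2026, 2, 1, tzinfo=timezone.utc)
good = {"sport": "Soccer", "venue_name": "Central Park", "start_time": "2026-02-07T16:00:00Z"}
bad = {"sport": "Soccer", "venue_name": "Central Park",
       "start_time": "next Friday", "event_type": "Friendly"}
print(validate_event(good, now))
print(validate_event(bad, now))
```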

## License

This adapter follows the [Gemma License](https://ai.google.dev/gemma/terms). The base model is subject to Google's Gemma terms of use.

## Citation

If you use this model, please cite:
```bibtex
@misc{gemma-event-parser-2026,
  author = {YOUR_NAME},
  title = {Gemma 2B Event Parser - Sports Event Function Calling},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/YOUR_USERNAME/gemma-event-parser}
}
```

## Acknowledgments

- Base model: Google's Gemma 2 2B-IT (`google/gemma-2-2b-it`)
- Fine-tuning framework: Hugging Face PEFT
- Training compute: Google Colab

---

**Questions?** Open an issue or discussion on this model's page!