---
base_model: google/gemma-2-2b-it
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- function-calling
- sports
- event-parsing
- natural-language-processing
license: gemma
language:
- en
---
# Gemma 2B Event Parser - Sports Event Function Calling
A LoRA adapter fine-tuned on Gemma 2 2B (instruction-tuned) that converts natural language descriptions into structured JSON for creating sports events.
## Model Description
This model takes casual text like **"I want to play soccer this week Friday 4 PM @ Central Park"** and converts it into a properly formatted `CreateEventRequest` JSON object for backend API consumption.
- **Base Model:** `google/gemma-2-2b-it`
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Framework:** Transformers + PEFT
- **Primary Use Case:** Natural language to structured API requests for sports event creation
## Usage
### Installation
```bash
pip install transformers peft torch
```
### Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    device_map="auto",
    # `torch_dtype` works across Transformers 4.x; newer releases also take `dtype`
    torch_dtype=torch.float16,
)

# Load fine-tuned adapter
model = PeftModel.from_pretrained(base_model, "YOUR_USERNAME/gemma-event-parser")
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/gemma-event-parser")

# Define function schema
function_schema = {
    "name": "create_sports_event",
    "description": "Create a new sports event from natural language description",
    "parameters": {
        "type": "object",
        "properties": {
            "sport": {"type": "string", "description": "Sport type (e.g., Soccer, Basketball, Tennis)"},
            "venue_name": {"type": "string", "description": "Venue name"},
            "start_time": {"type": "string", "description": "ISO 8601 format (e.g., 2026-02-07T16:00:00Z)"},
            "max_participants": {"type": "integer", "default": 2},
            "event_type": {
                "type": "string",
                "enum": ["Casual", "Light Training", "Looking to Improve", "Competitive Game"],
                "default": "Casual"
            }
        },
        "required": ["sport", "venue_name", "start_time"]
    }
}

# Parse natural language
def parse_event(user_query):
    # The prompt body stays at column 0 so no indentation leaks into the string
    prompt = f"""<start_of_turn>user
{user_query}
Available functions:
{json.dumps([function_schema], indent=2)}<end_of_turn>
<start_of_turn>model
"""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.1,
        do_sample=True,
        top_p=0.95,
    )
    # Decode only the newly generated tokens, not the echoed prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    result = tokenizer.decode(new_tokens, skip_special_tokens=True)

    # Extract the JSON between the <function_call> tags
    start_tag, end_tag = "<function_call>", "</function_call>"
    start = result.find(start_tag)
    end = result.find(end_tag, start + len(start_tag))
    if start == -1 or end == -1:
        raise ValueError(f"No function call found in model output: {result!r}")
    function_call = json.loads(result[start + len(start_tag):end].strip())
    return function_call["arguments"]

# Example
query = "I want to play soccer this week Friday 4 PM @ Central Park"
event_json = parse_event(query)
print(json.dumps(event_json, indent=2))
```
**Output:**
```json
{
"sport": "Soccer",
"venue_name": "Central Park",
"start_time": "2026-02-07T16:00:00Z",
"max_participants": 22,
"event_type": "Casual"
}
```
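The schema declares defaults for `max_participants` and `event_type`, but nothing in the Quick Start enforces them if the model leaves an optional field out. A minimal sketch of filling them in on the client side follows; `SCHEMA_DEFAULTS` and `apply_defaults` are hypothetical names introduced here, not part of the adapter or of any library.

```python
import json

# Defaults copied by hand from the function schema in Quick Start.
SCHEMA_DEFAULTS = {"max_participants": 2, "event_type": "Casual"}

def apply_defaults(arguments: dict) -> dict:
    """Return a copy of `arguments` with missing optional fields filled in.

    Values the model did produce always win over the defaults.
    """
    return {**SCHEMA_DEFAULTS, **arguments}

# A parsed result that omits both optional fields.
partial = {
    "sport": "Tennis",
    "venue_name": "Ashburn Park",
    "start_time": "2026-02-12T10:00:00Z",
}
print(json.dumps(apply_defaults(partial), indent=2))
```

Merging with the defaults first keeps the helper a one-liner and leaves the input dictionary untouched.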
## Examples
| Input | Output |
|-------|--------|
| "Basketball game tomorrow 6pm at Riverside Courts, competitive" | `{"sport": "Basketball", "venue_name": "Riverside Courts", "start_time": "2026-02-07T18:00:00Z", "max_participants": 10, "event_type": "Competitive Game"}` |
| "Tennis match Wednesday 10 AM Ashburn Park, looking to improve" | `{"sport": "Tennis", "venue_name": "Ashburn Park", "start_time": "2026-02-12T10:00:00Z", "max_participants": 2, "event_type": "Looking to Improve"}` |
| "Casual volleyball Saturday 2pm Beach Courts" | `{"sport": "Volleyball", "venue_name": "Beach Courts", "start_time": "2026-02-08T14:00:00Z", "max_participants": 12, "event_type": "Casual"}` |
## Training Details
### Training Data
Fine-tuned on synthetic examples covering:
- Multiple sports (Soccer, Basketball, Tennis, Volleyball, Badminton, etc.)
- Various time formats (relative dates, specific times)
- All event types (Casual, Light Training, Looking to Improve, Competitive Game)
- Different venue patterns
**Training Size:** ~10-20 high-quality examples (LoRA trains only small adapter matrices, so it can adapt from far less data than full fine-tuning typically needs)
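The exact training files are not published here, so the following is only a sketch of how one synthetic example could be laid out. It reuses the prompt layout and the `<function_call>` tags from the Quick Start. The `name` key in the call and the closing `<end_of_turn>` are assumptions, and `format_training_example` is a hypothetical helper.

```python
import json

def format_training_example(user_query: str, schema: dict, arguments: dict) -> str:
    """Build one training string: user turn (query + schema), then model turn."""
    # Assumed call layout: the Quick Start only confirms the "arguments" key.
    call = {"name": schema["name"], "arguments": arguments}
    return (
        "<start_of_turn>user\n"
        f"{user_query}\n"
        "Available functions:\n"
        f"{json.dumps([schema], indent=2)}<end_of_turn>\n"
        "<start_of_turn>model\n"
        f"<function_call>{json.dumps(call)}</function_call><end_of_turn>\n"
    )

example = format_training_example(
    "Casual volleyball Saturday 2pm Beach Courts",
    {"name": "create_sports_event"},  # trimmed schema, for illustration only
    {"sport": "Volleyball", "venue_name": "Beach Courts"},
)
print(example)
```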
### Training Hyperparameters
- **LoRA Rank (r):** 16
- **LoRA Alpha:** 32
- **Target Modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`
- **Learning Rate:** 2e-4
- **Epochs:** 20
- **Batch Size:** 2, with 4 gradient accumulation steps (effective batch size 8)
- **Optimizer:** AdamW
- **Scheduler:** Cosine with warmup
- **Precision:** FP16
- **Training Time:** ~1-2 minutes on the free tier of Google Colab
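The LoRA settings above map onto PEFT's `LoraConfig` roughly as sketched below. This is a reconstruction from the listed hyperparameters, not the original training script: dropout, bias handling, and warmup length are not stated in this card, so they are left at library defaults. `LORA_KWARGS` and `build_lora_config` are hypothetical names.

```python
# Hyperparameters taken from the list above.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",  # assumption: causal LM fine-tuning
}

PER_DEVICE_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
# Number of examples that contribute to each optimizer step.
EFFECTIVE_BATCH_SIZE = PER_DEVICE_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS

# Standard LoRA scales the low-rank update by lora_alpha / r.
LORA_SCALING = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]

def build_lora_config():
    """Create the PEFT config; imported lazily so the constants work without peft."""
    from peft import LoraConfig
    return LoraConfig(**LORA_KWARGS)

print(EFFECTIVE_BATCH_SIZE, LORA_SCALING)
```

With only 10-20 examples, an effective batch size of 8 means each epoch performs just a few optimizer steps, which helps explain the relatively high epoch count.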
### Framework Versions
- **Transformers:** 4.x
- **PEFT:** 0.18.1
- **PyTorch:** 2.x
- **Python:** 3.10+
## Limitations
- **Date Parsing:** Handles relative dates ("Friday", "tomorrow") but assumes a current-week context. The Quick Start prompt carries no reference date, so check the resolved `start_time` before use
- **Time Zones:** Defaults to UTC (Z suffix)
- **Sports Coverage:** Best performance on common sports; may need examples for niche sports
- **Language:** English only
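Given the date and time zone limitations above, it may be worth sanity-checking `start_time` before sending it to a backend. The sketch below parses the UTC timestamp and checks that it falls on the weekday the user named. `parse_start_time` and `weekday_matches` are hypothetical helpers, and the timestamp in the usage line is an illustrative value rather than model output.

```python
from datetime import datetime, timezone

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

def parse_start_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-02-06T16:00:00Z into UTC."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit offset to stay compatible with Python 3.10.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError("start_time has no time zone information")
    return parsed.astimezone(timezone.utc)

def weekday_matches(start_time: str, expected_weekday: str) -> bool:
    """Check the UTC weekday of `start_time` against a weekday name."""
    # Note: the weekday is evaluated in UTC and can differ from the
    # user's local weekday for events close to midnight.
    actual = WEEKDAYS[parse_start_time(start_time).weekday()]
    return actual == expected_weekday.strip().lower()

print(weekday_matches("2026-02-06T16:00:00Z", "Friday"))
```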
## Intended Use
**Good for:**
- Converting casual user input to structured API requests
- Sports event management applications
- Voice-to-API integrations
- Chatbot backends for sports booking
**Not suitable for:**
- Mission-critical systems without validation
- Non-English languages
- Complex multi-event scheduling
- Historical date parsing
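Because the model's output should be validated before reaching a backend, a lightweight check against the function schema can catch malformed results early. The following is a minimal sketch using only the standard library. `validate_event` is a hypothetical helper, and the rule that `max_participants` must be at least 1 is an assumption not stated in the schema.

```python
# Constraints copied by hand from the function schema in Quick Start.
REQUIRED_FIELDS = ("sport", "venue_name", "start_time")
ALLOWED_EVENT_TYPES = ("Casual", "Light Training", "Looking to Improve", "Competitive Game")

def validate_event(arguments: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = arguments.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{field}: missing or not a non-empty string")
    if "max_participants" in arguments:
        count = arguments["max_participants"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(count, bool) or not isinstance(count, int) or count < 1:
            errors.append("max_participants: must be an integer of at least 1")
    if "event_type" in arguments and arguments["event_type"] not in ALLOWED_EVENT_TYPES:
        errors.append(f"event_type: {arguments['event_type']!r} is not an allowed value")
    return errors

event = {
    "sport": "Soccer",
    "venue_name": "Central Park",
    "start_time": "2026-02-07T16:00:00Z",
    "max_participants": 22,
    "event_type": "Casual",
}
print(validate_event(event))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue at once or decide to retry generation.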
## License
This adapter follows the [Gemma License](https://ai.google.dev/gemma/terms). The base model is subject to Google's Gemma terms of use.
## Citation
If you use this model, please cite:
```bibtex
@misc{gemma-event-parser-2026,
author = {YOUR_NAME},
title = {Gemma 2B Event Parser - Sports Event Function Calling},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/YOUR_USERNAME/gemma-event-parser}
}
```
## Acknowledgments
- Base model: Google's Gemma 2 2B-IT (`google/gemma-2-2b-it`)
- Fine-tuning framework: Hugging Face PEFT
- Training compute: Google Colab
---
**Questions?** Open an issue or discussion on this model's page!