Gemma 2B Event Parser - Sports Event Function Calling

A fine-tuned LoRA adapter for Gemma 2 2B (google/gemma-2-2b-it) that converts natural language descriptions into structured JSON for creating sports events.

Model Description

This model takes casual text like "I want to play soccer this week Friday 4 PM @ Central Park" and converts it into a properly formatted CreateEventRequest JSON object for backend API consumption.

Base Model: google/gemma-2-2b-it
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Framework: Transformers + PEFT
Primary Use Case: Natural language to structured API requests for sports event creation

Usage

Installation

pip install transformers peft torch

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    device_map="auto",
    dtype=torch.float16
)

# Load fine-tuned adapter
model = PeftModel.from_pretrained(base_model, "YOUR_USERNAME/gemma-event-parser")
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/gemma-event-parser")

# Define function schema
function_schema = {
    "name": "create_sports_event",
    "description": "Create a new sports event from natural language description",
    "parameters": {
        "type": "object",
        "properties": {
            "sport": {"type": "string", "description": "Sport type (e.g., Soccer, Basketball, Tennis)"},
            "venue_name": {"type": "string", "description": "Venue name"},
            "start_time": {"type": "string", "description": "ISO 8601 format (e.g., 2026-02-07T16:00:00Z)"},
            "max_participants": {"type": "integer", "default": 2},
            "event_type": {
                "type": "string",
                "enum": ["Casual", "Light Training", "Looking to Improve", "Competitive Game"],
                "default": "Casual"
            }
        },
        "required": ["sport", "venue_name", "start_time"]
    }
}

# Parse natural language
def parse_event(user_query):
    prompt = f"""<start_of_turn>user
{user_query}

Available functions:
{json.dumps([function_schema], indent=2)}<end_of_turn>
<start_of_turn>model
"""
    
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.1,
        do_sample=True,
        top_p=0.95
    )
    
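    # Decode only the newly generated tokens, not the echoed prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    result = tokenizer.decode(new_tokens, skip_special_tokens=True)

    # Extract the JSON between the <function_call> tags
    start_tag, end_tag = "<function_call>", "</function_call>"
    start = result.find(start_tag)
    end = result.find(end_tag, start)
    if start == -1 or end == -1:
        raise ValueError(f"No function call found in model output: {result!r}")
    function_call = json.loads(result[start + len(start_tag):end].strip())

    return function_call["arguments"]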

# Example
query = "I want to play soccer this week Friday 4 PM @ Central Park"
event_json = parse_event(query)
print(json.dumps(event_json, indent=2))

Output:

{
  "sport": "Soccer",
  "venue_name": "Central Park",
  "start_time": "2026-02-07T16:00:00Z",
  "max_participants": 22,
  "event_type": "Casual"
}
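
The parsed arguments can omit the optional fields, in which case the schema defaults apply. The following is a minimal sketch of turning the arguments into a typed request object. The `CreateEventRequest` dataclass and the `to_request` helper are hypothetical: their field names mirror the function schema above, but the real backend type may differ.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CreateEventRequest:
    # Hypothetical class: field names mirror the function schema.
    sport: str
    venue_name: str
    start_time: datetime
    max_participants: int = 2    # schema default
    event_type: str = "Casual"   # schema default


def to_request(arguments: dict) -> CreateEventRequest:
    """Build a request from parsed arguments, falling back to schema defaults."""
    fields = dict(arguments)
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    fields["start_time"] = datetime.fromisoformat(
        fields["start_time"].replace("Z", "+00:00")
    )
    return CreateEventRequest(**fields)


request = to_request({
    "sport": "Tennis",
    "venue_name": "Ashburn Park",
    "start_time": "2026-02-12T10:00:00Z",
})
print(request.max_participants, request.event_type)  # prints: 2 Casual
```

Unknown keys raise a `TypeError` here, which surfaces malformed model output early instead of passing it to the backend.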

Examples

Input: "Basketball game tomorrow 6pm at Riverside Courts, competitive"
Output: {"sport": "Basketball", "venue_name": "Riverside Courts", "start_time": "2026-02-07T18:00:00Z", "max_participants": 10, "event_type": "Competitive Game"}

Input: "Tennis match Wednesday 10 AM Ashburn Park, looking to improve"
Output: {"sport": "Tennis", "venue_name": "Ashburn Park", "start_time": "2026-02-12T10:00:00Z", "max_participants": 2, "event_type": "Looking to Improve"}

Input: "Casual volleyball Saturday 2pm Beach Courts"
Output: {"sport": "Volleyball", "venue_name": "Beach Courts", "start_time": "2026-02-08T14:00:00Z", "max_participants": 12, "event_type": "Casual"}

Training Details

Training Data

Fine-tuned on synthetic examples covering:

  • Multiple sports (Soccer, Basketball, Tennis, Volleyball, Badminton, etc.)
  • Various time formats (relative dates, specific times)
  • All event types (Casual, Light Training, Looking to Improve, Competitive Game)
  • Different venue patterns

Training Size: ~10-20 high-quality examples (LoRA trains only a small set of adapter weights, so a narrow task like this can often be learned from few examples)

Training Hyperparameters

  • LoRA Rank (r): 16
  • LoRA Alpha: 32
  • Target Modules: q_proj, k_proj, v_proj, o_proj
  • Learning Rate: 2e-4
  • Epochs: 20
  • Batch Size: 2 (gradient accumulation steps: 4, for an effective batch size of 8)
  • Optimizer: AdamW
  • Scheduler: Cosine with warmup
  • Precision: FP16
  • Training Time: ~1-2 minutes on free Colab
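
To put the batch settings in perspective, here is a rough calculation of how many optimizer updates they imply. It assumes a single device and 20 training examples (the upper end of the stated range), and it rounds a partial accumulation window up to one update; training frameworks differ in how they treat that last window, so take the totals as approximate.

```python
import math

num_examples = 20                  # assumption: upper end of the ~10-20 examples
per_device_batch_size = 2
gradient_accumulation_steps = 4
epochs = 20

# Examples contributing to each optimizer update
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Forward/backward passes per epoch, then optimizer updates per epoch
micro_batches_per_epoch = math.ceil(num_examples / per_device_batch_size)
updates_per_epoch = math.ceil(micro_batches_per_epoch / gradient_accumulation_steps)
total_updates = updates_per_epoch * epochs

print(effective_batch_size, updates_per_epoch, total_updates)  # prints: 8 3 60
```

A run of this size is short, which is consistent with the training time of a minute or two reported above.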

Framework Versions

  • Transformers: 4.x
  • PEFT: 0.18.1
  • PyTorch: 2.x
  • Python: 3.10+

Limitations

  • Date Parsing: Handles relative dates ("Friday", "tomorrow") but assumes they refer to the current week
  • Time Zones: Defaults to UTC (Z suffix)
  • Sports Coverage: Best performance on common sports; may need examples for niche sports
  • Language: English only
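
An application can compensate for the date and time zone limitations in post-processing. The sketch below is one possible approach, not part of the model: `weekday_in_current_week` resolves a weekday name against a reference date (useful for cross-checking the weekday of a returned `start_time`), and `local_wall_time_to_utc` reinterprets the model's `Z`-suffixed timestamp as local wall-clock time before converting it to UTC. Both helper names are hypothetical, and the Monday-to-Sunday week is an assumption.

```python
from datetime import date, datetime, timedelta, timezone, tzinfo

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def weekday_in_current_week(name: str, today: date) -> date:
    """Date of the named weekday in the Monday-to-Sunday week containing `today`."""
    monday = today - timedelta(days=today.weekday())
    return monday + timedelta(days=WEEKDAYS.index(name.lower()))


def local_wall_time_to_utc(start_time: str, local_tz: tzinfo) -> str:
    """Treat a 'Z'-suffixed timestamp as local wall-clock time and convert to UTC."""
    naive = datetime.fromisoformat(start_time.rstrip("Z"))
    utc = naive.replace(tzinfo=local_tz).astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


today = date(2026, 2, 4)  # a Wednesday, chosen for illustration
friday = weekday_in_current_week("Friday", today)
print(friday.isoformat())  # prints: 2026-02-06

# A fixed UTC-5 offset keeps the example dependency-free; in practice a
# zoneinfo.ZoneInfo such as "America/New_York" would also handle DST.
utc_minus_5 = timezone(timedelta(hours=-5))
print(local_wall_time_to_utc("2026-02-06T16:00:00Z", utc_minus_5))
# prints: 2026-02-06T21:00:00Z
```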

Intended Use

✅ Good for:

  • Converting casual user input to structured API requests
  • Sports event management applications
  • Voice-to-API integrations
  • Chatbot backends for sports booking

โŒ Not suitable for:

  • Mission-critical systems without validation
  • Non-English languages
  • Complex multi-event scheduling
  • Historical date parsing
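
Since the model's output should be validated before it reaches a backend, the following is a minimal sketch of such a check using only the standard library. The `validate_arguments` helper is hypothetical; the required fields and allowed event types are copied from the function schema in Quick Start, while the rule that `max_participants` must be at least 1 is an assumption.

```python
from datetime import datetime

REQUIRED = ("sport", "venue_name", "start_time")
EVENT_TYPES = ("Casual", "Light Training", "Looking to Improve", "Competitive Game")


def validate_arguments(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look usable."""
    errors = []
    for key in REQUIRED:
        value = args.get(key)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"missing or empty required field: {key}")
    start_time = args.get("start_time")
    if isinstance(start_time, str) and start_time.strip():
        try:
            # Normalize "Z" so this also works before Python 3.11
            datetime.fromisoformat(start_time.replace("Z", "+00:00"))
        except ValueError:
            errors.append("start_time is not a valid ISO 8601 timestamp")
    participants = args.get("max_participants", 2)
    # bool is a subclass of int, so exclude it explicitly
    if (isinstance(participants, bool) or not isinstance(participants, int)
            or participants < 1):  # assumption: at least one participant
        errors.append("max_participants must be a positive integer")
    if args.get("event_type", "Casual") not in EVENT_TYPES:
        errors.append("event_type is not one of the allowed values")
    return errors


good = {"sport": "Soccer", "venue_name": "Central Park",
        "start_time": "2026-02-07T16:00:00Z", "max_participants": 22}
bad = {"sport": "Soccer", "venue_name": "Central Park",
       "start_time": "next Friday", "event_type": "Friendly"}
print(validate_arguments(good))  # prints: []
print(validate_arguments(bad))
```

Returning a list of messages rather than raising on the first problem lets a chatbot report every issue to the user in one reply.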

License

This adapter follows the Gemma License. The base model is subject to Google's Gemma terms of use.

Citation

If you use this model, please cite:

@misc{gemma-event-parser-2026,
  author = {YOUR_NAME},
  title = {Gemma 2B Event Parser - Sports Event Function Calling},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/YOUR_USERNAME/gemma-event-parser}
}

Acknowledgments

  • Base model: Google's Gemma 2 2B IT (google/gemma-2-2b-it)
  • Fine-tuning framework: Hugging Face PEFT
  • Training compute: Google Colab

Questions? Open an issue or discussion on this model's page!
