---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- function-calling
- xkcd
- gemma
- unsloth
- peft
- lora
base_model: google/functiongemma-270m-it
datasets:
- olivierdehaene/xkcd
pipeline_tag: text-generation
---
# XKCD FunctionGemma
A fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for XKCD comic search function calling.
## Model Description
This model was fine-tuned to generate structured function calls for searching XKCD comics. Given a natural language query about comics, it outputs a properly formatted tool call that can be parsed and executed.
- **Base model:** `google/functiongemma-270m-it`
- **Fine-tuning method:** LoRA via Unsloth (1.4% trainable parameters)
- **Training data:** 2,630 examples from [olivierdehaene/xkcd](https://huggingface.co/datasets/olivierdehaene/xkcd)
- **Training time:** ~8 minutes on a T4 GPU
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import json
import re
# Load model
model = AutoModelForCausalLM.from_pretrained(
"gnumanth/xkcd-functiongemma",
device_map="auto",
torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("gnumanth/xkcd-functiongemma")
# Define tools
TOOLS = [{
"type": "function",
"function": {
"name": "search_xkcd",
"description": "Search XKCD comics by topic",
"parameters": {
"type": "object",
"properties": {"query": {"type": "string"}},
"required": ["query"]
}
}
}]
# Generate function call
messages = [{"role": "user", "content": "Find xkcd about programming"}]
text = tokenizer.apply_chat_template(messages, tools=TOOLS, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
# Output: <start_function_call>call:search_xkcd{"query": "programming"}<end_function_call>
```
## Parsing Function Calls
```python
import json
import re

def parse_function_call(output: str) -> dict | None:
"""Extract function name and arguments from model output."""
match = re.search(r'call:(\w+)\s*\{(.+)\}', output, re.DOTALL)
if not match:
return None
func_name = match.group(1)
args_raw = match.group(2).strip()
# Handle double braces from training format
args_raw = re.sub(r'^\s*\{', '', args_raw)
if args_raw.endswith('}'):
args_raw = args_raw[:-1]
try:
return {"function": func_name, "arguments": json.loads('{' + args_raw + '}')}
except json.JSONDecodeError:
return None
# Usage
call = parse_function_call(response)
# {'function': 'search_xkcd', 'arguments': {'query': 'programming'}}
```
## Training Details
- **Epochs:** 1
- **Batch size:** 2 (with 4 gradient accumulation steps)
- **Learning rate:** 2e-4
- **LoRA rank:** 16
- **Target modules:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Final loss:** 0.281
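For reference, the LoRA setup above corresponds roughly to the following `peft` configuration. This is a sketch, not the exact training script: `lora_alpha` and `lora_dropout` are assumptions (they are not stated in this card), and Unsloth's `get_peft_model` wrapper takes slightly different arguments.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,  # LoRA rank, as listed above
    lora_alpha=16,       # assumption: not stated in this card
    lora_dropout=0.0,    # assumption: not stated in this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```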
## Limitations
- Only trained for XKCD search queries
- May produce double braces in output (handled by parser above)
- Small model (270M params) - limited reasoning capability
## License
Apache 2.0 (same as base model)
## Links
- [Training notebook](https://github.com/hemanth/notebooks/blob/main/notebooks/functiongemma_xkcd_finetune.ipynb)
- [Base model](https://huggingface.co/google/functiongemma-270m-it)
- [XKCD dataset](https://huggingface.co/datasets/olivierdehaene/xkcd)