---
license: apache-2.0
base_model: google/functiongemma-270m-it
tags:
- function-calling
- lora
- peft
- gemma
- functiongemma
- fine-tuned
datasets:
- custom
---

# FunctionGemma 270M - Fine-tuned for Python Function Calling

This is a LoRA (Low-Rank Adaptation) fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for Python function calling tasks.

## Model Details

- **Base Model**: [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Test Accuracy**: 100% on the held-out test set
- **Model Size**: 270M parameters (base) + LoRA adapters

## Supported Functions

The model is fine-tuned to call these 5 Python functions:

1. **is_prime(n)** - Check if a number is prime
2. **is_factorial(n)** - Compute factorial (n!)
3. **fibonacci(n)** - Compute the nth Fibonacci number
4. **gcd(a, b)** - Greatest common divisor
5. **lcm(a, b)** - Least common multiple
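
For reference, the five functions behave like the plain-Python implementations below. This is an illustrative sketch, not the code used during training; in particular, the Fibonacci indexing convention (`fibonacci(0) == 0`) is an assumption.

```python
from math import factorial, gcd as _gcd

def is_prime(n: int) -> bool:
    """Return True if n is a prime number."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def is_factorial(n: int) -> int:
    """Compute n! (name kept to match the function list above)."""
    return factorial(n)

def fibonacci(n: int) -> int:
    """Compute the nth Fibonacci number (assumed 0-indexed)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def gcd(a: int, b: int) -> int:
    """Greatest common divisor of a and b."""
    return _gcd(a, b)

def lcm(a: int, b: int) -> int:
    """Least common multiple of a and b."""
    return a * b // _gcd(a, b)
```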

## Usage

### Load with LoRA Adapter

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from huggingface_hub import login

# Authenticate (required for the gated base model)
login()

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    dtype="auto",
    device_map="auto",
    attn_implementation="eager",
    token=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "sandeeppanem/functiongemma-270m-lora")
model.eval()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("sandeeppanem/functiongemma-270m-lora")
```

### Merge and Use (Recommended for Inference)

```python
# Merge adapter with base model for faster inference
merged_model = model.merge_and_unload()

# Save merged model
merged_model.save_pretrained("./functiongemma-270m-merged")
tokenizer.save_pretrained("./functiongemma-270m-merged")

# Load merged model directly (no adapter needed)
model = AutoModelForCausalLM.from_pretrained("./functiongemma-270m-merged")
```

### Example Inference

```python
# Define function schemas
FUNCTION_SCHEMAS = [
    {
        "name": "gcd",
        "description": "Compute the greatest common divisor of two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "integer", "description": "First number"},
                "b": {"type": "integer", "description": "Second number"}
            },
            "required": ["a", "b"]
        }
    },
    # ... other function schemas
]

# Convert to tools format
tools = []
for schema in FUNCTION_SCHEMAS:
    tools.append({
        "type": "function",
        "function": {
            "name": schema["name"],
            "description": schema["description"],
            "parameters": schema["parameters"],
            "return": {"type": "string"}
        }
    })

# Create messages
messages = [
    {
        "role": "developer",
        "content": "You are a model that can do function calling with the following functions",
        "tool_calls": None
    },
    {
        "role": "user",
        "content": "What is the GCD of 48 and 18?",
        "tool_calls": None
    }
]

# Apply chat template
inputs = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
)

# Generate
outputs = model.generate(**inputs.to(model.device), max_new_tokens=128)
response = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=False)
print(response)
```
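
Once a tool call has been extracted from the generated text, it can be routed to a local Python implementation. The exact output syntax depends on FunctionGemma's tool-call format, so the snippet below is a hypothetical sketch that assumes the call has already been isolated as a JSON object with `name` and `arguments` keys:

```python
import json
from math import gcd as math_gcd

# Local implementations keyed by the function names in FUNCTION_SCHEMAS
DISPATCH = {
    "gcd": lambda a, b: math_gcd(a, b),
    "lcm": lambda a, b: a * b // math_gcd(a, b),
}

def execute_tool_call(call_json: str):
    """Execute a parsed tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    func = DISPATCH[call["name"]]
    return func(**call["arguments"])

# Hypothetical, already-extracted model output
result = execute_tool_call('{"name": "gcd", "arguments": {"a": 48, "b": 18}}')
print(result)  # 6
```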

## Training Details

- **Training Data**: Custom dataset with 183 training examples and 21 test examples
- **Training Split**: 90% train, 10% test
- **Epochs**: 6
- **Learning Rate**: 2e-5
- **Batch Size**: 4 (per device)
- **Gradient Accumulation Steps**: 1
- **LoRA Config**:
  - Rank (r): 8
  - Alpha: 16
  - Target modules: q_proj, v_proj, k_proj, o_proj
  - Dropout: 0.05
  - Bias: none
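
With the PEFT library, the hyperparameters above correspond roughly to the following `LoraConfig`. This is a sketch for convenience; the original training script is not included in this repository:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                  # LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```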

## Evaluation

- **Test Accuracy**: 100.0%
- **Baseline Accuracy** (base model without fine-tuning): 81.0%
- **Improvement**: +19.0 percentage points
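
The card does not spell out the metric; a natural choice for this task is exact-match accuracy over the emitted function name and arguments, along these lines (a sketch, assuming predictions and references are comparable `(name, arguments)` pairs):

```python
def exact_match_accuracy(predictions, references):
    """Fraction of examples where the predicted call exactly matches the reference."""
    assert len(predictions) == len(references)
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

preds = [("gcd", {"a": 48, "b": 18}), ("lcm", {"a": 4, "b": 6}), ("is_prime", {"n": 10})]
refs  = [("gcd", {"a": 48, "b": 18}), ("lcm", {"a": 4, "b": 6}), ("is_prime", {"n": 7})]
print(exact_match_accuracy(preds, refs))  # 2 of 3 match
```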

## Limitations

- The model is fine-tuned specifically for the 5 mathematical functions listed above
- It may not generalize well to other function calling tasks without additional fine-tuning
- Requires access to the gated base model `google/functiongemma-270m-it`

## Citation

If you use this model, please cite:

```bibtex
@misc{functiongemma-270m-lora,
  title={FunctionGemma 270M - Fine-tuned for Python Function Calling},
  author={sandeeppanem},
  year={2025},
  url={https://huggingface.co/sandeeppanem/functiongemma-270m-lora}
}
```

## License

This model is licensed under Apache 2.0, the same license as the base model.