---
license: apache-2.0
base_model: google/functiongemma-270m-it
tags:
- function-calling
- lora
- peft
- gemma
- functiongemma
- fine-tuned
datasets:
- custom
---

# FunctionGemma 270M - Fine-tuned for Python Function Calling

This is a LoRA (Low-Rank Adaptation) fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for Python function calling tasks.

## Model Details

- **Base Model**: [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Test Accuracy**: 100% on the held-out test set
- **Model Size**: 270M parameters (base) + LoRA adapters

## Supported Functions

The model is fine-tuned to call these 5 Python functions:

1. **is_prime(n)** - Check if a number is prime
2. **is_factorial(n)** - Compute factorial (n!)
3. **fibonacci(n)** - Compute the nth Fibonacci number
4. **gcd(a, b)** - Greatest common divisor
5. **lcm(a, b)** - Least common multiple
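
For reference, the five functions behave like the plain-Python implementations below. This is an illustrative sketch, not the code used during training; in particular, the Fibonacci indexing convention (`fibonacci(0) == 0`) is an assumption.

```python
from math import factorial, gcd as _gcd

def is_prime(n: int) -> bool:
    """Return True if n is a prime number."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def is_factorial(n: int) -> int:
    """Compute n! (name kept to match the function list above)."""
    return factorial(n)

def fibonacci(n: int) -> int:
    """Compute the nth Fibonacci number (assumed 0-indexed)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def gcd(a: int, b: int) -> int:
    """Greatest common divisor of a and b."""
    return _gcd(a, b)

def lcm(a: int, b: int) -> int:
    """Least common multiple of a and b."""
    return a * b // _gcd(a, b)
```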

## Usage

### Load with LoRA Adapter

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from huggingface_hub import login

# Authenticate (required for the gated base model)
login()

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    dtype="auto",
    device_map="auto",
    attn_implementation="eager",
    token=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "sandeeppanem/functiongemma-270m-lora")
model.eval()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("sandeeppanem/functiongemma-270m-lora")
```

### Merge and Use (Recommended for Inference)

```python
# Merge adapter with base model for faster inference
merged_model = model.merge_and_unload()

# Save merged model
merged_model.save_pretrained("./functiongemma-270m-merged")
tokenizer.save_pretrained("./functiongemma-270m-merged")

# Load merged model directly (no adapter needed)
model = AutoModelForCausalLM.from_pretrained("./functiongemma-270m-merged")
```

### Example Inference

```python
# Define function schemas
FUNCTION_SCHEMAS = [
    {
        "name": "gcd",
        "description": "Compute the greatest common divisor of two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "integer", "description": "First number"},
                "b": {"type": "integer", "description": "Second number"}
            },
            "required": ["a", "b"]
        }
    },
    # ... other function schemas
]

# Convert to tools format
tools = []
for schema in FUNCTION_SCHEMAS:
    tools.append({
        "type": "function",
        "function": {
            "name": schema["name"],
            "description": schema["description"],
            "parameters": schema["parameters"],
            "return": {"type": "string"}
        }
    })

# Create messages
messages = [
    {
        "role": "developer",
        "content": "You are a model that can do function calling with the following functions",
        "tool_calls": None
    },
    {
        "role": "user",
        "content": "What is the GCD of 48 and 18?",
        "tool_calls": None
    }
]

# Apply chat template
inputs = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
)

# Generate
outputs = model.generate(**inputs.to(model.device), max_new_tokens=128)
response = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=False)
print(response)
```
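
Once a tool call has been extracted from the generated text, it can be routed to a local Python implementation. The exact output syntax depends on FunctionGemma's tool-call format, so the snippet below is a hypothetical sketch that assumes the call has already been isolated as a JSON object with `name` and `arguments` keys:

```python
import json
from math import gcd as math_gcd

# Local implementations keyed by the function names in FUNCTION_SCHEMAS
DISPATCH = {
    "gcd": lambda a, b: math_gcd(a, b),
    "lcm": lambda a, b: a * b // math_gcd(a, b),
}

def execute_tool_call(call_json: str):
    """Execute a parsed tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    func = DISPATCH[call["name"]]
    return func(**call["arguments"])

# Hypothetical, already-extracted model output
result = execute_tool_call('{"name": "gcd", "arguments": {"a": 48, "b": 18}}')
print(result)  # 6
```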

## Training Details

- **Training Data**: Custom dataset with 183 training examples and 21 test examples
- **Training Split**: 90% train, 10% test
- **Epochs**: 6
- **Learning Rate**: 2e-5
- **Batch Size**: 4 (per device)
- **Gradient Accumulation Steps**: 1
- **LoRA Config**:
  - Rank (r): 8
  - Alpha: 16
  - Target modules: q_proj, v_proj, k_proj, o_proj
  - Dropout: 0.05
  - Bias: none
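
With the PEFT library, the hyperparameters above correspond roughly to the following `LoraConfig`. This is a sketch for convenience; the original training script is not included in this repository:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                  # LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```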

## Evaluation

- **Test Accuracy**: 100.0%
- **Baseline Accuracy** (base model without fine-tuning): 81.0%
- **Improvement**: +19.0 percentage points
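
The card does not spell out the metric; a natural choice for this task is exact-match accuracy over the emitted function name and arguments, along these lines (a sketch, assuming predictions and references are comparable `(name, arguments)` pairs):

```python
def exact_match_accuracy(predictions, references):
    """Fraction of examples where the predicted call exactly matches the reference."""
    assert len(predictions) == len(references)
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

preds = [("gcd", {"a": 48, "b": 18}), ("lcm", {"a": 4, "b": 6}), ("is_prime", {"n": 10})]
refs  = [("gcd", {"a": 48, "b": 18}), ("lcm", {"a": 4, "b": 6}), ("is_prime", {"n": 7})]
print(exact_match_accuracy(preds, refs))  # 2 of 3 match
```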

## Limitations

- The model is fine-tuned specifically for the 5 mathematical functions listed above
- It may not generalize well to other function calling tasks without additional fine-tuning
- Requires access to the gated base model `google/functiongemma-270m-it`

## Citation

If you use this model, please cite:

```bibtex
@misc{functiongemma-270m-lora,
  title={FunctionGemma 270M - Fine-tuned for Python Function Calling},
  author={sandeeppanem},
  year={2025},
  url={https://huggingface.co/sandeeppanem/functiongemma-270m-lora}
}
```

## License

This model is licensed under Apache 2.0, the same license as the base model.