---
language:
- en
license: apache-2.0
tags:
- function-calling
- music
- gemma
- peft
- lora
base_model: google/functiongemma-270m-it
---

# FunctionGemma Music Assistant (2-Function)

A fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for music playback control.
It supports two core music functions and passed all 5 of its test cases (100%).

## Model Description

This is a **LoRA-adapted** FunctionGemma model trained specifically for music function calling.
The model generates function calls in the FunctionGemma format for controlling music playback.

**Training Details:**
- Base Model: google/functiongemma-270m-it (270M parameters)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Parameters Trained: 3.8M (1.40% of total)
- Training Examples: 60 (30 per function)
- Training Time: ~1 minute
- Accuracy: 100% (5/5 test cases)

## Supported Functions

### 1. play_song
Play a specific song by name or artist.

**Parameters:**
- `song_name` (required): Name of the song to play
- `artist` (optional): Artist name
- `album` (optional): Album name

**Examples:**
- "Play Bohemian Rhapsody"
- "Play Imagine by John Lennon"
- "I want to hear Wonderwall"

### 2. playback_control
Control music playback (pause, resume, skip).

**Parameters:**
- `action` (required): One of: play, pause, skip, next, previous, stop, resume

**Examples:**
- "Pause"
- "Resume"
- "Skip to next song"
- "Stop the music"
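
Before acting on a generated call, it can help to check it against the two schemas above. The following is a minimal sketch, not part of the released model: the `SCHEMAS` table and `validate_call` helper are hypothetical names that only mirror the parameters documented in this section.

```python
# Hypothetical schema table mirroring the two functions documented above.
SCHEMAS = {
    "play_song": {
        "required": {"song_name"},
        "optional": {"artist", "album"},
        "enums": {},
    },
    "playback_control": {
        "required": {"action"},
        "optional": set(),
        "enums": {
            "action": {"play", "pause", "skip", "next", "previous", "stop", "resume"}
        },
    },
}


def validate_call(name, arguments):
    """Return a list of problems with a function call; an empty list means valid."""
    schema = SCHEMAS.get(name)
    if schema is None:
        return [f"unknown function: {name}"]

    problems = []
    given = set(arguments)
    allowed = schema["required"] | schema["optional"]
    # Required parameters that were not supplied.
    for key in sorted(schema["required"] - given):
        problems.append(f"missing required parameter: {key}")
    # Parameters the schema does not know about.
    for key in sorted(given - allowed):
        problems.append(f"unexpected parameter: {key}")
    # Enum-constrained parameters with a value outside the allowed set.
    for key, choices in schema["enums"].items():
        if key in arguments and arguments[key] not in choices:
            problems.append(f"invalid value for {key}: {arguments[key]!r}")
    return problems
```

For example, `validate_call("playback_control", {"action": "pause"})` returns an empty list, while an action outside the enum produces a single problem string.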

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    torch_dtype=torch.float16,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "your-username/music-2func")

# Define functions
FUNCTIONS = [
    {
        "type": "function",
        "function": {
            "name": "play_song",
            "description": "Play a specific song by name or artist",
            "parameters": {
                "type": "object",
                "properties": {
                    "song_name": {"type": "string", "description": "Name of the song to play"},
                    "artist": {"type": "string", "description": "Artist name (optional)"},
                    "album": {"type": "string", "description": "Album name (optional)"}
                },
                "required": ["song_name"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "playback_control",
            "description": "Control music playback",
            "parameters": {
                "type": "object",
                "properties": {
                    "action": {
                        "type": "string",
                        "enum": ["play", "pause", "skip", "next", "previous", "stop", "resume"]
                    }
                },
                "required": ["action"]
            }
        }
    }
]

# Generate function call
user_input = "Play Bohemian Rhapsody"

messages = [{"role": "user", "content": user_input}]

prompt = tokenizer.apply_chat_template(
    messages,
    tools=FUNCTIONS,
    add_generation_prompt=True,
    tokenize=False
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        temperature=0.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, keeping the function-call markers
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
```

**Expected Output:**
```
<start_function_call>call:play_song{song_name:<escape>Bohemian Rhapsody<escape>}<end_function_call>
```
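
To act on that string, the function name and arguments need to be extracted from it. Here is a minimal sketch of one way to do so with regular expressions. The helper name `parse_function_call` is hypothetical, and the sketch assumes every argument is a string wrapped in `<escape>` markers as in the output above, with multiple arguments separated by commas.

```python
import re

# Matches one complete call: name, then the argument body between braces.
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.DOTALL
)
# Matches one key:<escape>value<escape> pair inside the argument body.
ARG_RE = re.compile(r"(\w+):<escape>(.*?)<escape>", re.DOTALL)


def parse_function_call(text):
    """Return (name, arguments) for the first call in text, or None if absent."""
    match = CALL_RE.search(text)
    if match is None:
        return None
    name, body = match.groups()
    arguments = dict(ARG_RE.findall(body))
    return name, arguments
```

Applied to the expected output above, this returns `("play_song", {"song_name": "Bohemian Rhapsody"})`. Non-string argument types are out of scope for this sketch.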

## Test Results

| Test Input | Expected Function | Result |
|-----------|------------------|--------|
| "Play Bohemian Rhapsody" | `play_song` | ✅ Pass |
| "Pause the music" | `playback_control` | ✅ Pass |
| "Skip to next song" | `playback_control` | ✅ Pass |
| "Play Wonderwall" | `play_song` | ✅ Pass |
| "Resume" | `playback_control` | ✅ Pass |

**Success Rate: 100% (5/5 tests)**
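
A success rate like this can be recomputed with a small harness. The sketch below is illustrative only: `success_rate` and `keyword_stub` are hypothetical names, and the stub is a keyword rule standing in for the model so that the example runs without downloading any weights. In practice, `predict` would wrap the generation code from the Usage section.

```python
# The five test inputs and expected function names from the table above.
TEST_CASES = [
    ("Play Bohemian Rhapsody", "play_song"),
    ("Pause the music", "playback_control"),
    ("Skip to next song", "playback_control"),
    ("Play Wonderwall", "play_song"),
    ("Resume", "playback_control"),
]


def success_rate(predict, cases=TEST_CASES):
    """Return (passed, total), where predict maps user text to a function name."""
    passed = sum(1 for text, expected in cases if predict(text) == expected)
    return passed, len(cases)


def keyword_stub(text):
    """Stand-in predictor for demonstration only; it is not the fine-tuned model."""
    return "play_song" if text.lower().startswith("play ") else "playback_control"


passed, total = success_rate(keyword_stub)
print(f"Success Rate: {100 * passed / total:.0f}% ({passed}/{total} tests)")
```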

## Training Methodology

This model was trained using a **gradual scaling approach** to avoid overloading the small base model with too many functions at once:

1. Start with 2 functions (`play_song`, `playback_control`).
2. Use 30 examples per function, covering diverse phrasings.
3. Use the correct format: pass function-call arguments to `apply_chat_template` as a dict, not as a `json.dumps()` string.

### Key Learnings

1. **Critical Bug Fixed**: Arguments must be passed as a dict, not as `json.dumps(arguments)`
2. **Cognitive Overload**: Training with 18 functions failed (0% accuracy), but 2 functions achieved 100%
3. **Gradual Scaling**: Recommended path is 2→4→8→18 functions
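
The first learning can be made concrete with a small helper that builds one training conversation. This is a sketch under assumptions: `build_training_messages` is a hypothetical name, and the `tool_calls` layout follows the message structure commonly used with Transformers chat templates, which may differ from the exact structure used to train this model.

```python
import json


def build_training_messages(user_text, name, arguments):
    """Build one chat example whose tool-call arguments stay a dict."""
    if isinstance(arguments, str):
        # A JSON string here is the bug described above; decode it back to a dict.
        arguments = json.loads(arguments)
    return [
        {"role": "user", "content": user_text},
        {
            "role": "assistant",
            "tool_calls": [
                {"type": "function", "function": {"name": name, "arguments": arguments}}
            ],
        },
    ]


messages = build_training_messages("Pause", "playback_control", {"action": "pause"})
```

The resulting `messages` list can then be passed to `tokenizer.apply_chat_template` together with the tool definitions, as in the Usage section.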

## Limitations

- Only supports 2 functions (play_song and playback_control)
- Trained on English language only
- Best performance with clear, direct commands
- **Not compatible with Ollama** (Ollama doesn't support FunctionGemma's dynamic tool schema)

## Future Work

- Scale to 4 functions (add search_music, create_playlist)
- Scale to 8 functions (add volume control, queue management)
- Eventually scale to full 18-function music system

## Citation

```bibtex
@misc{music-2func-2026,
  title={FunctionGemma Music Assistant (2-Function)},
  author={Your Name},
  year={2026},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/your-username/music-2func}}
}
```

## License

Apache 2.0 (inherited from base model)

## Base Model

This model is based on [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it).