---
license: apache-2.0
base_model: google/functiongemma-270m-it
tags:
- function-calling
- tool-use
- dispatcher
- delia
- gemma
language:
- en
pipeline_tag: text-generation
---

# FunctionGemma 270M - Delia Dispatcher

A fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for **Delia LLM orchestration**.

This tiny model (270M parameters) acts as a fast dispatcher, routing user requests to the appropriate backend:

- `call_coder` - Code generation tasks
- `call_reviewer` - Code review and analysis
- `call_planner` - Architecture and planning (also handles ambiguous requests)
- `call_executor` - Running commands and scripts

## Key Features

- **Minimalist schema**: A single `reasoning` parameter per tool (see the sketch below)
- **Thought tokens**: A brief chain-of-thought scratchpad before each tool call
- **EOS hardening**: Explicit stop tokens prevent infinite generation loops
- **Negative samples**: 13% of training examples are ambiguous and route to the planner for graceful handling
- **GBNF grammar**: Constrained decoding guarantees well-formed tool calls
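
The minimalist schema means each tool exposes exactly one `reasoning` string argument; the full definitions live in `dispatcher_tools.json`. As a rough sketch only, one tool entry might look like the following, written here as a Python literal; the exact field layout of the shipped file may differ:

```python
# Hypothetical sketch of one entry in dispatcher_tools.json, shown as a Python dict.
# Only the tool names and the single `reasoning` parameter come from this card;
# the surrounding structure is an assumption.
call_coder_tool = {
    "name": "call_coder",
    "description": "Route the request to the code generation backend.",
    "parameters": {
        "type": "object",
        "properties": {
            "reasoning": {
                "type": "string",
                "description": "Brief justification for choosing this backend.",
            },
        },
        "required": ["reasoning"],
    },
}
```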

## Usage

### With llama.cpp (recommended for speed)

```bash
# Download the GGUF model
wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/functiongemma-270m-delia-dispatcher-f16.gguf

# Download the grammar
wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/dispatcher.gbnf

# Run with the grammar constraint
./llama-cli -m functiongemma-270m-delia-dispatcher-f16.gguf \
  --grammar-file dispatcher.gbnf \
  -p "<start_of_turn>user
Write a fibonacci function<end_of_turn>
<start_of_turn>model"
```

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("devopsforflops/functiongemma-270m-delia-dispatcher")
tokenizer = AutoTokenizer.from_pretrained("devopsforflops/functiongemma-270m-delia-dispatcher")

prompt = """<start_of_turn>user
Review this code for bugs<end_of_turn>
<start_of_turn>model"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```
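
The EOS hardening mentioned above relies on the model emitting an explicit stop token. When generating with Transformers, it can help to also pass that stop token to `generate`; a minimal sketch, assuming the tokenizer exposes Gemma's `<end_of_turn>` token used in the prompt format:

```python
# Stop generation at <end_of_turn> (assumes this special token exists in the
# tokenizer, as suggested by the Gemma chat template in the prompt above).
end_of_turn_id = tokenizer.convert_tokens_to_ids("<end_of_turn>")

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    eos_token_id=end_of_turn_id,  # halt as soon as the model closes its turn
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
```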

## Output Format

```
<start_of_turn>user
{request}<end_of_turn>
<start_of_turn>model
thought
{brief reasoning}
<tool_call>{"name": "call_X", "arguments": {"reasoning": "..."}}</tool_call><end_of_turn>
```
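
Downstream, the orchestrator only needs the tool name and its `reasoning` argument. A minimal parsing sketch (the `parse_dispatch` helper and its regular expression are illustrative, not part of the released code):

```python
import json
import re


def parse_dispatch(model_output: str):
    """Extract the tool name and reasoning from a dispatcher completion.

    Expects the format shown above:
    <tool_call>{"name": "call_X", "arguments": {"reasoning": "..."}}</tool_call>
    """
    match = re.search(r"<tool_call>(\{.*?\})</tool_call>", model_output, re.DOTALL)
    if match is None:
        return None  # no tool call emitted; a caller could fall back to call_planner
    call = json.loads(match.group(1))
    return call["name"], call["arguments"].get("reasoning", "")


# Example:
# parse_dispatch('thought\nneeds code\n<tool_call>{"name": "call_coder", '
#                '"arguments": {"reasoning": "code task"}}</tool_call>')
# -> ("call_coder", "code task")
```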

## Training

Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) using LoRA:

- **Epochs**: 3
- **LoRA rank**: 32
- **Training examples**: 92 (balanced across the 4 tools, plus 13% ambiguous)
- **Final loss**: 0.46
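
For orientation, a LoRA setup along these lines with Unsloth might look roughly like the sketch below; only the base model, the rank of 32, and the 3 epochs come from this card, and everything else (sequence length, alpha, target modules) is an assumed placeholder:

```python
from unsloth import FastLanguageModel

# Load the base model (sequence length is an assumed placeholder).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/functiongemma-270m-it",
    max_seq_length=2048,
)

# Attach LoRA adapters at rank 32; the target modules below are a common
# default for Gemma-style models, not a confirmed detail of this run.
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Training then proceeds for 3 epochs over train.jsonl,
# e.g. with trl's SFTTrainer.
```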

## Files

| File | Description |
|------|-------------|
| `functiongemma-270m-delia-dispatcher-f16.gguf` | GGUF model (F16, 518 MB) |
| `model.safetensors` | Transformers model weights |
| `dispatcher.gbnf` | GBNF grammar for constrained decoding |
| `dispatcher_tools.json` | Tool schema (4 tools) |
| `train.jsonl` | Training data |

## License

Apache 2.0 (same as the base model).

## Part of Delia

This model is designed for use with [Delia](https://github.com/zbrdc/delia), an LLM orchestration system that routes requests to optimal backends.