---
license: apache-2.0
base_model: google/functiongemma-270m-it
tags:
- function-calling
- tool-use
- dispatcher
- delia
- gemma
language:
- en
pipeline_tag: text-generation
---

# FunctionGemma 270M - Delia Dispatcher

A fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for **Delia LLM orchestration**.

This tiny model (270M params) acts as a fast dispatcher, routing user requests to the appropriate backend:

- `call_coder` - Code generation tasks
- `call_reviewer` - Code review and analysis
- `call_planner` - Architecture and planning (also handles ambiguous requests)
- `call_executor` - Running commands and scripts

## Key Features

- **Minimalist schema**: Single `reasoning` parameter per tool
- **Thought tokens**: Brief CoT scratchpad before tool calls
- **EOS hardening**: Explicit stop tokens prevent infinite loops
- **Negative samples**: 13% ambiguous examples → planner (graceful handling)
- **GBNF grammar**: Constrained decoding for 100% valid output

## Usage

### With llama.cpp (recommended for speed)

```bash
# Download the GGUF
wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/functiongemma-270m-delia-dispatcher-f16.gguf

# Download the grammar
wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/dispatcher.gbnf

# Run with grammar constraint
./llama-cli -m functiongemma-270m-delia-dispatcher-f16.gguf \
  --grammar-file dispatcher.gbnf \
  -p "<start_of_turn>user Write a fibonacci function<end_of_turn> <start_of_turn>model"
```

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("devopsforflops/functiongemma-270m-delia-dispatcher")
tokenizer = AutoTokenizer.from_pretrained("devopsforflops/functiongemma-270m-delia-dispatcher")

prompt = """<start_of_turn>user
Review this code for bugs<end_of_turn>
<start_of_turn>model
"""
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```

## Output Format

```
<start_of_turn>user
{request}<end_of_turn>
<start_of_turn>model
thought
{brief reasoning}
{"name": "call_X", "arguments": {"reasoning": "..."}}
```

## Training

Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) using LoRA:

- **Epochs**: 3
- **LoRA rank**: 32
- **Training examples**: 92 (balanced across the 4 tools + 13% ambiguous)
- **Final loss**: 0.46

## Files

| File | Description |
|------|-------------|
| `functiongemma-270m-delia-dispatcher-f16.gguf` | GGUF model (F16, 518MB) |
| `model.safetensors` | Transformers model |
| `dispatcher.gbnf` | GBNF grammar for constrained decoding |
| `dispatcher_tools.json` | Tool schema (4 tools) |
| `train.jsonl` | Training data |

## License

Apache 2.0 (same as the base model)

## Part of Delia

This model is designed for use with [Delia](https://github.com/zbrdc/delia), an LLM orchestration system that routes requests to the best-suited backend.
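Within that pipeline, Delia consumes the dispatcher's tool call and hands the request to the matching backend. The sketch below shows roughly what that routing step looks like, assuming the output format documented above (a thought line followed by a single JSON tool call); the `route` helper and the handler placeholders are illustrative, not Delia's actual API.

```python
import json

# Illustrative placeholders -- in Delia these map to real backends.
HANDLERS = {
    "call_coder": lambda reasoning: f"coder backend <- {reasoning}",
    "call_reviewer": lambda reasoning: f"reviewer backend <- {reasoning}",
    "call_planner": lambda reasoning: f"planner backend <- {reasoning}",
    "call_executor": lambda reasoning: f"executor backend <- {reasoning}",
}

def route(model_output: str) -> str:
    """Pull the tool-call JSON out of the dispatcher output and dispatch it."""
    # Assumes the JSON object is the last non-empty line, as in the format above.
    json_line = [ln for ln in model_output.strip().splitlines() if ln.strip()][-1]
    call = json.loads(json_line)
    # Unknown calls fall back to the planner, mirroring how ambiguous requests are handled.
    handler = HANDLERS.get(call["name"], HANDLERS["call_planner"])
    return handler(call["arguments"]["reasoning"])

print(route('thought\nuser wants code\n{"name": "call_coder", "arguments": {"reasoning": "write a fibonacci function"}}'))
```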
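To produce that output from Python with the same grammar constraint as the llama.cpp CLI example, [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) can load the GGUF together with `dispatcher.gbnf`. This is a minimal sketch, not an official Delia integration; it assumes llama-cpp-python is installed and the two files from the Usage section are in the working directory.

```python
# pip install llama-cpp-python   (assumed dependency; not shipped with this repo)
from llama_cpp import Llama, LlamaGrammar

llm = Llama(model_path="functiongemma-270m-delia-dispatcher-f16.gguf", n_ctx=2048)
grammar = LlamaGrammar.from_file("dispatcher.gbnf")

# Gemma-style turn markers, matching the prompt format shown above.
prompt = "<start_of_turn>user\nWrite a fibonacci function<end_of_turn>\n<start_of_turn>model\n"

result = llm(prompt, max_tokens=100, grammar=grammar)
print(result["choices"][0]["text"])  # thought line followed by the tool-call JSON
```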