devopsforflops committed
Commit 028f1a8 · verified · 1 Parent(s): 52dd858

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +104 -0
README.md ADDED
@@ -0,0 +1,104 @@
+ ---
+ license: apache-2.0
+ base_model: google/functiongemma-270m-it
+ tags:
+ - function-calling
+ - tool-use
+ - dispatcher
+ - delia
+ - gemma
+ language:
+ - en
+ pipeline_tag: text-generation
+ ---
+
+ # FunctionGemma 270M - Delia Dispatcher
+
+ A fine-tuned version of [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it) for **Delia LLM orchestration**.
+
+ This tiny model (270M params) acts as a fast dispatcher, routing user requests to the appropriate backend:
+ - `call_coder` - Code generation tasks
+ - `call_reviewer` - Code review and analysis
+ - `call_planner` - Architecture and planning (also handles ambiguous requests)
+ - `call_executor` - Running commands and scripts
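
The routing described above can be pictured as a lookup from tool name to backend handler. The following is a minimal sketch, not Delia's actual implementation: the `run_*` handlers and the `dispatch` function are hypothetical names, and falling back to the planner for unknown tool names is an assumption modeled on the planner's role for ambiguous requests.

```python
from typing import Callable, Dict


# Hypothetical backend handlers; a real system would call a larger model or a tool here.
def run_coder(reasoning: str) -> str:
    return f"[coder] {reasoning}"


def run_reviewer(reasoning: str) -> str:
    return f"[reviewer] {reasoning}"


def run_planner(reasoning: str) -> str:
    return f"[planner] {reasoning}"


def run_executor(reasoning: str) -> str:
    return f"[executor] {reasoning}"


# Tool names match the four dispatcher tools listed in this README.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "call_coder": run_coder,
    "call_reviewer": run_reviewer,
    "call_planner": run_planner,
    "call_executor": run_executor,
}


def dispatch(tool_name: str, reasoning: str) -> str:
    """Route to the named backend; unknown names fall back to the planner (an assumption)."""
    handler = BACKENDS.get(tool_name, run_planner)
    return handler(reasoning)
```

Keeping the mapping in a plain dictionary means adding a backend is a one-line change, and the dispatcher model's output only ever needs to carry a tool name plus its `reasoning` string.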
+
+ ## Key Features
+
+ - **Minimalist schema**: Each tool takes a single `reasoning` parameter
+ - **Thought tokens**: A brief chain-of-thought scratchpad precedes each tool call
+ - **EOS hardening**: Explicit stop tokens prevent runaway generation
+ - **Negative samples**: 13% of training examples are ambiguous requests routed to the planner, so unclear input is handled gracefully
+ - **GBNF grammar**: Constrained decoding restricts output to the grammar, so completed outputs are always well-formed
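
As a client-side complement to the stop-token training, a caller may still want to cut a generated continuation at the first `<end_of_turn>` marker. The helper below is a hedged sketch; `truncate_at_stop` is a hypothetical name and not part of this repository.

```python
def truncate_at_stop(text: str, stop: str = "<end_of_turn>") -> str:
    """Return text up to (not including) the first stop marker, or unchanged if absent."""
    index = text.find(stop)
    return text if index == -1 else text[:index]
```

Apply it to the generated continuation only. A decoded string that still includes the prompt already contains `<end_of_turn>` after the user turn, so truncating the whole string would discard the model's answer.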
+
+ ## Usage
+
+ ### With llama.cpp (recommended for speed)
+
+ ```bash
+ # Download the GGUF
+ wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/functiongemma-270m-delia-dispatcher-f16.gguf
+
+ # Download the grammar
+ wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/dispatcher.gbnf
+
+ # Run with grammar constraint
+ ./llama-cli -m functiongemma-270m-delia-dispatcher-f16.gguf \
+   --grammar-file dispatcher.gbnf \
+   -p "<start_of_turn>user
+ Write a fibonacci function<end_of_turn>
+ <start_of_turn>model"
+ ```
+
+ ### With Transformers
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the fine-tuned dispatcher and its tokenizer from the Hub
+ model_id = "devopsforflops/functiongemma-270m-delia-dispatcher"
+ model = AutoModelForCausalLM.from_pretrained(model_id)
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ # One user turn followed by an open model turn
+ prompt = """<start_of_turn>user
+ Review this code for bugs<end_of_turn>
+ <start_of_turn>model"""
+
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=100)
+ # Prints the prompt followed by the generated continuation
+ print(tokenizer.decode(outputs[0]))
+ ```
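
Both usage examples build the same turn-delimited prompt by hand. A small helper can keep that format in one place; this is a sketch, and `build_prompt` is a hypothetical name rather than something shipped with the model.

```python
def build_prompt(request: str) -> str:
    """Wrap a user request in the turn markers used by the usage examples."""
    return (
        "<start_of_turn>user\n"
        f"{request}<end_of_turn>\n"
        "<start_of_turn>model"
    )
```

For example, `build_prompt("Review this code for bugs")` yields the same string as the `prompt` literal in the Transformers example.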
+
+ ## Output Format
+
+ ```
+ <start_of_turn>user
+ {request}<end_of_turn>
+ <start_of_turn>model
+ thought
+ {brief reasoning}
+ <tool_call>{"name": "call_X", "arguments": {"reasoning": "..."}}</tool_call><end_of_turn>
+ ```
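
Given the format above, a caller needs to pull the tool name and `reasoning` string out of the generated text. The parser below is a minimal sketch using only the standard library; `parse_tool_call` and `KNOWN_TOOLS` are hypothetical names, and returning `None` on any malformed input is a design assumption rather than documented Delia behavior.

```python
import json
import re
from typing import Optional, Tuple

# Matches the JSON payload between the tool-call tags shown in the output format.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
KNOWN_TOOLS = {"call_coder", "call_reviewer", "call_planner", "call_executor"}


def parse_tool_call(generated: str) -> Optional[Tuple[str, str]]:
    """Return (tool_name, reasoning) from the first tool call, or None if absent or malformed."""
    match = TOOL_CALL_RE.search(generated)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    name = payload.get("name")
    arguments = payload.get("arguments")
    if not isinstance(name, str) or name not in KNOWN_TOOLS:
        return None
    if not isinstance(arguments, dict):
        return None
    reasoning = arguments.get("reasoning")
    if not isinstance(reasoning, str):
        return None
    return name, reasoning
```

The thought text before `<tool_call>` is ignored here; only the structured call is returned. A reasoning string that itself contains `</tool_call>` would be cut short by the regex, which this sketch does not handle.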
+
+ ## Training
+
+ Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) using LoRA:
+ - **Epochs**: 3
+ - **LoRA rank**: 32
+ - **Training examples**: 92 (balanced across the 4 tools, plus 13% ambiguous)
+ - **Final loss**: 0.46
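
One way to read the dataset figures is that 13% of the 92 examples are ambiguous and the rest are split evenly across the four tools. The arithmetic below checks that reading; it is an interpretation of the stated numbers, and the actual split in `train.jsonl` may differ.

```python
TOTAL_EXAMPLES = 92
AMBIGUOUS_SHARE = 0.13  # stated share of ambiguous examples
TOOLS = ["call_coder", "call_reviewer", "call_planner", "call_executor"]

# 92 * 0.13 = 11.96, which rounds to 12 ambiguous examples
ambiguous = round(TOTAL_EXAMPLES * AMBIGUOUS_SHARE)

# The remaining 80 examples divide evenly: 20 per tool, remainder 0
per_tool, remainder = divmod(TOTAL_EXAMPLES - ambiguous, len(TOOLS))

print(ambiguous, per_tool, remainder)  # prints "12 20 0"
```

Under this reading the planner sees the most examples, since the ambiguous requests are routed to it on top of its own share.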
+
+ ## Files
+
+ | File | Description |
+ |------|-------------|
+ | `functiongemma-270m-delia-dispatcher-f16.gguf` | GGUF model (F16, 518 MB) |
+ | `model.safetensors` | Transformers model |
+ | `dispatcher.gbnf` | GBNF grammar for constrained decoding |
+ | `dispatcher_tools.json` | Tool schema (4 tools) |
+ | `train.jsonl` | Training data |
+
+ ## License
+
+ Apache 2.0 (same as base model)
+
+ ## Part of Delia
+
+ This model is designed for use with [Delia](https://github.com/zbrdc/delia), an LLM orchestration system that routes requests to optimal backends.