# ToolTune

Tiny specialist models for tool-call validation, repair, and routing, learned from real agent conversations.
## What is ToolTune?

Local models are great at reasoning but terrible at tool calling. They hallucinate parameter names, pick the wrong tools, and break JSON schemas. ToolTune fixes this with a two-layer approach:
- Layer 1 (Deterministic): schema validation, type coercion, fuzzy name matching, auto-repair. <1ms, no GPU needed.
- Layer 2 (Semantic): ML-based validation that checks whether argument values are plausible and make sense for the query, with a confidence score. ~50ms on GPU.
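The Layer 1 pass can be sketched in a few lines of stdlib Python. The function name and the repair policy below are illustrative assumptions, not ToolTune's actual implementation:

```python
import difflib

def validate_and_repair(call, tools):
    """Deterministically validate a tool call; auto-repair what can be repaired."""
    schemas = {t["name"]: t["parameters"] for t in tools}
    name = call["name"]
    # Fuzzy name matching: repair near-miss tool names ("exce" -> "exec").
    if name not in schemas:
        close = difflib.get_close_matches(name, schemas, n=1, cutoff=0.6)
        if not close:
            return None  # no plausible tool: reject
        name = close[0]
    props = schemas[name].get("properties", {})
    required = schemas[name].get("required", [])
    # Type coercion for the common JSON Schema scalar types, e.g. "5" -> 5.
    coerce = {"string": str, "integer": int, "number": float}
    args = {}
    for key, value in call.get("arguments", {}).items():
        if key in props:  # drop hallucinated parameter names
            args[key] = coerce.get(props[key].get("type"), lambda v: v)(value)
    if any(r not in args for r in required):
        return None  # missing required parameter: cannot repair deterministically
    return {"name": name, "arguments": args}
```

Returning `None` rather than guessing keeps the deterministic layer safe; presumably anything it cannot repair falls through to the semantic layer.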
## Training
- Data: 86,955 clean records from 6 sources (289,850 after function masking)
- Method: SFT + GRPO + DPO (3-stage pipeline, inspired by Hammer)
- Training: BF16 + LoRA (rank=32, alpha=64)
- Base Models: Qwen 2.5 Coder 1.5B / 3B / 7B Instruct
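Function masking, part of the Hammer recipe, replaces real tool names with random placeholders in both the schema and the target call, so the model learns to match intent against tool descriptions instead of memorizing names; masking is what expands the 86,955 clean records into 289,850 training examples. A minimal sketch of the idea (the helper name and exact masking scheme are assumptions):

```python
import random
import string

def mask_functions(tools, call, seed=None):
    """Replace tool names with random placeholders, consistently across
    the tool schemas and the ground-truth call."""
    rng = random.Random(seed)
    def token():
        return "fn_" + "".join(rng.choices(string.ascii_lowercase, k=8))
    mapping = {t["name"]: token() for t in tools}
    masked_tools = [{**t, "name": mapping[t["name"]]} for t in tools]
    masked_call = {**call, "name": mapping[call["name"]]}
    return masked_tools, masked_call
```

Re-masking the same record with different seeds yields multiple distinct training examples from one conversation.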
## Usage
```python
from tooltune import ToolCallValidator, validate_tool_call

tools = [
    {"name": "exec", "description": "Run command", "parameters": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    }},
]

result = validate_tool_call(
    tool_call={"name": "exec", "arguments": {"command": "ls"}},
    available_tools=tools,
)
print(result.valid)       # True
print(result.confidence)  # 1.0
```
## Links
- GitHub: starbuck100/tooltune
- Dataset: cryptobuck/tooltune-dataset
## License
Apache 2.0
## Model tree for cryptobuck/tooltune

- Base model: Qwen/Qwen2.5-1.5B
- Finetuned: Qwen/Qwen2.5-Coder-1.5B
- Finetuned: Qwen/Qwen2.5-Coder-1.5B-Instruct