Kirha Planner
A fine-tuned Qwen3-8B model that generates complete tool execution plans from natural language queries in a single pass.
Why Tool Planning?
Traditional agentic systems let the LLM call tools iteratively until it finds the right data. This leads to unpredictable costs, high latency, and bloated context windows.
Tool Planning takes a different approach: analyze the request, build a complete execution graph, and only then execute. One LLM call produces the entire DAG, with no back-and-forth.
This unlocks:
- Cost predictability: the plan is known before execution, so costs can be estimated upfront
- Parallel execution: independent steps run concurrently, while dependent steps chain automatically
- Audit trail: every plan includes reasoning traces and a complete execution log
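Because the whole plan exists before any tool runs, an upfront estimate can be as simple as a sum over its steps. The following is a minimal sketch, not part of the SDK: the priceByTool table and its values are hypothetical, and estimatePlanCost is an illustrative helper name.

```typescript
// One step of a plan, using the fields shown in this card's output format.
interface PlanStep {
  thought: string;
  toolName: string;
  arguments: Record<string, unknown>;
}

// Hypothetical per-call prices in USD; real prices depend on your tool providers.
const priceByTool: Record<string, number> = {
  getChainId: 0.001,
  searchCoin: 0.002,
  getTokenHolders: 0.01,
};

// Sum the price of every step and fail loudly on a tool with no configured price.
function estimatePlanCost(plan: PlanStep[], prices: Record<string, number>): number {
  return plan.reduce((total, step) => {
    const price = prices[step.toolName];
    if (price === undefined) {
      throw new Error(`No price configured for tool "${step.toolName}"`);
    }
    return total + price;
  }, 0);
}
```

Throwing on an unknown tool keeps the estimate from silently undercounting a plan that includes tools missing from the price table.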
Composition with Output Schemas
Tool Planning relies on outputSchema, a capability now part of the official MCP specification. With output schemas, the planner knows what each tool returns before execution and can chain tools by matching output fields to input parameters. No exploratory calls needed.
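To illustrate why output schemas make chaining possible, the sketch below walks a dotted field path through a JSON Schema to confirm that the referenced field exists before anything is executed. It is an illustration under assumptions: the searchCoinOutput schema is hypothetical, schemaAtPath is not an SDK function, and only a small subset of JSON Schema is modeled.

```typescript
// Minimal subset of JSON Schema, enough for this illustration.
interface JsonSchema {
  type: "object" | "array" | "string" | "number" | "integer" | "boolean";
  properties?: Record<string, JsonSchema>;
  items?: JsonSchema;
}

// Hypothetical output schema for the searchCoin tool.
const searchCoinOutput: JsonSchema = {
  type: "object",
  properties: {
    coins: {
      type: "array",
      items: { type: "object", properties: { id: { type: "string" } } },
    },
  },
};

// Walk a dotted path such as "coins.0.id" through a schema.
// Returns the schema of the referenced field, or undefined if the path is invalid.
function schemaAtPath(schema: JsonSchema, path: string): JsonSchema | undefined {
  let current: JsonSchema | undefined = schema;
  for (const segment of path.split(".")) {
    if (current === undefined) return undefined;
    if (current.type === "array" && /^\d+$/.test(segment)) {
      // A numeric segment indexes into an array, so continue with the item schema.
      current = current.items;
    } else if (current.type === "object") {
      current = current.properties?.[segment];
    } else {
      return undefined;
    }
  }
  return current;
}
```

With this schema, schemaAtPath(searchCoinOutput, "coins.0.id") returns a string schema, so the field can feed any input parameter that accepts a string, while a path to an undeclared field returns undefined.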
Output format
Given a set of tools and a user query, the model outputs:
<think>
I need to find the largest USDC holder on Base, then get their PnL.
First I need the chain ID for Base and the USDC contract address...
</think>
<plan>
[
  {
    "thought": "Get the chain ID for Base",
    "toolName": "getChainId",
    "arguments": { "blockchain": "Base" }
  },
  {
    "thought": "Search for the USDC coin",
    "toolName": "searchCoin",
    "arguments": { "query": "USDC", "limit": 1 }
  },
  {
    "thought": "Get the USDC contract address on Base",
    "toolName": "getCoinPlatformInfo",
    "arguments": {
      "coinId": "{1.coins.0.id}",
      "platform": "base"
    }
  },
  {
    "thought": "Get the top holder of USDC on Base",
    "toolName": "getTokenHolders",
    "arguments": {
      "chainId": "{0.chainId}",
      "tokenAddress": "{2.contractAddress}",
      "limit": 1
    }
  },
  {
    "thought": "Get the PnL for this wallet",
    "toolName": "getWalletPnL",
    "arguments": {
      "address": "{3.holders.0.address}"
    }
  }
]
</plan>
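The SDK handles parsing for you. For readers who want to see what that involves, here is a minimal sketch that extracts the two tagged blocks and validates the step shape. The names extractTag, isPlanStep, and parseModelOutput are illustrative and are not the SDK's API.

```typescript
interface PlanStep {
  thought: string;
  toolName: string;
  arguments: Record<string, unknown>;
}

interface ParsedOutput {
  reasoning: string;
  plan: PlanStep[];
}

// Return the trimmed text between <tag> and </tag>, or undefined if the tag is absent.
function extractTag(raw: string, tag: string): string | undefined {
  const match = raw.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
  return match?.[1]?.trim();
}

// Check that a parsed JSON value has the three fields every plan step carries.
function isPlanStep(value: unknown): value is PlanStep {
  if (typeof value !== "object" || value === null) return false;
  const step = value as Record<string, unknown>;
  return (
    typeof step.thought === "string" &&
    typeof step.toolName === "string" &&
    typeof step.arguments === "object" &&
    step.arguments !== null
  );
}

// Split raw model output into its reasoning trace and its validated plan.
function parseModelOutput(raw: string): ParsedOutput {
  const planText = extractTag(raw, "plan");
  if (planText === undefined) throw new Error("No <plan> block found in model output");
  const parsed: unknown = JSON.parse(planText);
  if (!Array.isArray(parsed)) throw new Error("The <plan> block must contain a JSON array");
  parsed.forEach((step: unknown, index: number) => {
    if (!isPlanStep(step)) throw new Error(`Step ${index} is not a valid plan step`);
  });
  // A missing <think> block is tolerated; the reasoning is then empty.
  return { reasoning: extractTag(raw, "think") ?? "", plan: parsed as PlanStep[] };
}
```

Validating before execution means a malformed plan is rejected before any tool call is made.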
Dependency references
Steps reference previous outputs using template strings:
- {0.chainId}: field chainId from step 0
- {1.coins.0.id}: nested path from step 1
- {3.holders.0.address}: array access from step 3
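Resolving these references amounts to a path lookup into the outputs of earlier steps. The sketch below shows one way to do it and is not the SDK's implementation. It assumes that a reference always occupies the whole string value and that the resolved value keeps its original type. The names getPath and resolveReferences are illustrative.

```typescript
// Matches a string that consists entirely of one reference, e.g. "{1.coins.0.id}".
const REFERENCE = /^\{(\d+)\.([^{}]+)\}$/;

// Follow a dotted path ("coins.0.id") through nested objects and arrays.
function getPath(value: unknown, path: string): unknown {
  let current: unknown = value;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") {
      throw new Error(`Cannot read "${key}" while resolving path "${path}"`);
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Recursively replace reference strings inside an arguments value.
// `outputs` holds the results of already-executed steps, indexed by step number.
function resolveReferences(value: unknown, outputs: unknown[]): unknown {
  if (typeof value === "string") {
    const match = REFERENCE.exec(value);
    if (match === null) return value; // ordinary string, leave untouched
    const stepIndex = Number(match[1]);
    if (stepIndex >= outputs.length) {
      throw new Error(`Step ${stepIndex} has no output yet`);
    }
    return getPath(outputs[stepIndex], match[2]);
  }
  if (Array.isArray(value)) {
    return value.map((item) => resolveReferences(item, outputs));
  }
  if (value !== null && typeof value === "object") {
    const resolved: Record<string, unknown> = {};
    for (const key of Object.keys(value)) {
      resolved[key] = resolveReferences((value as Record<string, unknown>)[key], outputs);
    }
    return resolved;
  }
  return value; // numbers, booleans, null
}
```

Returning the looked-up value directly, rather than interpolating it into a string, lets a numeric or object output reach the next tool with its type intact.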
Execution graph
Step 0: getChainId        Step 1: searchCoin
        │                         │
        │                         ▼
        │            Step 2: getCoinPlatformInfo
        │                         │
        └────────────┬────────────┘
                     ▼
          Step 3: getTokenHolders
                     │
                     ▼
           Step 4: getWalletPnL
Steps 0 and 1 run in parallel. Step 3 waits for both 0 and 2. The executor resolves references at runtime.
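The graph can be derived from the plan itself by scanning each step's arguments for references. The sketch below groups steps into "waves", where each wave depends only on earlier waves. It illustrates the idea and is not the SDK's scheduler. It assumes references only point to earlier steps, and findDependencies and planWaves are illustrative names.

```typescript
interface PlanStep {
  thought: string;
  toolName: string;
  arguments: Record<string, unknown>;
}

// Collect the step indices referenced anywhere in a step's arguments.
// Scanning the serialized arguments is a shortcut that is good enough for a sketch.
function findDependencies(step: PlanStep): number[] {
  const found: number[] = [];
  const pattern = /\{(\d+)\./g;
  const text = JSON.stringify(step.arguments);
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const index = Number(match[1]);
    if (found.indexOf(index) === -1) found.push(index);
  }
  return found;
}

// Group step indices into waves: a step's wave is one more than the latest
// wave among its dependencies, and steps without dependencies form wave 0.
function planWaves(plan: PlanStep[]): number[][] {
  const level: number[] = [];
  const waves: number[][] = [];
  plan.forEach((step, index) => {
    const deps = findDependencies(step);
    for (const dep of deps) {
      if (dep >= index) {
        throw new Error(`Step ${index} references step ${dep}, which is not an earlier step`);
      }
    }
    const stepLevel =
      deps.length === 0 ? 0 : Math.max(...deps.map((dep) => level[dep])) + 1;
    level[index] = stepLevel;
    if (waves[stepLevel] === undefined) waves[stepLevel] = [];
    waves[stepLevel].push(index);
  });
  return waves;
}
```

For the five-step plan above, this yields the waves [0, 1], [2], [3], [4]. An executor could run each wave with Promise.all. A scheduler that starts every step as soon as its own dependencies finish can achieve more overlap than waves, at the cost of slightly more bookkeeping.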
Usage
Use the @kirha/planner TypeScript SDK to interact with this model. It handles parsing, validation, dependency resolution, and parallel execution of plans.
npm install @kirha/planner
Tools can come from any source, including MCP servers that have outputSchema defined.
See the SDK documentation for full usage examples.
Training
- Base model: Qwen/Qwen3-8B
- Method: QLoRA (4-bit NF4, LoRA r=64, alpha=128)
- Format: Alpaca (instruction/input/output)
- Dataset: kirha/planner-dataset
The model learns planning methodology rather than memorizing specific APIs, enabling zero-shot generalization to any tool catalog at inference time. Noise tools are injected during training to teach discrimination between relevant and irrelevant options.