Kirha Planner

A fine-tuned Qwen3-8B model that generates complete tool execution plans from natural language queries in a single pass.

Why Tool Planning?

Traditional agentic systems let the LLM call tools iteratively until it finds the right data. This leads to unpredictable costs, high latency, and bloated context windows.

Tool Planning takes a different approach: analyze the request, build a complete execution graph, and only then execute. One LLM call produces the entire DAG (directed acyclic graph), with no back-and-forth between the model and the tools.

This unlocks:

Cost predictability: the plan is known before execution, so costs can be estimated upfront
Parallel execution: independent steps run concurrently, while dependent steps chain automatically
Audit trail: every plan includes reasoning traces and a complete execution log

Composition with Output Schemas

Tool Planning relies on outputSchema, a capability that is now part of the official Model Context Protocol (MCP) specification. With output schemas, the planner knows what each tool returns before execution and can chain tools by matching output fields to input parameters. No exploratory calls are needed.
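
As an illustration of that matching, the sketch below walks a dotted path through a tool's output schema and checks whether the field it reaches has the type an input parameter expects. The searchCoin schema, the Schema type, and the helper names schemaAtPath and canFeed are assumptions made for this example; they are not taken from the MCP specification or the Kirha SDK, and only a small subset of JSON Schema is handled.

```typescript
// Minimal JSON Schema subset, just enough for this illustration.
type Schema = {
  type: "object" | "array" | "string" | "number" | "integer" | "boolean";
  properties?: Record<string, Schema>;
  items?: Schema;
};

// Hypothetical output schema for a searchCoin tool (assumed, not from a real server).
const searchCoinOutput: Schema = {
  type: "object",
  properties: {
    coins: {
      type: "array",
      items: {
        type: "object",
        properties: { id: { type: "string" }, symbol: { type: "string" } },
      },
    },
  },
};

// Walk a dotted path such as "coins.0.id" through the schema and return the
// schema of the field it points at, or undefined if the path does not exist.
function schemaAtPath(schema: Schema, path: string): Schema | undefined {
  let current: Schema | undefined = schema;
  for (const segment of path.split(".")) {
    if (current === undefined) return undefined;
    if (current.type === "array" && /^\d+$/.test(segment)) {
      current = current.items; // numeric segment: step into the array items
    } else if (current.type === "object") {
      current = current.properties?.[segment];
    } else {
      return undefined;
    }
  }
  return current;
}

// An output field can feed a parameter when the path exists and the types agree.
function canFeed(output: Schema, path: string, paramType: Schema["type"]): boolean {
  return schemaAtPath(output, path)?.type === paramType;
}

console.log(canFeed(searchCoinOutput, "coins.0.id", "string"));
```

A check like this can run before any tool is called, which is what makes exploratory calls unnecessary.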

Output format

Given a set of tools and a user query, the model outputs a <think> block with its reasoning, followed by a <plan> block containing a JSON array of steps:

<think>
I need to find the largest USDC holder on Base, then get their PnL.
First I need the chain ID for Base and the USDC contract address...
</think>
<plan>
[
  {
    "thought": "Get the chain ID for Base",
    "toolName": "getChainId",
    "arguments": { "blockchain": "Base" }
  },
  {
    "thought": "Search for the USDC coin",
    "toolName": "searchCoin",
    "arguments": { "query": "USDC", "limit": 1 }
  },
  {
    "thought": "Get the USDC contract address on Base",
    "toolName": "getCoinPlatformInfo",
    "arguments": {
      "coinId": "{1.coins.0.id}",
      "platform": "base"
    }
  },
  {
    "thought": "Get the top holder of USDC on Base",
    "toolName": "getTokenHolders",
    "arguments": {
      "chainId": "{0.chainId}",
      "tokenAddress": "{2.contractAddress}",
      "limit": 1
    }
  },
  {
    "thought": "Get the PnL for this wallet",
    "toolName": "getWalletPnL",
    "arguments": {
      "address": "{3.holders.0.address}"
    }
  }
]
</plan>
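
To make the format concrete, here is a minimal sketch of how such output could be split into its reasoning and its list of steps. It is not the @kirha/planner implementation; the PlanStep and ParsedOutput types and the parseModelOutput function are names assumed for this example, and validation is limited to the three fields shown above.

```typescript
// Shape of one plan step, mirroring the fields in the example output.
interface PlanStep {
  thought: string;
  toolName: string;
  arguments: Record<string, unknown>;
}

interface ParsedOutput {
  reasoning: string; // contents of <think>, empty if the block is absent
  steps: PlanStep[];
}

function parseModelOutput(raw: string): ParsedOutput {
  const think = raw.match(/<think>([\s\S]*?)<\/think>/);
  const plan = raw.match(/<plan>([\s\S]*?)<\/plan>/);
  if (plan === null) throw new Error("No <plan> block found");

  // The plan body is expected to be a JSON array of step objects.
  const parsed: unknown = JSON.parse(plan[1] ?? "");
  if (!Array.isArray(parsed)) throw new Error("The plan must be a JSON array");

  const steps = parsed.map((item: unknown, index: number): PlanStep => {
    if (typeof item !== "object" || item === null) {
      throw new Error("Step " + index + " is not an object");
    }
    const { thought, toolName, arguments: args } = item as Record<string, unknown>;
    if (
      typeof thought !== "string" ||
      typeof toolName !== "string" ||
      typeof args !== "object" ||
      args === null
    ) {
      throw new Error("Step " + index + " is missing a required field");
    }
    return { thought, toolName, arguments: args as Record<string, unknown> };
  });

  const reasoning = think === null ? "" : (think[1] ?? "").trim();
  return { reasoning, steps };
}
```

Rejecting a malformed plan at this stage means no tool is called on the basis of a partially parsed step.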

Dependency references

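Steps reference the outputs of earlier steps using template strings of the form {stepIndex.path}, where stepIndex is the zero-based position of the step in the plan and path is a dot-separated path into its output: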

{0.chainId} — field chainId from step 0
{1.coins.0.id} — nested path from step 1
{3.holders.0.address} — array access from step 3
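
The sketch below shows one way such references could be resolved once the earlier steps have produced their outputs. It is an illustration, not the SDK's resolver: the lookup and resolveArguments functions are hypothetical names, and two behaviors are assumptions of this example rather than documented ones. A string that consists of exactly one reference is replaced by the referenced value with its original type, and a reference embedded in a longer string is interpolated as text.

```typescript
// Follow a reference such as "1.coins.0.id" into the list of step outputs.
function lookup(outputs: unknown[], ref: string): unknown {
  const [stepIndex, ...path] = ref.split(".");
  let value: unknown = outputs[Number(stepIndex)];
  for (const key of path) {
    if (value === null || typeof value !== "object") {
      throw new Error('Cannot resolve "' + ref + '" at "' + key + '"');
    }
    // Works for both object fields and array indices ("0", "1", ...).
    value = (value as Record<string, unknown>)[key];
  }
  if (value === undefined) throw new Error('Reference "' + ref + '" is undefined');
  return value;
}

const WHOLE = /^\{(\d+(?:\.[^{}.]+)*)\}$/; // the string is exactly one reference
const EMBEDDED = /\{(\d+(?:\.[^{}.]+)*)\}/g; // references inside a longer string

// Recursively replace references in an arguments object.
function resolveArguments(value: unknown, outputs: unknown[]): unknown {
  if (typeof value === "string") {
    const whole = value.match(WHOLE);
    if (whole !== null) return lookup(outputs, whole[1] ?? ""); // keeps the type
    return value.replace(EMBEDDED, (_match: string, ref: string) =>
      String(lookup(outputs, ref))
    );
  }
  if (Array.isArray(value)) {
    return value.map((item: unknown) => resolveArguments(item, outputs));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const result: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      result[key] = resolveArguments(source[key], outputs);
    }
    return result;
  }
  return value; // numbers, booleans, and null pass through unchanged
}

// Sample outputs for steps 0 and 1 (illustrative data only).
const outputs: unknown[] = [{ chainId: 8453 }, { coins: [{ id: "usd-coin" }] }];
console.log(resolveArguments({ coinId: "{1.coins.0.id}", platform: "base" }, outputs));
```

Throwing on an unresolved reference, rather than passing the raw template string to a tool, keeps a failed upstream step from silently corrupting downstream calls.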

Execution graph

Step 0: getChainId        Step 1: searchCoin
    │                         │
    │                         ↓
    │                     Step 2: getCoinPlatformInfo
    │                         │
    └────────────┬────────────┘
                 ↓
         Step 3: getTokenHolders
                 │
                 ↓
         Step 4: getWalletPnL

Steps 0 and 1 run in parallel. Step 2 waits for step 1, step 3 waits for both 0 and 2, and step 4 waits for step 3. The executor resolves references at runtime.
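
The graph above can be derived from the plan itself, because every dependency appears as a template reference in a step's arguments. The following sketch extracts those references and groups steps into levels that can run concurrently. The dependenciesOf and executionLevels names are hypothetical, and level-by-level scheduling is a simplification chosen for clarity; a real executor, including the SDK's, may start each step as soon as its own dependencies finish.

```typescript
interface PlanStep {
  toolName: string;
  arguments: Record<string, unknown>;
}

// Collect the step indices referenced by templates such as "{2.contractAddress}".
function dependenciesOf(step: PlanStep): number[] {
  const text = JSON.stringify(step.arguments);
  const pattern = /\{(\d+)(?:\.[^{}]*)?\}/g;
  const deps: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const index = Number(match[1]);
    if (deps.indexOf(index) === -1) deps.push(index);
  }
  return deps;
}

// Group steps into levels: a step sits one level below its deepest dependency.
function executionLevels(plan: PlanStep[]): number[][] {
  const levelOf: number[] = [];
  const levels: number[][] = [];
  plan.forEach((step, i) => {
    let depth = 0;
    for (const dep of dependenciesOf(step)) {
      if (dep >= i) {
        throw new Error("Step " + i + " references step " + dep + ", which does not precede it");
      }
      depth = Math.max(depth, (levelOf[dep] ?? 0) + 1);
    }
    levelOf[i] = depth;
    const bucket = levels[depth] ?? [];
    bucket.push(i);
    levels[depth] = bucket;
  });
  return levels;
}

// The example plan from the Output format section, without the thought fields.
const plan: PlanStep[] = [
  { toolName: "getChainId", arguments: { blockchain: "Base" } },
  { toolName: "searchCoin", arguments: { query: "USDC", limit: 1 } },
  { toolName: "getCoinPlatformInfo", arguments: { coinId: "{1.coins.0.id}", platform: "base" } },
  {
    toolName: "getTokenHolders",
    arguments: { chainId: "{0.chainId}", tokenAddress: "{2.contractAddress}", limit: 1 },
  },
  { toolName: "getWalletPnL", arguments: { address: "{3.holders.0.address}" } },
];

console.log(JSON.stringify(executionLevels(plan))); // [[0,1],[2],[3],[4]]
```

Each inner array is a batch whose steps have no dependencies on one another, so a batch could be dispatched with Promise.all before moving on to the next.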

Usage

Use the @kirha/planner TypeScript SDK to interact with this model. It handles parsing, validation, dependency resolution, and parallel execution of plans.

npm install @kirha/planner

Tools can come from any source, including MCP servers that have outputSchema defined.

See the SDK documentation for full usage examples.

Training

Base model: Qwen/Qwen3-8B
Method: QLoRA (4-bit NF4, LoRA r=64, alpha=128)
Format: Alpaca (instruction/input/output)
Dataset: kirha/planner-dataset

The model learns planning methodology rather than memorizing specific APIs, enabling zero-shot generalization to any tool catalog at inference time. Noise tools are injected during training to teach the model to discriminate between relevant and irrelevant tools.