mach-kernel committed on
Commit 55f3189 · verified · 1 Parent(s): 1dff081

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +37 -32
README.md CHANGED
@@ -6,53 +6,58 @@ tags:
  - fine-tuned
  - tool-calling
  - mcp
+ - dbt
  ---
 
- # ecu-pilot
+ # ecu-pilot (FP16)
 
- A fine-tuned Qwen3.5-35B-A3B for tool calling against a certain popular data transformation framework that shall remain unnamed (but whose initials, shifted back by one letter, might ring a bell).
+ Fine-tuned [Qwen3.5-35B-A3B-Base](https://huggingface.co/Qwen/Qwen3.5-35B-A3B-Base) for structured tool calling against project metadata via MCP.
 
- This model was trained to call 9 MCP tools that query project metadata — things like lineage traversal, impact analysis, test coverage, and schema introspection. It knows when to call them, what arguments to pass, and how to synthesize the results into something a data engineer would actually find helpful.
+ Trained to accurately call 9 tools — lineage traversal, impact analysis, test coverage reporting, schema introspection, search, and more — with valid arguments and well-synthesized answers grounded in real tool output.
 
- ## What it does
+ ## Model details
 
- You ask it about your project. It thinks about which tools to use. It calls them. It gives you an answer.
+ | | |
+ |---|---|
+ | **Base model** | Qwen3.5-35B-A3B-Base |
+ | **Architecture** | Mixture of Experts (35B total, 3B active per token) |
+ | **Fine-tuning method** | bf16 LoRA (r=16, alpha=16) |
+ | **Training stages** | Stage 1: tool mechanics (1 epoch, 1,206 examples) / Stage 2: structured planning (2 epochs, 290 examples) |
+ | **Hardware** | NVIDIA H200 141GB, ~1 hour total |
+ | **Training data** | 1,206 ChatML examples with real tool responses from indexed project metadata |
 
- ```
- > "What's the blast radius of changing stg_orders?"
-
- <think>
- Goal: pre-refactor impact analysis
- Tools: node, impact, report
- </think>
+ ## Usage
 
- → calls 3 tools
- → "Affects 3 downstream models. Orders has 5 tests including financial validation."
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "mach-kernel/ecu-pilot-fp16",
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+ tokenizer = AutoTokenizer.from_pretrained("mach-kernel/ecu-pilot-fp16")
  ```
 
- ## Training details
+ ## Quantized variants
 
- - **Base model**: Qwen3.5-35B-A3B-Base (MoE — 35B total, 3B active per token)
- - **Method**: bf16 LoRA (r=16, alpha=16) — not QLoRA, because MoE expert routing deserves respect
- - **Curriculum**: Two-stage SFT adapted from the Thinkquel methodology
-   - Stage 1: tool-calling mechanics (1 epoch, 1,206 examples)
-   - Stage 2: structured planning (2 epochs, 290 examples)
- - **Hardware**: NVIDIA H200 (141 GB). One GPU. One hour.
- - **Training data**: 1,206 examples with real tool responses from a real project index. Nothing hallucinated.
+ | Format | Repository |
+ |--------|-----------|
+ | FP16 (this repo) | [mach-kernel/ecu-pilot-fp16](https://huggingface.co/mach-kernel/ecu-pilot-fp16) |
+ | LoRA adapter only | [mach-kernel/ecu-pilot-fp16-lora](https://huggingface.co/mach-kernel/ecu-pilot-fp16-lora) |
+ | GGUF Q4_K_M | [mach-kernel/ecu-pilot-q4km](https://huggingface.co/mach-kernel/ecu-pilot-q4km) |
+ | GGUF Q8_0 | [mach-kernel/ecu-pilot-q8_0](https://huggingface.co/mach-kernel/ecu-pilot-q8_0) |
 
- ## Usage
+ ## Training methodology
 
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model = AutoModelForCausalLM.from_pretrained(mach-kernel/ecu-pilot-fp16, torch_dtype=bfloat16)
- tokenizer = AutoTokenizer.from_pretrained(mach-kernel/ecu-pilot-fp16)
- ```
+ Two-stage supervised fine-tuning adapted from the [Thinkquel](https://arxiv.org/abs/2510.00186) methodology:
 
- ## LoRA adapter
+ 1. **Stage 1 — Tool mechanics**: Teaches the model what tools exist, how to format calls, and how to interpret responses.
+ 2. **Stage 2 — Structured planning**: Teaches the model to reason about *when* and *why* to call tools using `<think>` blocks before acting.
 
- If you'd rather merge it yourself: [mach-kernel/ecu-pilot-fp16-lora](https://huggingface.co/mach-kernel/ecu-pilot-fp16-lora)
+ All training examples use real tool responses from an indexed project — no synthetic or hallucinated tool output.
 
  ## Why "ecu"
 
- No reason. Just liked how it sounded. Definitely not a Caesar cipher of anything. Don't look into it.
+ No particular reason. Just liked the sound of it.
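
The new README says Stage 2 teaches the model to reason in `<think>` blocks before acting. A minimal sketch of splitting such output into reasoning and answer, assuming the plain `<think>…</think>` serialization shown in the old README's example transcript (the model's exact output format is not specified in this diff):

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_think(text: str) -> tuple[str, str]:
    """Split model output into (reasoning, answer).

    Assumes at most one <think>...</think> block, as in the
    README's example transcript; reasoning is empty if absent.
    """
    m = THINK_RE.search(text)
    if m is None:
        return "", text.strip()
    reasoning = m.group(1).strip()
    answer = (text[:m.start()] + text[m.end():]).strip()
    return reasoning, answer

# Sample shaped like the README's transcript
sample = (
    "<think>\nGoal: pre-refactor impact analysis\n"
    "Tools: node, impact, report\n</think>\n"
    "Affects 3 downstream models."
)
reasoning, answer = split_think(sample)
```

In practice you would run this over the decoded generation before showing the answer to the user, keeping the reasoning for logging or debugging.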
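
The README lists 9 MCP tools but does not show their wire format. For illustration only, here is a JSON-RPC-style `tools/call` request as MCP defines it; the tool name `impact` and the argument shape are assumptions borrowed from the old README's example, not the model's documented schema:

```python
import json

# Hypothetical payload; real tool names/arguments are not documented in this diff.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",          # MCP method for invoking a server tool
    "params": {
        "name": "impact",                      # assumed tool name
        "arguments": {"node": "stg_orders"},   # assumed argument shape
    },
}
payload = json.dumps(request)
```

An MCP client would send `payload` to the server and feed the tool result back to the model as the next turn.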
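
A back-of-envelope sizing for the variants table, assuming ~35B total parameters and typical bits-per-weight for each format (actual file sizes vary with metadata and the exact quant mix, so treat these as rough estimates):

```python
PARAMS = 35e9  # ~35B total parameters (MoE; all experts are stored)

def weights_gb(bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_weight / 8 / 1e9

approx = {
    "FP16": weights_gb(16.0),    # 2 bytes/param -> ~70 GB
    "Q8_0": weights_gb(8.5),     # GGUF Q8_0 is ~8.5 bpw incl. block scales
    "Q4_K_M": weights_gb(4.85),  # GGUF Q4_K_M is commonly ~4.8-4.9 bpw
}
```

This is why the FP16 repo needs a large GPU while the Q4_K_M GGUF fits on much smaller hardware; only ~3B parameters are active per token, but the full expert set must still reside in memory.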