---
license: apache-2.0
base_model: Qwen/Qwen3.5-35B-A3B-Base
tags:
  - lora
  - fine-tuned
  - tool-calling
  - mcp
  - dbt
---

# ecu-pilot (FP16)

Fine-tuned [Qwen3.5-35B-A3B-Base](https://huggingface.co/Qwen/Qwen3.5-35B-A3B-Base) for structured tool calling against project metadata via MCP.

Trained to call 9 tools accurately — lineage traversal, impact analysis, test coverage reporting, schema introspection, search, and more — passing valid arguments and synthesizing answers grounded in real tool output.

## Model details

| | |
|---|---|
| **Base model** | Qwen3.5-35B-A3B-Base |
| **Architecture** | Mixture of Experts (35B total, 3B active per token) |
| **Fine-tuning method** | bf16 LoRA (r=16, alpha=16) |
| **Training stages** | Stage 1: tool mechanics (1 epoch, 1,206 examples) / Stage 2: structured planning (2 epochs, 290 examples) |
| **Hardware** | NVIDIA H200 141GB, ~1 hour total |
| **Training data** | 1,206 ChatML examples with real tool responses from indexed project metadata |
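
For reference, a `peft` configuration matching the settings above might look like the following. This is a minimal sketch, not the actual training config: `r` and `lora_alpha` come from the table, while `target_modules` is a common choice for Qwen-style layers and is an assumption.

```python
from peft import LoraConfig

# Hypothetical reconstruction of the LoRA setup described in the table above.
# r and lora_alpha match the card; target_modules is an assumption, not the
# configuration actually used for this run.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```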

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "mach-kernel/ecu-pilot-fp16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mach-kernel/ecu-pilot-fp16")
```
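
A minimal generation sketch follows, assuming the tokenizer's chat template accepts a `tools` argument. The tool schema and the `stg_orders` node name are hypothetical placeholders for illustration; the real tool definitions are served via MCP.

```python
messages = [
    {"role": "user", "content": "Which downstream models are impacted if stg_orders changes?"}
]

# Hypothetical tool schema for illustration only; real definitions come from
# the MCP server, not hard-coded JSON like this.
tools = [
    {
        "type": "function",
        "function": {
            "name": "impact_analysis",
            "description": "List downstream models affected by a change to a node.",
            "parameters": {
                "type": "object",
                "properties": {"node": {"type": "string"}},
                "required": ["node"],
            },
        },
    }
]

# Build the prompt with the chat template, generate, and decode only the new tokens.
inputs = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```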

## Quantized variants

| Format | Repository |
|--------|-----------|
| FP16 (this repo) | [mach-kernel/ecu-pilot-fp16](https://huggingface.co/mach-kernel/ecu-pilot-fp16) |
| LoRA adapter only | [mach-kernel/ecu-pilot-fp16-lora](https://huggingface.co/mach-kernel/ecu-pilot-fp16-lora) |
| GGUF Q4_K_M | [mach-kernel/ecu-pilot-q4km](https://huggingface.co/mach-kernel/ecu-pilot-q4km) |
| GGUF Q8_0 | [mach-kernel/ecu-pilot-q8_0](https://huggingface.co/mach-kernel/ecu-pilot-q8_0) |
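
To run a GGUF build locally, a `llama-cpp-python` sketch might look like the following. The filename glob is an assumption about how the files are named; check the repository's file listing for the exact name.

```python
from llama_cpp import Llama

# Downloads a matching GGUF file from the Hub; the glob pattern is an
# assumption and may need adjusting to the repo's actual file names.
llm = Llama.from_pretrained(
    repo_id="mach-kernel/ecu-pilot-q4km",
    filename="*Q4_K_M.gguf",
    n_ctx=8192,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "List the tools you can call."}]
)
print(out["choices"][0]["message"]["content"])
```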

## Training methodology

Two-stage supervised fine-tuning adapted from the [Thinkquel](https://arxiv.org/abs/2510.00186) methodology:

1. **Stage 1 — Tool mechanics**: Teaches the model what tools exist, how to format calls, and how to interpret responses.
2. **Stage 2 — Structured planning**: Teaches the model to reason about *when* and *why* to call tools using `<think>` blocks before acting.

All training examples use real tool responses from an indexed project — no synthetic or hallucinated tool output.
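
For a sense of the shape of a Stage 2 example: the assistant reasons in a `<think>` block, emits a tool call, receives the real tool output, and then synthesizes the answer. The message structure, tool name, and contents below are illustrative assumptions, not an actual training example.

```python
# Illustrative shape of a Stage 2 training turn (ChatML roles shown as
# message dicts). Tool name, arguments, and outputs are hypothetical.
example = [
    {"role": "user", "content": "Does stg_orders have test coverage?"},
    {
        "role": "assistant",
        "content": "<think>The user wants test coverage for one model; "
                   "call the test coverage tool for stg_orders.</think>",
        "tool_calls": [
            {"function": {"name": "test_coverage", "arguments": '{"node": "stg_orders"}'}}
        ],
    },
    {"role": "tool", "content": '{"node": "stg_orders", "tests": ["not_null_order_id", "unique_order_id"]}'},
    {"role": "assistant", "content": "stg_orders has two tests: not_null_order_id and unique_order_id."},
]
```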

## Why "ecu"

No particular reason. Just liked the sound of it.

## Why ecu

No reason. Just liked how it sounded. Definitely not a Caesar cipher of anything. Don't look into it.