---
license: gemma
base_model: google/gemma-4-e4b-it
tags:
  - gguf
  - llama-cpp
  - ollama
  - 3d-printing
  - chief-engineer
  - microfactory
language:
  - en
---

# Microfactory Node — Chief Engineer (GGUF)

Quantized GGUFs of three LoRA-fine-tuned variants of
[`google/gemma-4-e4b-it`](https://huggingface.co/google/gemma-4-e4b-it), trained
on real 3D-printer outcomes to predict where a print will fail and propose
settings before the nozzle moves. The v3 QAT variant ships in two quants, so
the repo holds four GGUF files.

Both distribution paths point at the same blobs:
- **`ollama.com/kylebrodeur`** — public Ollama registry, one-command pulls
- **`huggingface.co/kylebrodeur/microfactory-node-gguf`** *(this repo)* — canonical GGUFs + `template`/`system`/`params` config

| File | Quant | Size | `ollama run …` (registry tag) | Source adapter |
|------|-------|------|-------------------------------|----------------|
| **`microfactory-node-v3-qat.gguf`** | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node-v3-qat`](https://ollama.com/kylebrodeur/microfactory-node-v3-qat) *(recommended)* | [`microfactory-node-lora-v3-qat`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v3-qat) |
| `microfactory-node-v3-qat-q4_0.gguf` | q4_0 | 4.9 GB | [`kylebrodeur/microfactory-node-v3-qat:q4_0`](https://ollama.com/kylebrodeur/microfactory-node-v3-qat:q4_0) | [`microfactory-node-lora-v3-qat`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v3-qat) |
| `microfactory-node-v2.gguf` | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node-v2`](https://ollama.com/kylebrodeur/microfactory-node-v2) | [`microfactory-node-lora-v2`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v2) |
| `microfactory-node.gguf` | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node`](https://ollama.com/kylebrodeur/microfactory-node) | [`microfactory-node-lora`](https://huggingface.co/kylebrodeur/microfactory-node-lora) |

> The v3 QAT model was trained with simulated 4-bit quantization
> (quantization-aware training), so it retains more quality after quantization
> than the standard v2. Use `q4_k_m` for balanced quality/size, or `q4_0` (the
> quant Google's QAT was trained for) for the closest match to the QAT-trained
> weights.

## Run with Ollama (public registry — easiest)

```bash
# recommended
ollama run kylebrodeur/microfactory-node-v3-qat

# QAT-native quant
ollama run kylebrodeur/microfactory-node-v3-qat:q4_0

# other variants
ollama run kylebrodeur/microfactory-node-v2
ollama run kylebrodeur/microfactory-node
```
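
Once a model is pulled, it can also be queried over Ollama's local HTTP API instead of the interactive prompt. The sketch below assumes the Ollama server is listening on its default address (`localhost:11434`). `build_payload` is a hypothetical helper introduced for illustration; it does no JSON escaping, so the prompt must not contain quotes or backslashes.

```bash
# Assumption: the Ollama server is running on the default port 11434 and the
# model has already been pulled (for example via `ollama run` above).
OLLAMA_URL="http://localhost:11434"
MODEL="kylebrodeur/microfactory-node-v3-qat"
PROMPT="PLA overhang at 22C, 45% humidity"

# Hypothetical helper: builds the JSON body for /api/generate.
# "stream": false requests one JSON object instead of a token stream.
# No escaping is done, so keep quotes and backslashes out of the arguments.
build_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Only send the request if the server answers on /api/version.
if curl -sf "$OLLAMA_URL/api/version" >/dev/null 2>&1; then
  curl -s "$OLLAMA_URL/api/generate" -d "$(build_payload "$MODEL" "$PROMPT")"
else
  echo "Ollama server not reachable at $OLLAMA_URL" >&2
fi
```

The reply is a JSON object whose `response` field holds the generated text.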

## Run with Ollama (this HF repo — no download step)

Ollama can pull GGUFs directly from the Hugging Face Hub, so no separate
download is needed. The `template`, `system`, and `params` files in this repo
configure the Gemma 4 chat template, the Chief Engineer system prompt, and
tuned sampling parameters automatically:

```bash
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v3-qat.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v3-qat-q4_0.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v2.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node.gguf
```

See the [HF × Ollama docs](https://huggingface.co/docs/hub/en/ollama) for the
`hf.co/...` URI form and how Ollama discovers the auxiliary config files.
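
To check what Ollama actually applied from the `template`, `system`, and `params` files, the pulled model can be inspected with `ollama show`. This is a minimal sketch, assuming the model has already been pulled under its `hf.co/...` name; `MODEL_REF` is just an illustrative variable name.

```bash
MODEL_REF="hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v3-qat.gguf"

# Print each piece of configuration Ollama stored for the model.
# Skipped when the ollama CLI is not installed.
if command -v ollama >/dev/null 2>&1; then
  ollama show "$MODEL_REF" --template     # chat template
  ollama show "$MODEL_REF" --system       # system prompt
  ollama show "$MODEL_REF" --parameters   # sampling parameters
fi
```

If any of the three comes back empty or differs from the files in this repo, the auxiliary config was probably not picked up during the pull.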

## Run with llama.cpp

```bash
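# download the recommended GGUF into the current directory
hf download kylebrodeur/microfactory-node-gguf microfactory-node-v3-qat.gguf --local-dir .

# run a one-off prompt against the downloaded file
llama-cli -m microfactory-node-v3-qat.gguf -p "PLA overhang at 22C, 45% humidity"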
```

## Use the live demo

The Hugging Face Space [`build-small-hackathon/microfactory-lab`](https://huggingface.co/spaces/build-small-hackathon/microfactory-lab)
runs the full Chief Engineer UI against these adapters, using ZeroGPU with a
Modal-hosted OpenAI-compatible endpoint as fallback. Source repo:
[`kylebrodeur/microfactory-lab`](https://github.com/kylebrodeur/microfactory-lab).

The full conversion + publishing pipeline (LoRA → Modal merge → llama.cpp
quantize → HF Hub → ollama.com) is documented in
[`learn/finetune/OLLAMA_PUBLISHING.md`](https://github.com/kylebrodeur/microfactory-lab/blob/main/chief-engineer/learn/finetune/OLLAMA_PUBLISHING.md).