---
license: gemma
base_model: google/gemma-4-e4b-it
tags:
- gguf
- llama-cpp
- ollama
- 3d-printing
- chief-engineer
- microfactory
language:
- en
---

# Microfactory Node — Chief Engineer (GGUF)

Quantized GGUFs of three LoRA-fine-tuned variants of
[`google/gemma-4-e4b-it`](https://huggingface.co/google/gemma-4-e4b-it),
trained on real 3D-printer outcomes to predict where a print will fail and
propose settings before the nozzle moves.

Both distribution paths point at the same blobs:

- **`ollama.com/kylebrodeur`**: public Ollama registry, one-command pulls
- **`huggingface.co/kylebrodeur/microfactory-node-gguf`** *(this repo)*: canonical GGUFs + `template`/`system`/`params` config

| File | Quant | Size | `ollama run …` (registry tag) | Source adapter |
|------|-------|------|-------------------------------|----------------|
| **`microfactory-node-v3-qat.gguf`** | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node-v3-qat`](https://ollama.com/kylebrodeur/microfactory-node-v3-qat) *(recommended)* | [`microfactory-node-lora-v3-qat`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v3-qat) |
| `microfactory-node-v3-qat-q4_0.gguf` | q4_0 | 4.9 GB | [`kylebrodeur/microfactory-node-v3-qat:q4_0`](https://ollama.com/kylebrodeur/microfactory-node-v3-qat:q4_0) | [`microfactory-node-lora-v3-qat`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v3-qat) |
| `microfactory-node-v2.gguf` | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node-v2`](https://ollama.com/kylebrodeur/microfactory-node-v2) | [`microfactory-node-lora-v2`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v2) |
| `microfactory-node.gguf` | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node`](https://ollama.com/kylebrodeur/microfactory-node) | [`microfactory-node-lora`](https://huggingface.co/kylebrodeur/microfactory-node-lora) |

> The QAT model was trained with simulated 4-bit quantization, so it retains
> more quality after quantization than the standard v2. Use `q4_k_m` for
> balanced quality/size, or `q4_0` (the quant Google's QAT was trained for)
> for the highest-fidelity reconstruction of the QAT model.

## Run with Ollama (public registry, easiest)

```bash
# recommended
ollama run kylebrodeur/microfactory-node-v3-qat

# QAT-native quant
ollama run kylebrodeur/microfactory-node-v3-qat:q4_0

# other variants
ollama run kylebrodeur/microfactory-node-v2
ollama run kylebrodeur/microfactory-node
```

## Run with Ollama (this HF repo, no download step)

Ollama can pull GGUFs directly from HF. The `template`, `system`, and
`params` files in this repo configure the Gemma 4 chat template, the Chief
Engineer system prompt, and tuned sampling automatically:

```bash
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v3-qat.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v3-qat-q4_0.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v2.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node.gguf
```

See the [HF × Ollama docs](https://huggingface.co/docs/hub/en/ollama) for the
`hf.co/...` URI form and how Ollama discovers the auxiliary config files.

## Run with llama.cpp

```bash
hf download kylebrodeur/microfactory-node-gguf microfactory-node-v3-qat.gguf --local-dir .
llama-cli -m microfactory-node-v3-qat.gguf -p "PLA overhang at 22C, 45% humidity"
```

## Use the live demo

The Hugging Face Space
[`build-small-hackathon/microfactory-lab`](https://huggingface.co/spaces/build-small-hackathon/microfactory-lab)
runs the full Chief Engineer UI against these adapters (ZeroGPU + a
Modal-hosted OpenAI-compatible endpoint as fallback). Source repo:
[`kylebrodeur/microfactory-lab`](https://github.com/kylebrodeur/microfactory-lab).
The full conversion and publishing pipeline (LoRA → Modal merge → llama.cpp quantize → HF Hub → ollama.com) is documented in [`learn/finetune/OLLAMA_PUBLISHING.md`](https://github.com/kylebrodeur/microfactory-lab/blob/main/chief-engineer/learn/finetune/OLLAMA_PUBLISHING.md).
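
## Query from a script (sketch)

Once a model has been pulled with one of the `ollama run` commands above, it can also be queried non-interactively through Ollama's local HTTP API. The following is a minimal sketch, not part of this repo. It assumes the Ollama server is running on its default port (11434), and the helper names `json_escape`, `build_payload`, and `ask_model` are hypothetical. The escaping only handles backslashes and double quotes, so it suits single-line prompts.

```bash
#!/usr/bin/env bash
# Sketch: query a pulled model via Ollama's local HTTP API.
# Assumption: the Ollama server listens on its default address below.

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
MODEL="${MODEL:-kylebrodeur/microfactory-node-v3-qat}"

# json_escape (hypothetical helper): escape backslashes and double quotes
# so a single-line string can be embedded in a JSON string literal.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# build_payload (hypothetical helper): JSON body for POST /api/generate.
# "stream": false requests one JSON reply instead of a token stream.
build_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' \
    "$(json_escape "$MODEL")" "$(json_escape "$1")"
}

# ask_model (hypothetical helper): send the prompt, print the raw JSON reply.
ask_model() {
  curl -sS "$OLLAMA_URL/api/generate" \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$1")"
}
```

After sourcing the script, `ask_model "PLA overhang at 22C, 45% humidity"` prints the JSON reply; the generated text is in its `response` field. Set `MODEL` to any other tag from the table to switch variants.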