kylebrodeur's picture
docs: add q4_0 + ollama.com tags + cross-links
b61bd10 verified
|
Raw
History Blame Contribute Delete
4.07 kB
---
license: gemma
base_model: google/gemma-4-e4b-it
tags:
- gguf
- llama-cpp
- ollama
- 3d-printing
- chief-engineer
- microfactory
language:
- en
---
# Microfactory Node β€” Chief Engineer (GGUF)
Quantized GGUFs of three LoRA-fine-tuned variants of
[`google/gemma-4-e4b-it`](https://huggingface.co/google/gemma-4-e4b-it), trained
on real 3D-printer outcomes to predict where a print will fail and propose
settings before the nozzle moves.
Both distribution paths point at the same blobs:
- **`ollama.com/kylebrodeur`** β€” public Ollama registry, one-command pulls
- **`huggingface.co/kylebrodeur/microfactory-node-gguf`** *(this repo)* β€” canonical GGUFs + `template`/`system`/`params` config
| File | Quant | Size | `ollama run …` (registry tag) | Source adapter |
|------|-------|------|-------------------------------|----------------|
| **`microfactory-node-v3-qat.gguf`** | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node-v3-qat`](https://ollama.com/kylebrodeur/microfactory-node-v3-qat) *(recommended)* | [`microfactory-node-lora-v3-qat`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v3-qat) |
| `microfactory-node-v3-qat-q4_0.gguf` | q4_0 | 4.9 GB | [`kylebrodeur/microfactory-node-v3-qat:q4_0`](https://ollama.com/kylebrodeur/microfactory-node-v3-qat:q4_0) | [`microfactory-node-lora-v3-qat`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v3-qat) |
| `microfactory-node-v2.gguf` | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node-v2`](https://ollama.com/kylebrodeur/microfactory-node-v2) | [`microfactory-node-lora-v2`](https://huggingface.co/kylebrodeur/microfactory-node-lora-v2) |
| `microfactory-node.gguf` | q4_k_m | 5.1 GB | [`kylebrodeur/microfactory-node`](https://ollama.com/kylebrodeur/microfactory-node) | [`microfactory-node-lora`](https://huggingface.co/kylebrodeur/microfactory-node-lora) |
> The QAT model was trained with simulated 4-bit quantization, so it retains
> more quality after quantization than the standard v2. Use `q4_k_m` for
> balanced quality/size, or `q4_0` (the quant Google's QAT was trained for)
> for the highest fidelity reconstruction of the QAT model.
## Run with Ollama (public registry β€” easiest)
```bash
# recommended
ollama run kylebrodeur/microfactory-node-v3-qat
# QAT-native quant
ollama run kylebrodeur/microfactory-node-v3-qat:q4_0
# other variants
ollama run kylebrodeur/microfactory-node-v2
ollama run kylebrodeur/microfactory-node
```
## Run with Ollama (this HF repo β€” no download step)
Ollama can pull GGUFs directly from HF β€” the `template`, `system`, and `params`
files in this repo configure the Gemma 4 chat template, the Chief Engineer
system prompt, and tuned sampling automatically:
```bash
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v3-qat.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v3-qat-q4_0.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node-v2.gguf
ollama run hf.co/kylebrodeur/microfactory-node-gguf:microfactory-node.gguf
```
See the [HF Γ— Ollama docs](https://huggingface.co/docs/hub/en/ollama) for the
`hf.co/...` URI form and how Ollama discovers the auxiliary config files.
## Run with llama.cpp
```bash
hf download kylebrodeur/microfactory-node-gguf microfactory-node-v3-qat.gguf --local-dir .
llama-cli -m microfactory-node-v3-qat.gguf -p "PLA overhang at 22C, 45% humidity"
```
## Use the live demo
The Hugging Face Space [`build-small-hackathon/microfactory-lab`](https://huggingface.co/spaces/build-small-hackathon/microfactory-lab)
runs the full Chief Engineer UI against these adapters (ZeroGPU + a Modal-hosted
OpenAI-compatible endpoint as fallback). Source repo:
[`kylebrodeur/microfactory-lab`](https://github.com/kylebrodeur/microfactory-lab).
The full conversion + publishing pipeline (LoRA β†’ Modal merge β†’ llama.cpp
quantize β†’ HF Hub β†’ ollama.com) is documented in
[`learn/finetune/OLLAMA_PUBLISHING.md`](https://github.com/kylebrodeur/microfactory-lab/blob/main/chief-engineer/learn/finetune/OLLAMA_PUBLISHING.md).