How to use from
llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf flywheel-ai/construction:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf flywheel-ai/construction:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf flywheel-ai/construction:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf flywheel-ai/construction:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf flywheel-ai/construction:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf flywheel-ai/construction:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf flywheel-ai/construction:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf flywheel-ai/construction:Q4_K_M
Use Docker
docker model run hf.co/flywheel-ai/construction:Q4_K_M
Quick Links

Flywheel — construction (35b-v1.1)

An open-source vertical AI-employee model from Flywheel by OpSpot, fine-tuned (LoRA) from Qwen/Qwen3.6-35B-A3B (Apache-2.0) for the construction domain.

Practical construction and trades assistant: estimating and takeoffs, materials and methods, code and permit awareness, scheduling and sequencing, and jobsite safety.

  • Base: Qwen/Qwen3.6-35B-A3B · License: Apache-2.0 · Version: 35b-v1.1
  • Formats: safetensors (transformers / vLLM, ~65G) + model-q4_k_m.gguf (llama.cpp / Ollama, ~20G)

Download (one command)

pip install -U huggingface_hub
hf download flywheel-ai/construction                      # full repo (safetensors + GGUF)
hf download flywheel-ai/construction model-q4_k_m.gguf    # just the GGUF

Run

# llama.cpp
llama-server -m model-q4_k_m.gguf -ngl 999
# Ollama (pulls the GGUF straight from HF)
ollama run hf.co/flywheel-ai/construction
# vLLM (serves the safetensors)
vllm serve flywheel-ai/construction

Guardrail

Not a substitute for a licensed professional; defer to a licensed engineer, inspector, or local code for structural, electrical, permit, and safety-critical decisions.

Provenance & honesty

v1.0 is trained on synthetic seed data authored by permissively-licensed local models (Apache/MIT teachers only — never distilled from closed models). On general prompts it is roughly on par with the base; the niche edge sharpens as consented real usage flows through the OpSpot flywheel. Built on Qwen3.6 (Apache-2.0).

Downloads last month
126
Safetensors
Model size
35B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for flywheel-ai/construction

Quantized
(537)
this model