Instructions for using caid-technologies/blueprint-base with libraries and local apps.

Libraries

How to use caid-technologies/blueprint-base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="caid-technologies/blueprint-base")

# Load model directly (the causal-LM class includes the text-generation head)
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("caid-technologies/blueprint-base", dtype="auto")

Local Apps

vLLM

How to use caid-technologies/blueprint-base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "caid-technologies/blueprint-base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "caid-technologies/blueprint-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
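The same call can be made from Python. This is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are made up for illustration, and the response shape (`choices[0].text`) assumes the server follows the OpenAI completions format, as the curl example suggests.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="caid-technologies/blueprint-base",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first choice.

    Assumes an OpenAI-style completions response body.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the generated continuation. For an SGLang server started on port 30000, pass `base_url="http://localhost:30000"`.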

Use Docker

docker model run hf.co/caid-technologies/blueprint-base

SGLang

How to use caid-technologies/blueprint-base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "caid-technologies/blueprint-base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "caid-technologies/blueprint-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "caid-technologies/blueprint-base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "caid-technologies/blueprint-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Unsloth Studio

How to use caid-technologies/blueprint-base with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for caid-technologies/blueprint-base to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for caid-technologies/blueprint-base to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for caid-technologies/blueprint-base to start chatting

Load model with FastModel

# Install first (in a shell): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="caid-technologies/blueprint-base",
    max_seq_length=2048,
)

Docker Model Runner
How to use caid-technologies/blueprint-base with Docker Model Runner:
```
docker model run hf.co/caid-technologies/blueprint-base
```

Blueprint Base — Qwen2.5-3B

Blueprint turns a plain-English hardware idea into an organized project plan.

Tell it what you want to build — "a compact desk clock with an e-ink display and a remote" — and it gives back a structured blueprint: the parts list, how the parts connect, step-by-step build instructions, rough costs, and a quick design check. Everything comes out as clean, organized data that an app can read and build on.

This is the all-in-one model — it runs on its own, no add-ons needed. (There's also a small adapter-only version at blueprint-base-lora.)

Early research preview. Great for drafting and exploring ideas — not a replacement for real engineering, CAD software, or safety review.

By caid-technologies.

What it can do

Give it a hardware idea and it can produce any of:

📋 a parts list (components)
🔌 a wiring/connection map between the parts
🛠️ ordered build steps
💲 rough sourcing and cost info
✅ a basic design check
📦 or the whole project plan at once

You can ask for the complete plan, or just one piece (like only the parts list).
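As a rough sketch of how an app might switch between the full plan and a single piece, the helper below picks a system prompt per request. The full-plan prompt is the one used in the "Try it" example in this card. The parts-only prompt is an assumption written for illustration; the exact prompts used in training are not listed here, so adjust the wording if results look off. `build_messages` and `PIECE_PROMPTS` are hypothetical names.

```python
# System prompt for the full plan, copied from the "Try it" example in this card.
FULL_PLAN_PROMPT = (
    "You design hobbyist electronics projects. Given a request, reply with a single "
    "JSON object describing the full project. Output only the JSON."
)

# HYPOTHETICAL prompt for a single piece: the real training prompt is not documented.
PARTS_ONLY_PROMPT = (
    "You design hobbyist electronics projects. Given a request, reply with a single "
    "JSON object listing only the parts (components). Output only the JSON."
)

PIECE_PROMPTS = {"full": FULL_PLAN_PROMPT, "parts": PARTS_ONLY_PROMPT}


def build_messages(request, piece="full"):
    """Return chat messages asking for the given piece of the blueprint."""
    if piece not in PIECE_PROMPTS:
        raise ValueError(f"unknown piece: {piece!r}")
    return [
        {"role": "system", "content": PIECE_PROMPTS[piece]},
        {"role": "user", "content": request},
    ]
```

The returned list can be passed to the tokenizer's `apply_chat_template` in place of a hand-written `msgs` list.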

What it's good for — and not

✅ Good for: brainstorming hardware projects, drafting parts lists and build steps, and turning a rough idea into an organized starting plan.

🚫 Not for: final engineering decisions, real CAD models, electrical safety, or anything safety-critical. Treat the output as a helpful first draft to review, not a finished design.

Try it

from transformers import AutoModelForCausalLM, AutoTokenizer

REPO = "caid-technologies/blueprint-base"
model = AutoModelForCausalLM.from_pretrained(REPO, device_map="auto", torch_dtype="bfloat16")
tok = AutoTokenizer.from_pretrained(REPO)

msgs = [
    {"role": "system", "content":
        "You design hobbyist electronics projects. Given a request, reply with a single "
        "JSON object describing the full project. Output only the JSON."},
    {"role": "user", "content": "A compact desk clock with an e-ink display and an IR remote."},
]
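# return_dict=True also returns the attention mask, which generate() receives via **inputs.
inputs = tok.apply_chat_template(
    msgs, add_generation_prompt=True, return_tensors="pt", return_dict=True).to(model.device)
# Greedy decoding (do_sample=False) matches the recommended inference settings.
out = model.generate(**inputs, max_new_tokens=6144, do_sample=False, repetition_penalty=1.1,
                     pad_token_id=tok.eos_token_id)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))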

💡 Tip: keep max_new_tokens high (≥ 6000) so long plans aren't cut off, and keep repetition_penalty=1.1 so wiring lists don't get stuck repeating. For Ollama/local apps, convert this model to GGUF with llama.cpp.
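Since the model is asked to reply with a single JSON object, an app will usually want to parse the decoded text before using it. Here is a minimal sketch, assuming the reply may carry stray text around the JSON (for example a code fence). `extract_json_object` is a hypothetical helper, and the keys in the sample string are made up; the real output schema is not documented here.

```python
import json


def extract_json_object(text):
    """Return the first complete JSON object found in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not a complete object here; try the next opening brace.
            start = text.find("{", start + 1)
    return None


# Sample reply with a made-up schema, wrapped in extra text.
sample = 'Plan:\n{"name": "desk clock", "parts": [{"id": "U1"}]}\nDone.'
plan = extract_json_object(sample)
```

A reply that was cut off by a low `max_new_tokens` never closes its outer object, so this helper returns `None` for it, which is an easy way to detect truncated plans.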

What it learned from

It was trained on about 130 hobbyist hardware projects — things like weather stations, small robots, drones, smart-home gadgets, lab tools, and audio gear — expanded into a few thousand practice examples. Everything is small, maker-style electronics-plus-hardware.

Most common project types in the training data:

Project type	Share	Examples
Test & lab instruments	~20%	function generator, Geiger counter
Smart-home / IoT gadgets	~15%	pet feeder, smart mailbox, pill dispenser
Radio, comms & networking	~9%	LoRa base station, APRS tracker, NAS
Wearables & health	~8%	sleep ring, heart-rate strap
Audio & music	~8%	synth module, guitar pedal, speaker
Robotics & motion	~7%	quadruped robot, robotic arm
Environmental sensing	~7%	air-quality monitor, weather station
Clocks & e-ink displays	~6%	word clock, e-ink calendar
Maker / fabrication tools	~5%	vinyl cutter, pen plotter
Drones & aerial	~5%	FPV drone, VTOL aircraft
Everything else	~10%	lighting, games, automotive, power

Good to know (limitations)

It's a small model, so complex, many-part projects are harder for it.
It proposes designs; it doesn't verify them. Always sanity-check before building.
It's strongest on common project types (lab tools, smart-home) and weaker on rarer ones (games, automotive).
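One cheap sanity check an app can run before anyone picks up a soldering iron: confirm that every connection in the wiring map points at a part that actually exists in the parts list. The sketch below assumes a hypothetical schema (`components` entries with an `id`, `connections` entries with `from` and `to`); the model's real field names may differ, so treat these as placeholders.

```python
def find_dangling_connections(plan):
    """Return connections that reference a part id missing from the parts list.

    Assumes a HYPOTHETICAL schema:
      plan["components"]  -> list of {"id": ...}
      plan["connections"] -> list of {"from": ..., "to": ...}
    """
    known_ids = {part["id"] for part in plan.get("components", [])}
    return [
        conn for conn in plan.get("connections", [])
        if conn.get("from") not in known_ids or conn.get("to") not in known_ids
    ]


# Example plan with one good connection and one that points at a missing part.
example_plan = {
    "components": [{"id": "mcu"}, {"id": "display"}],
    "connections": [
        {"from": "mcu", "to": "display"},
        {"from": "mcu", "to": "ir_receiver"},
    ],
}
dangling = find_dangling_connections(example_plan)
```

This only catches references to parts that don't exist. It says nothing about whether the wiring is electrically sound, so a human review is still needed.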

How well it works

We tested it on projects it had never seen during training. Here's how often it produced a valid, well-structured result for each task:

Task	Valid result
🛠️ Build steps	~100%
✅ Design check	~100%
📋 Parts list	~95%
📦 Full project plan	~85–97%
🔌 Wiring map	~67%

It's strongest at build steps, design checks, and parts lists. Full end-to-end plans are close behind, and wiring maps are the hardest (and most sensitive to the repetition_penalty tip above). Figures are from held-out testing and are being finalized for the current version.
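To make "valid result" concrete, here is one way such a rate could be computed. This is a sketch under assumptions, not the evaluation actually used for the table: it counts an output as valid when it parses as a JSON object and contains a set of required top-level keys. The key name `steps` in the example is hypothetical.

```python
import json


def is_valid(output, required_keys=()):
    """True if `output` is a JSON object containing every required key."""
    try:
        obj = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and all(key in obj for key in required_keys)


def valid_rate(outputs, required_keys=()):
    """Fraction of outputs that pass is_valid (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    return sum(is_valid(o, required_keys) for o in outputs) / len(outputs)


# One valid object, one truncated reply, one JSON value that is not an object.
outputs = ['{"steps": []}', '{"steps": [', '[]']
rate = valid_rate(outputs, required_keys=("steps",))
```

A stricter check could also validate each field's type or compare against a full schema.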

Technical details (for ML folks)

Base model: Qwen/Qwen2.5-3B-Instruct; this repo is the fine-tune merged to 16-bit (standalone, no adapter needed).
Method: QLoRA with Unsloth (LoRA r=32, alpha=32, all attention+MLP projections), then merged.
Training: 1 epoch, max_seq_len 6144, effective batch 8, lr 2e-4 (linear, 3% warmup), adamw_8bit, NEFTune α=5, loss masked to assistant turns, early stopping on eval loss.
Hardware: single RTX 4070 (12 GB).
Data: synthetic dataset projected into 6 task "modes" (full plan, parts, wiring, instructions, sourcing, validation); split grouped by project so no project leaks between train and test. ~3,242 rows; modes rebalanced (cap 350/mode) so the model doesn't coast on the easy ones.
Inference: do_sample=False, repetition_penalty≈1.1, max_new_tokens≥6000, pass the attention mask.
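The two data-handling ideas above (splitting by project and capping rows per mode) can be sketched in plain Python. This is an illustration, not the actual preprocessing code: the field names `project_id` and `mode`, the 10% test fraction, and the random sampling used for the cap are all assumptions.

```python
import random
from collections import defaultdict


def split_by_project(rows, test_fraction=0.1, seed=0):
    """Split rows so each project's rows land entirely in train or entirely in test."""
    projects = sorted({row["project_id"] for row in rows})
    random.Random(seed).shuffle(projects)
    n_test = max(1, round(len(projects) * test_fraction))
    test_projects = set(projects[:n_test])
    train = [r for r in rows if r["project_id"] not in test_projects]
    test = [r for r in rows if r["project_id"] in test_projects]
    return train, test


def cap_per_mode(rows, cap=350, seed=0):
    """Keep at most `cap` randomly chosen rows for each task mode."""
    by_mode = defaultdict(list)
    for row in rows:
        by_mode[row["mode"]].append(row)
    rng = random.Random(seed)
    kept = []
    for mode in sorted(by_mode):
        group = by_mode[mode]
        kept.extend(group if len(group) <= cap else rng.sample(group, cap))
    return kept
```

Splitting on the project rather than the row matters because each project is expanded into several rows (one per mode). A row-level split would let the model see a project's parts list in training and then be tested on the same project's wiring map.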

@misc{blueprint_base,
  title  = {Blueprint Base: Qwen2.5-3B for structured hardware project generation},
  author = {Caid Technologies},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/caid-technologies}}
}

Built with Unsloth and 🤗 Transformers / PEFT / TRL.
