How to use from
llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf SwarmandBee/DiabeticDaily-4B:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf SwarmandBee/DiabeticDaily-4B:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf SwarmandBee/DiabeticDaily-4B:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf SwarmandBee/DiabeticDaily-4B:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf SwarmandBee/DiabeticDaily-4B:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf SwarmandBee/DiabeticDaily-4B:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf SwarmandBee/DiabeticDaily-4B:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf SwarmandBee/DiabeticDaily-4B:Q4_K_M
Use Docker
docker model run hf.co/SwarmandBee/DiabeticDaily-4B:Q4_K_M
Quick Links

DiabeticDaily-4B 🐝🛏️

The edge tier of the OpenDiabetic ladder — runs on a $249 Jetson Orin Nano, on-box, zero internet. A proven, domain-tuned diabetic assistant small enough to sit on a nightstand. Cooked by Swarm and Bee LLC.

Beat-base — proven

Held-out perplexity vs base Qwen3.5-4B (text never trained on):

held-out loss perplexity
Base Qwen3.5-4B 1.5062 4.510
DiabeticDaily-4B 0.8982 2.455
Δ −0.608 (+40.4% better)

Verdict: BEAT BASE ✅. A 4B that models diabetic/medical language ~40% better than base — and at Q4 it's ~2.6GB, running at usable speed on a Jetson with PHI never leaving the box.

How it was cooked

  • Base: Qwen/Qwen3.5-4B (Apache-2.0). Data: the same deeded OpenDiabetic corpus as the 27B/9B.
  • Recipe: LoRA r32/α16 on attn+mlp, LR 2e-5 (small-model tier), 0.7ep, early-stop. Merged bf16.

Run it on a Jetson (Q4 GGUF, ollama) — see the -GGUF companion repo

ollama create diabetic-daily -f Modelfile   # FROM diabeticedge-4b-q4_k_m.gguf
ollama run diabetic-daily "What's a good diabetic breakfast?"

This is the brain behind the LocalDiabetic edge node — sovereign, private, free.

The ladder: 🐝 27B anchor (+57%) → 🏠 9B home (+40.7%) → 🛏️ 4B edge (+40.4%)

⚠️ Not medical advice — diabetic lifestyle/education/organization only. Not a diagnosis. Emergencies → 911.

© 2026 Swarm and Bee LLC · opendiabetic.com · Apache-2.0 · We slow cook the truth. 🐝

Downloads last month
91
Safetensors
Model size
5B params
Tensor type
BF16
·
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for SwarmandBee/DiabeticDaily-4B

Finetuned
Qwen/Qwen3.5-4B
Quantized
(282)
this model