How to use from llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf zeon01/what-changed-1b:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf zeon01/what-changed-1b:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf zeon01/what-changed-1b:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf zeon01/what-changed-1b:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf zeon01/what-changed-1b:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf zeon01/what-changed-1b:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf zeon01/what-changed-1b:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf zeon01/what-changed-1b:Q4_K_M
Use Docker
docker model run hf.co/zeon01/what-changed-1b:Q4_K_M

What Changed: Parkinson's caregiver-note extractor (fine-tuned MiniCPM5-1B)

A 1.08B-parameter model fine-tuned to read a family caregiver's daily free-text note about a parent with Parkinson's disease and return structured symptom fields as JSON. Built for the Build Small Hackathon and designed to run fully locally via llama.cpp / GGUF: the whole point is that health data logged at home never leaves the device.

โ–ถ๏ธ Try it live: the What Changed app on Spaces โ€” daily logging (typed or spoken), trend charts, and a one-page doctor report, all running this model on-device.

Not medical advice / not a medical device. It organizes a caregiver's own day-to-day observations into a structured form to share with a clinician. It does not diagnose, predict, or recommend treatment.

What it does

Input (a short caregiver note):

"Mom moved with more struggle today, froze up twice, and her meds wore off before lunch."

Output (compact JSON over a fixed schema):

{"mobility": 2, "freezing_episodes": 2, "off_episodes": 1}
  • Six daily ratings, 1–5 where 5 = best/healthiest: mobility, tremor, stiffness, mood, sleep, alertness.
  • Six event counts: off_episodes, dyskinesia_spells, freezing_episodes, falls, missed_or_late_meds, hallucinations.

Only fields the note actually mentions appear. The schema is aligned with the Hauser PD Home Diary (ON/OFF + dyskinesia) and MDS-UPDRS motor/non-motor items. The model does exactly one narrow job (note → JSON); all trend detection is deterministic Python downstream, so the model never touches the reasoning that drives the report.
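
The schema rules above can be checked mechanically before anything reaches the trend logic. The following is a minimal sketch, not the project's parser: `validate_extraction` is a hypothetical helper, and treating counts as non-negative integers is an assumption, since the card only says "counts".

```python
import json

# Field names are taken from the schema described above.
RATING_FIELDS = {"mobility", "tremor", "stiffness", "mood", "sleep", "alertness"}
COUNT_FIELDS = {
    "off_episodes", "dyskinesia_spells", "freezing_episodes",
    "falls", "missed_or_late_meds", "hallucinations",
}


def validate_extraction(raw_json: str) -> dict:
    """Parse model output and check it against the fixed schema (hypothetical helper)."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, value in data.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"{key}: expected an integer, got {value!r}")
        if key in RATING_FIELDS:
            if not 1 <= value <= 5:
                raise ValueError(f"{key}: rating must be 1-5, got {value}")
        elif key in COUNT_FIELDS:
            # Assumption: event counts are non-negative.
            if value < 0:
                raise ValueError(f"{key}: count must be non-negative, got {value}")
        else:
            raise ValueError(f"unknown field: {key}")
    return data


result = validate_extraction('{"mobility": 2, "freezing_episodes": 2, "off_episodes": 1}')
print(result)
```

Fields absent from the note are simply absent from the dict, so downstream code would use `result.get("falls")` rather than assume every key exists.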

Results: field-F1 (exact / within ±1)

| Eval set | exact | tol (±1) |
|---|---|---|
| In-distribution (200 held-out) | 0.97 | 0.99 |
| Out-of-distribution probe (25 deliberately adversarial styles) | 0.83 | 0.91 |
| Base MiniCPM5-1B, zero-shot | ~0.00 | ~0.08 |

The base model can't do the task untrained; the capability comes entirely from fine-tuning. (Whether a rating is off by one on a 1–5 scale is subjective and does not affect trend detection, so the tol column is the one to trust.)
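
The card does not spell out how field-F1 is computed, so the sketch below shows one plausible reading rather than the project's scorer: a predicted field is a hit when the gold label contains the same key and the values differ by at most `tol`, with precision and recall micro-averaged over all notes. The function name and this definition are assumptions.

```python
def field_f1(predictions, golds, tol=0):
    """Micro-averaged field-level F1 over paired lists of dicts.

    A predicted field is a hit when the gold label has the same key and the
    values differ by at most `tol` (tol=0 is exact match). This definition is
    an assumption; the project's own scorer may differ.
    """
    hits = n_pred = n_gold = 0
    for pred, gold in zip(predictions, golds):
        n_pred += len(pred)
        n_gold += len(gold)
        hits += sum(
            1 for key, value in pred.items()
            if key in gold and abs(value - gold[key]) <= tol
        )
    if hits == 0:
        return 0.0
    precision = hits / n_pred
    recall = hits / n_gold
    return 2 * precision * recall / (precision + recall)


golds = [{"mobility": 2, "freezing_episodes": 2, "off_episodes": 1}]
preds = [{"mobility": 3, "freezing_episodes": 2, "off_episodes": 1}]
print(field_f1(preds, golds, tol=0))  # mobility is off by one, so 2 of 3 fields match
print(field_f1(preds, golds, tol=1))  # all 3 fields fall within tolerance
```

Under this reading, a missing field lowers recall and a hallucinated field lowers precision, which is why a sparse schema is scored per field and not per note.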

Training

  • Method: LoRA (r=16, α=16) via Unsloth, bf16, 2 epochs, ~80 s on a single A10G. Same RAW-completion prompt/parser at train, eval, and serve.
  • Data: 1,296 synthetic note → label pairs, generated label-first: sample a ground-truth label, then have a teacher write a faithful caregiver note, so labels are correct by construction. Teacher: Claude Sonnet 4.6, under a strict third-person diary contract, plus a style-diverse robustness top-up (negation, terse/txt, slang, idioms).
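
The label-first idea can be illustrated with a small sampler. Everything here is a hypothetical sketch: `sample_label`, `teacher_request`, the field limits, the 0-to-3 count range, and the instruction wording are made up for illustration and are not the project's generation code.

```python
import json
import random

RATING_FIELDS = ["mobility", "tremor", "stiffness", "mood", "sleep", "alertness"]
COUNT_FIELDS = ["off_episodes", "dyskinesia_spells", "freezing_episodes",
                "falls", "missed_or_late_meds", "hallucinations"]


def sample_label(rng, max_fields=4, max_count=3):
    """Sample a sparse ground-truth label (hypothetical sampler and limits)."""
    n_fields = rng.randint(1, max_fields)
    chosen = rng.sample(RATING_FIELDS + COUNT_FIELDS, n_fields)
    label = {}
    for field in chosen:
        if field in RATING_FIELDS:
            label[field] = rng.randint(1, 5)          # ratings are 1-5
        else:
            label[field] = rng.randint(0, max_count)  # assumption: 0 covers "no falls today"
    return label


def teacher_request(label):
    """Build an illustrative instruction for the teacher model (wording is made up)."""
    return (
        "Write a short third-person caregiver diary note that conveys exactly "
        "these observations and nothing else: " + json.dumps(label, sort_keys=True)
    )


rng = random.Random(0)  # fixed seed so the sample is reproducible
label = sample_label(rng)
print(label)
print(teacher_request(label))
```

Because the label exists before the note is written, the training pair needs no separate annotation pass; the remaining risk is a teacher note that drifts from the label.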

Running locally

Quantizations: field-F1 measured through llama.cpp on 80 held-out notes (the real serving path):

| file | size | exact | tol (±1) |
|---|---|---|---|
| MiniCPM5-1B.Q8_0.gguf (✅ recommended) | 1.15 GB | 0.95 | 0.99 |
| MiniCPM5-1B.Q4_K_M.gguf (smaller) | 0.69 GB | 0.90 | 0.98 |

Q8_0 is the pick: it matches full precision (F16) at half the size. Q4_K_M gives up ~5 points of exact-match to save ~0.46 GB, if you need it on a very constrained device.

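# llama.cpp CLI (replace the placeholder with the string returned by build_extract_prompt(note))
llama-cli -m MiniCPM5-1B.Q8_0.gguf -n 96 -p "<build_extract_prompt(note)>"
# llama-cpp-python: RAW completion, matches training/eval/serve
# build_extract_prompt comes from the project repo; note is the caregiver's free-text note
from llama_cpp import Llama
llm = Llama(model_path="MiniCPM5-1B.Q8_0.gguf", n_ctx=1024)
out = llm.create_completion(build_extract_prompt(note), max_tokens=96, temperature=0.0)
print(out["choices"][0]["text"])  # the JSON object emitted by the model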

Recommended: constrain decoding with the project's GBNF grammar so output is always valid schema JSON (correct keys, integer 1–5 ratings and counts); this also fixes rare out-of-schema event slang at decode time.
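
To show what grammar-constrained decoding looks like in practice, here is a rough sketch that builds a small GBNF grammar from the schema's field names and passes it to llama-cpp-python. The grammar text is illustrative and is not the project's grammar file; `build_grammar` and `extract_with_grammar` are hypothetical names, the fixed `", "` and `": "` spacing is an assumption based on the example output above, and this simple grammar does not prevent a key from appearing twice.

```python
RATING_KEYS = ["mobility", "tremor", "stiffness", "mood", "sleep", "alertness"]
COUNT_KEYS = ["off_episodes", "dyskinesia_spells", "freezing_episodes",
              "falls", "missed_or_late_meds", "hallucinations"]


def build_grammar(rating_keys=RATING_KEYS, count_keys=COUNT_KEYS):
    """Return illustrative GBNF text; this is not the project's grammar file."""
    def alternatives(keys):
        # Each key becomes a GBNF literal for a quoted JSON string, e.g. "\"mobility\"".
        return " | ".join('"\\"' + key + '\\""' for key in keys)

    return "\n".join([
        'root ::= "{" ( pair ( ", " pair )* )? "}"',
        "pair ::= rating-pair | count-pair",
        'rating-pair ::= rating-key ": " [1-5]',
        'count-pair ::= count-key ": " count',
        "rating-key ::= " + alternatives(rating_keys),
        "count-key ::= " + alternatives(count_keys),
        'count ::= "0" | [1-9] [0-9]?',
    ])


def extract_with_grammar(llm, prompt):
    """Run a grammar-constrained completion on an existing llama_cpp.Llama instance."""
    # Imported lazily so build_grammar() can be inspected without llama-cpp-python installed.
    from llama_cpp import LlamaGrammar

    grammar = LlamaGrammar.from_string(build_grammar())
    out = llm.create_completion(prompt, max_tokens=96, temperature=0.0, grammar=grammar)
    return out["choices"][0]["text"]


print(build_grammar())
```

Listing the keys as grammar alternatives is what would keep out-of-schema event names from being emitted: the sampler can only pick tokens that extend one of the allowed key strings.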

Prompt format

RAW completion (no chat template): the app's build_extract_prompt(note) is fed directly and the model emits the JSON object. The exact template + parser live in the project repo.
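
Since a RAW completion is plain text, the caller has to pull the JSON object out of it. The sketch below is one defensive way to do that and is not the project's parser: `parse_completion` is a hypothetical helper that returns the first JSON object found and ignores anything the model emits after it.

```python
import json


def parse_completion(text):
    """Return the first JSON object found in raw completion text, or {} if none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return {}


print(parse_completion(' {"mobility": 2, "falls": 1}\nNote: extra text'))
```

Returning an empty dict on failure treats an unparseable completion as "nothing extracted"; raising instead may be the better choice if silent gaps in the log are unacceptable.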

Limitations

  • Trained on synthetic data and never validated on real caregiver notes (real notes are exactly the PII the project refuses to collect). The OOD probe is the best available proxy.
  • Very abbreviated or heavily typo'd phrasings are the weakest spot; out-of-schema event slang is handled at serve time by the GBNF grammar, not the weights.
  • English only.
GGUF · Model size: 1B params · Architecture: llama