What Changed: Parkinson's caregiver-note extractor (fine-tuned MiniCPM5-1B)
A 1.08B-parameter model fine-tuned to read a family caregiver's daily free-text note about a parent with Parkinson's disease and return structured symptom fields as JSON. Built for the Build Small Hackathon and designed to run fully locally via llama.cpp / GGUF; the whole point is that health data logged at home never leaves the device.
▶️ Try it live: the What Changed app on Spaces offers daily logging (typed or spoken), trend charts, and a one-page doctor report, all running this model on-device.
Not medical advice / not a medical device. It organizes a caregiver's own day-to-day observations into a structured form to share with a clinician. It does not diagnose, predict, or recommend treatment.
What it does
Input (a short caregiver note):
"Mom moved with more struggle today, froze up twice, and her meds wore off before lunch."
Output (compact JSON over a fixed schema):
{"mobility": 2, "freezing_episodes": 2, "off_episodes": 1}
- Six daily ratings, 1–5 where 5 = best/healthiest: mobility, tremor, stiffness, mood, sleep, alertness.
- Six event counts: off_episodes, dyskinesia_spells, freezing_episodes, falls, missed_or_late_meds, hallucinations.
Only fields the note actually mentions appear. The schema is aligned with the Hauser PD Home Diary (ON/OFF + dyskinesia) and MDS-UPDRS motor/non-motor items. The model does exactly one narrow job (note → JSON); all trend detection is deterministic Python downstream, so the model never touches the reasoning that drives the report.
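As a rough illustration of the schema described above, the sketch below checks a model output before it is handed to downstream code. The field names come from this card; the function name `validate_extraction` and the policy of silently dropping unknown or out-of-range fields are assumptions made for the example, not the project's actual parser.

```python
import json

# Field names are taken from the model card; the validation policy is an assumption.
RATING_FIELDS = {"mobility", "tremor", "stiffness", "mood", "sleep", "alertness"}
COUNT_FIELDS = {
    "off_episodes", "dyskinesia_spells", "freezing_episodes",
    "falls", "missed_or_late_meds", "hallucinations",
}


def validate_extraction(raw: str) -> dict:
    """Parse model output and keep only well-formed schema fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    clean = {}
    for key, value in data.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            continue
        if key in RATING_FIELDS and 1 <= value <= 5:
            clean[key] = value  # daily rating on the 1-5 scale
        elif key in COUNT_FIELDS and value >= 0:
            clean[key] = value  # non-negative event count
    return clean


result = validate_extraction(
    '{"mobility": 2, "freezing_episodes": 2, "off_episodes": 1}'
)
print(result)
```

Dropping bad fields rather than raising keeps one malformed value from discarding an otherwise usable day of observations; a stricter application might prefer to reject the whole note.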
Results: field-F1 (exact / within ±1)
| Eval set | exact | tol (±1) |
|---|---|---|
| In-distribution (200 held-out) | 0.97 | 0.99 |
| Out-of-distribution probe (25 deliberately-adversarial styles) | 0.83 | 0.91 |
| Base MiniCPM5-1B, zero-shot | ~0.00 | ~0.08 |
The base model cannot do the task untrained; the capability comes entirely from fine-tuning. (±1 calibration on a 1–5 scale is subjective and irrelevant to trend detection, so the tol column is the one to trust.)
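The card does not spell out how field-F1 is computed, so the following is only one plausible reading: a micro-averaged F1 over (field, value) pairs, where a predicted field counts as correct if the gold label has the same field and the values differ by at most `tol`. The function name `field_f1` and the choice to apply the tolerance uniformly to ratings and counts are assumptions.

```python
def field_f1(preds, golds, tol=0):
    """Micro-averaged F1 over fields; tol=0 is exact match, tol=1 is within +/-1."""
    tp = fp = fn = 0
    for pred, gold in zip(preds, golds):
        for field, value in pred.items():
            if field in gold and abs(value - gold[field]) <= tol:
                tp += 1  # predicted field present in gold with a close-enough value
            else:
                fp += 1  # spurious field or wrong value
        for field, value in gold.items():
            if field not in pred or abs(pred[field] - value) > tol:
                fn += 1  # gold field missed or predicted with a wrong value
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


golds = [{"mobility": 2, "freezing_episodes": 2, "off_episodes": 1}]
preds = [{"mobility": 3, "freezing_episodes": 2}]
print(field_f1(preds, golds, tol=0))
print(field_f1(preds, golds, tol=1))
```

In this toy case the rating that is off by one is penalised under exact match but accepted under the tolerance, which is the difference the two table columns capture.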
Training
- Method: LoRA (r=16, α=16) via Unsloth, bf16, 2 epochs, ~80 s on a single A10G. Same RAW-completion prompt/parser at train, eval, and serve.
- Data: 1,296 synthetic note → label pairs, generated label-first (sample a ground-truth label, then have a teacher write a faithful caregiver note, so labels are correct by construction). Teacher: Claude Sonnet 4.6, under a strict third-person diary contract, plus a style-diverse robustness top-up (negation, terse/txt, slang, idioms).
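The label-first idea can be sketched as follows: draw a random subset of schema fields with random values, then ask a teacher model to write a note that conveys exactly that label. Everything here beyond the field names is hypothetical, including `sample_label`, `teacher_request`, the cap of four fields per note, the count range, and the wording of the request; the project's real sampler and teacher prompt are not shown in this card.

```python
import json
import random

RATING_FIELDS = ["mobility", "tremor", "stiffness", "mood", "sleep", "alertness"]
COUNT_FIELDS = [
    "off_episodes", "dyskinesia_spells", "freezing_episodes",
    "falls", "missed_or_late_meds", "hallucinations",
]


def sample_label(rng: random.Random, max_fields: int = 4) -> dict:
    """Sample a ground-truth label first; the note is written to match it."""
    chosen = rng.sample(RATING_FIELDS + COUNT_FIELDS, rng.randint(1, max_fields))
    label = {}
    for name in chosen:
        if name in RATING_FIELDS:
            label[name] = rng.randint(1, 5)   # rating on the 1-5 scale
        else:
            label[name] = rng.randint(0, 3)   # small event count (assumed range)
    return label


def teacher_request(label: dict) -> str:
    """Hypothetical instruction sent to the teacher model for one label."""
    return (
        "Write a short third-person caregiver diary note that conveys exactly "
        "these observations and nothing else: " + json.dumps(label, sort_keys=True)
    )


rng = random.Random(0)
label = sample_label(rng)
print(label)
print(teacher_request(label))
```

Because the label exists before the note does, there is no annotation step that could introduce labeling errors; the remaining risk is a teacher note that is unfaithful to the label.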
Running locally
Quantizations: field-F1 measured through llama.cpp on 80 held-out notes (the real serving path):
| file | size | exact | tol (±1) |
|---|---|---|---|
| MiniCPM5-1B.Q8_0.gguf (recommended) | 1.15 GB | 0.95 | 0.99 |
| MiniCPM5-1B.Q4_K_M.gguf (smaller) | 0.69 GB | 0.90 | 0.98 |
Q8_0 is the pick: it matches full precision (F16) at half the size. Q4_K_M trades ~5 points of exact-match for a ~0.46 GB saving, if you need to run on a very constrained device.
# llama.cpp CLI
llama-cli -m MiniCPM5-1B.Q8_0.gguf -n 96 -p "<build_extract_prompt(note)>"
# llama-cpp-python: RAW completion, matches training/eval/serve
from llama_cpp import Llama
llm = Llama(model_path="MiniCPM5-1B.Q8_0.gguf", n_ctx=1024)
out = llm.create_completion(build_extract_prompt(note), max_tokens=96, temperature=0.0)
Recommended: constrain decoding with the project's GBNF grammar so output is always valid schema JSON (correct keys, integer 1–5 ratings and counts). This also fixes rare out-of-schema event slang at decode time.
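The project's own grammar lives in its repo and is not reproduced here. As a minimal sketch of what such a constraint could look like, the snippet below assembles a GBNF grammar string from the field names on this card. The rule names, the two-digit cap on counts, and the whitespace handling are assumptions, and this simple version does not prevent a key from appearing twice.

```python
RATING_FIELDS = ["mobility", "tremor", "stiffness", "mood", "sleep", "alertness"]
COUNT_FIELDS = [
    "off_episodes", "dyskinesia_spells", "freezing_episodes",
    "falls", "missed_or_late_meds", "hallucinations",
]


def key_literal(name: str) -> str:
    """GBNF string literal that matches the JSON key including its quotes."""
    return '"\\"' + name + '\\""'


def build_grammar() -> str:
    """Assemble an illustrative GBNF grammar for the note -> JSON schema."""
    rating_keys = " | ".join(key_literal(n) for n in RATING_FIELDS)
    count_keys = " | ".join(key_literal(n) for n in COUNT_FIELDS)
    rules = [
        'root ::= "{" ws ( pair ( "," ws pair )* )? ws "}"',
        "pair ::= rating-pair | count-pair",
        'rating-pair ::= rating-key ":" ws [1-5]',
        'count-pair ::= count-key ":" ws count',
        "rating-key ::= " + rating_keys,
        "count-key ::= " + count_keys,
        'count ::= "0" | [1-9] [0-9]?',  # assumed cap: counts below 100
        'ws ::= " "?',
    ]
    return "\n".join(rules) + "\n"


GRAMMAR = build_grammar()
print(GRAMMAR)
```

With llama-cpp-python, a string like this can be compiled with `LlamaGrammar.from_string(GRAMMAR)` and passed as the `grammar=` argument of `create_completion`; with the llama.cpp CLI it can be saved to a file and supplied via `--grammar-file`.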
Prompt format
RAW completion (no chat template): the app's build_extract_prompt(note) is fed directly and
the model emits the JSON object. The exact template + parser live in the project repo.
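Since the model is used in RAW completion mode, the caller has to pull the JSON object out of the generated text. The exact parser is in the project repo; the function below is only a hedged stand-in, with the hypothetical name `extract_first_json_object`, that returns the first decodable JSON object in the completion and ignores any trailing text.

```python
import json


def extract_first_json_object(text: str):
    """Return the first JSON object found in a raw completion, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # tolerates extra text after it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None


completion = ' {"mobility": 2, "freezing_episodes": 2, "off_episodes": 1}\nextra text'
print(extract_first_json_object(completion))
```

When decoding is constrained by a grammar this fallback scanning is rarely needed, but it keeps the serving path tolerant of stray tokens before or after the object.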
Limitations
- Trained on synthetic data and never validated on real caregiver notes (real notes are exactly the PII the project refuses to collect). The OOD probe is the best available proxy.
- Very abbreviated or heavily typo'd phrasings are the weakest spot; out-of-schema event slang is handled at serve time by the GBNF grammar, not the weights.
- English only.
Model tree for zeon01/what-changed-1b
Base model
openbmb/MiniCPM5-1B