Instructions to use Juhuu/slowestlooser-v5-4bit with libraries and local apps.

Libraries

How to use Juhuu/slowestlooser-v5-4bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("Juhuu/slowestlooser-v5-4bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

How to use Juhuu/slowestlooser-v5-4bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Juhuu/slowestlooser-v5-4bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Juhuu/slowestlooser-v5-4bit"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Juhuu/slowestlooser-v5-4bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Juhuu/slowestlooser-v5-4bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Juhuu/slowestlooser-v5-4bit

Run Hermes

hermes

OpenClaw

How to use Juhuu/slowestlooser-v5-4bit with OpenClaw:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Juhuu/slowestlooser-v5-4bit"

Configure OpenClaw

# Install OpenClaw:
npm install -g openclaw@latest
# Register the local server and set it as the default model:
openclaw onboard --non-interactive --mode local \
  --auth-choice custom-api-key \
  --custom-base-url http://127.0.0.1:8080/v1 \
  --custom-model-id "Juhuu/slowestlooser-v5-4bit" \
  --custom-provider-id mlx-lm \
  --custom-compatibility openai \
  --custom-text-input \
  --accept-risk \
  --skip-health

Run OpenClaw

openclaw agent --local --agent main --message "Hello from Hugging Face"

MLX LM

How to use Juhuu/slowestlooser-v5-4bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "Juhuu/slowestlooser-v5-4bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "Juhuu/slowestlooser-v5-4bit"
# Call the OpenAI-compatible server with curl
# (mlx_lm.server listens on port 8080 unless --port is given)
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "Juhuu/slowestlooser-v5-4bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
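
The same request can be sent from Python with only the standard library. This is a minimal sketch, not part of mlx-lm: `build_chat_request` and `chat` are illustrative helper names, and the default base URL assumes the server was started without `--host` or `--port` overrides, in which case it listens on port 8080 (the port the Pi, Hermes, and OpenClaw configurations above point at).

```python
import json
import urllib.request


def build_chat_request(prompt, model="Juhuu/slowestlooser-v5-4bit",
                       base_url="http://localhost:8080/v1"):
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # OpenAI-style response shape: the reply is in the first choice.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("Hello")` returns the generated text. Pass `base_url=` if the server runs on a different host or port.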

SlowestLooser FoodJSON Unified v5

QLoRA fine-tune of mlx-community/Qwen3-1.7B-4bit for the SlowestLooser on-device calorie-tracking app. Handles food, drink, and activity queries under one unified JSON schema.

What this model does

The model acts purely as a RAG → JSON transformer: given a user query plus a pre-resolved DB block, it copies values verbatim into the output schema. The iOS app pre-computes grams, volume_ml, minutes, and calories and passes them in the prompt, so the model is never asked to compute or invent numbers.

Output schema

{
  "name": "<user input verbatim>",
  "items": [
    {
      "type": "food" | "drink" | "activity",
      "name": "...",
      "grams": int, "volume_ml": int, "minutes": int,
      "calories": int,
      "protein": float, "carbs": float, "sugar": float,
      "fat": float, "saturated_fat": float, "salt": float
    }
  ]
}

food: grams > 0, volume_ml = 0, minutes = 0
drink: volume_ml > 0, grams = 0, minutes = 0
activity: minutes > 0, all macros = 0
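
The per-type rules above lend themselves to a mechanical check on the model's output. The following is a rough sketch rather than the app's actual validator: the function names are hypothetical, it assumes every item carries all twelve fields shown in the schema, and it reads "all macros" as the six float fields (protein through salt). Rules the card does not state, such as whether an activity must have grams = 0, are deliberately not enforced.

```python
import json

INT_FIELDS = ("grams", "volume_ml", "minutes", "calories")
MACROS = ("protein", "carbs", "sugar", "fat", "saturated_fat", "salt")


def validate_item(item):
    """Return a list of rule violations for one entry of "items"."""
    if not isinstance(item, dict):
        return ["item is not a JSON object"]
    missing = [k for k in ("type", "name") + INT_FIELDS + MACROS if k not in item]
    if missing:
        return [f"missing field: {k}" for k in missing]
    if not all(isinstance(item[k], (int, float)) for k in INT_FIELDS + MACROS):
        return ["non-numeric quantity or macro value"]

    kind = item["type"]
    if kind == "food":
        ok = item["grams"] > 0 and item["volume_ml"] == 0 and item["minutes"] == 0
    elif kind == "drink":
        ok = item["volume_ml"] > 0 and item["grams"] == 0 and item["minutes"] == 0
    elif kind == "activity":
        ok = item["minutes"] > 0 and all(item[m] == 0 for m in MACROS)
    else:
        return [f"unknown type: {kind!r}"]
    return [] if ok else [f"quantity rule violated for type {kind!r}"]


def validate_output(text):
    """Parse raw model output and return all violations (empty list = valid)."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    errors = []
    if not isinstance(data.get("name"), str):
        errors.append('"name" must be a string')
    items = data.get("items")
    if not isinstance(items, list):
        return errors + ['"items" must be a list']
    for index, item in enumerate(items):
        errors += [f"items[{index}]: {e}" for e in validate_item(item)]
    return errors
```

`validate_output` raises `json.JSONDecodeError` on unparseable text, so a caller can tell "not JSON" apart from "JSON that breaks a rule".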

Training

Setting	Value
Method	QLoRA via mlx-lm on M4 Pro
Iterations	2000
Batch size	4
LR schedule	linear warmup 0 → 1e-4 over 50 iters, cosine decay → 1e-5 by iter 2000
Optimizer	AdamW
LoRA rank	32
LoRA scale	20.0
LoRA target layers	last 16 attention layers
Max seq length	2048
Trainable params	19.92M (1.158% of base)
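
For readers who want to see the shape of the LR schedule in the table, here is a small pure-Python sketch of linear warmup followed by cosine decay. It is an illustration under stated assumptions, not the mlx-lm scheduler used for training: the function name is made up, and it assumes the cosine phase spans iterations 50 to 2000 and that steps are counted from 0.

```python
import math


def learning_rate(step, peak=1e-4, floor=1e-5, warmup=50, total=2000):
    """LR at a given iteration: linear 0 -> peak, then cosine peak -> floor."""
    if step < warmup:
        # Linear warmup from 0 up to the peak learning rate.
        return peak * step / warmup
    # Fraction of the decay phase completed, clamped to 1.0 after `total`.
    progress = min(1.0, (step - warmup) / (total - warmup))
    # Half-cosine from peak (progress = 0) down to floor (progress = 1).
    return floor + 0.5 * (peak - floor) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 25, 50, 500, 1000, 1500, 2000):
        print(step, f"{learning_rate(step):.2e}")
```

Unlike the flat 2e-4 used for v4, this schedule is already well below its 1e-4 peak by the iteration range where v4 diverged.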

Loss

Iter	Val loss
1	2.518
500	0.041
1000	0.043
1500	0.047
2000	0.040

Eval (holdout, 475 records)

Metric	Value
parse_rate	100.0%
top1_acc	100.0%
quantity_acc	100.0%
multi_recall	100.0%
avg latency	1.63s
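
The card does not define these metrics. As one plausible reading of parse_rate only, the sketch below counts the fraction of raw outputs that parse as a JSON object; the function name and the "must be an object" condition are assumptions, and top1_acc, quantity_acc, and multi_recall are left out because their definitions are not given here.

```python
import json


def parse_rate(outputs):
    """Fraction of raw model outputs that parse as a JSON object."""
    if not outputs:
        return 0.0
    parsed = 0
    for text in outputs:
        try:
            if isinstance(json.loads(text), dict):
                parsed += 1
        except json.JSONDecodeError:
            pass  # Truncated or malformed output counts as a parse failure.
    return parsed / len(outputs)
```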

Predecessors

v1: food-only fine-tune of Qwen3-1.7B-4bit, 1000 iters, val loss 0.053. Superseded by v2.
v2: combined food + activity with separate schemas (Juhuu/slowestlooser-v2-4bit), 1500 iters, val loss 0.046. Currently in iOS production.
v3: abandoned; food-side fixes regressed activity quality at a fixed LR.
v4: same unified schema as v5 but with a flat LR of 2e-4. Training diverged at iter 1475 (train loss 0.5 → 2.1 in 25 iters), and the best saved snapshot, from iter 1250, was unshippable (eval parse_rate 42.3%).
v5: cosine LR decay with warmup fixed the divergence. Production candidate.

Caveats

Holdout is drawn from the same distribution as training, so 100% on holdout does NOT mean 100% in production. The real test is the iOS-side QualitySpec.all matrix sweep (60 hand-crafted prompts) on a real device.
The system prompt and RAG-block format must be byte-equal between training and runtime. Source of truth: prompt_v4.py::SYSTEM_V4 in the slowestlooser-finetune repo.
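
Because the training side is Python and the runtime side is Swift, one way to guard the byte-equality requirement is to pin a digest of the prompt in both codebases and compare. The helpers below are a hypothetical sketch of that idea, not code from the slowestlooser-finetune repo; they assume both sides encode the prompt as UTF-8.

```python
import hashlib


def prompt_digest(prompt: str) -> str:
    """SHA-256 of the UTF-8 bytes; any whitespace change alters the digest."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def first_mismatch(expected: str, actual: str):
    """Byte offset of the first difference, or None if byte-equal."""
    a, b = expected.encode("utf-8"), actual.encode("utf-8")
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # Same prefix: either identical, or one string is longer than the other.
    return None if len(a) == len(b) else min(len(a), len(b))
```

A test on each side can then assert that the digest of its copy of the system prompt equals one shared constant, and `first_mismatch` helps locate a stray newline or changed character when it does not.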

Usage

The model is used by the SlowestLooser iOS app via mlx-swift-lm. The iOS-side pipeline:

user query → IngredientSplitter → per-part DB lookup (curated catalog ⊕ OFF VectorDB) → UnifiedRAGBlockBuilder → model.generate → JSON parse

Safetensors
Model size: 0.3B params
Tensor type: BF16, U32

MLX
Hardware compatibility: 4-bit

Model tree for Juhuu/slowestlooser-v5-4bit

Base model: Qwen/Qwen3-1.7B-Base
Finetuned: Qwen/Qwen3-1.7B
Quantized: mlx-community/Qwen3-1.7B-4bit
Adapter (4): this model