How to use Juhuu/slowestlooser-v6-4bit with MLX:

    # Make sure mlx-lm is installed
    # pip install --upgrade mlx-lm

    # Generate text with mlx-lm
    from mlx_lm import load, generate

    model, tokenizer = load("Juhuu/slowestlooser-v6-4bit")

    prompt = "Write a story about Einstein"
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

    text = generate(model, tokenizer, prompt=prompt, verbose=True)

How to use Juhuu/slowestlooser-v6-4bit with Pi:
Start the MLX server

    # Install MLX LM:
    uv tool install mlx-lm
    # Start a local OpenAI-compatible server:
    mlx_lm.server --model "Juhuu/slowestlooser-v6-4bit"

Configure the model in Pi

    # Install Pi:
    npm install -g @mariozechner/pi-coding-agent

Add to ~/.pi/agent/models.json:

    {
      "providers": {
        "mlx-lm": {
          "baseUrl": "http://localhost:8080/v1",
          "api": "openai-completions",
          "apiKey": "none",
          "models": [
            { "id": "Juhuu/slowestlooser-v6-4bit" }
          ]
        }
      }
    }

Run Pi

    # Start Pi in your project directory:
    pi

How to use Juhuu/slowestlooser-v6-4bit with Hermes Agent:
Start the MLX server

    # Install MLX LM:
    uv tool install mlx-lm
    # Start a local OpenAI-compatible server:
    mlx_lm.server --model "Juhuu/slowestlooser-v6-4bit"

Configure Hermes

    # Install Hermes:
    curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
    hermes setup

    # Point Hermes at the local server:
    hermes config set model.provider custom
    hermes config set model.base_url http://127.0.0.1:8080/v1
    hermes config set model.default Juhuu/slowestlooser-v6-4bit

Run Hermes

    hermes

How to use Juhuu/slowestlooser-v6-4bit with MLX LM:
Generate or start a chat session

    # Install MLX LM
    uv tool install mlx-lm
    # Interactive chat REPL
    mlx_lm.chat --model "Juhuu/slowestlooser-v6-4bit"

Run an OpenAI-compatible server

    # Install MLX LM
    uv tool install mlx-lm
    # Start the server
    mlx_lm.server --model "Juhuu/slowestlooser-v6-4bit"

    # Calling the OpenAI-compatible server with curl
    # (mlx_lm.server listens on port 8080 unless --port is given)
    curl -X POST "http://localhost:8080/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "Juhuu/slowestlooser-v6-4bit",
        "messages": [
          {"role": "user", "content": "Hello"}
        ]
      }'

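The same chat-completions call can be made from Python with only the standard library. This is a minimal sketch, not part of the official snippets: the helper names `build_chat_request` and `chat` are illustrative, and the default base URL assumes the server was started with `mlx_lm.server` on its default port (8080, as in the Pi and Hermes configs above).

```python
import json
import urllib.request


def build_chat_request(user_text,
                       base_url="http://localhost:8080/v1",
                       model="Juhuu/slowestlooser-v6-4bit"):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text, **kwargs):
    """Send the request and return the assistant message text.

    Requires a running server; the response layout follows the
    OpenAI chat-completions format.
    """
    with urllib.request.urlopen(build_chat_request(user_text, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("Hello")` returns the model's reply as a string.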
SlowestLooser FoodJSON Unified v6
QLoRA fine-tune of Qwen/Qwen3-0.6B for the SlowestLooser on-device
calorie-tracking iOS app. Handles food, drink, and activity queries
under one unified JSON schema. Fused + 4-bit MLX bundle, ~330 MB.
What this model does
Pure RAG → JSON transformer. The iOS app pre-resolves each
food/drink/activity item against the on-device DB and passes the
pre-scaled values to the model in a `DB (use verbatim):` block. The model
copies those values into the schema; it never invents numbers.
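A rough sketch of how a caller might assemble such a prompt. Only the `DB (use verbatim):` header comes from this card; the line layout inside the block, the `build_prompt` name, and the example values are assumptions for illustration, and the real app may format the block differently.

```python
def build_prompt(user_input, resolved_items):
    """Append pre-resolved DB values to the user's text.

    resolved_items: list of dicts already scaled by the app, e.g.
    {"type": "food", "name": "apple", "grams": 150, "calories": 78}.
    The per-line "key=value" layout is hypothetical.
    """
    lines = [user_input, "", "DB (use verbatim):"]
    for item in resolved_items:
        fields = ", ".join(f"{key}={value}" for key, value in item.items())
        lines.append(f"- {fields}")
    return "\n".join(lines)


example = build_prompt(
    "one apple",
    [{"type": "food", "name": "apple", "grams": 150, "calories": 78}],
)
```

The point of the design is that every number the model emits already appears in the prompt, so the output can be checked against the DB block after generation.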
Output schema

    {
      "name": "<user input verbatim>",
      "items": [
        {
          "type": "food" | "drink" | "activity",
          "name": "...",
          "grams": int, "volume_ml": int, "minutes": int,
          "calories": int,
          "protein": float, "carbs": float, "sugar": float,
          "fat": float, "saturated_fat": float, "salt": float
        }
      ]
    }

- food: grams > 0, volume_ml = 0, minutes = 0
- drink: volume_ml > 0, grams = 0, minutes = 0
- activity: minutes > 0, all macros = 0
- garbage / sarcasm / "skipped dinner": items: []
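The per-type rules above can be checked mechanically on parsed model output. This is a sketch, not the app's validator: `item_errors` and `validate` are illustrative names, and treating "macros" as the six nutrient fields (excluding calories) is an assumption.

```python
# Assumption: "all macros" means these six nutrient fields, not calories.
MACROS = ("protein", "carbs", "sugar", "fat", "saturated_fat", "salt")


def item_errors(item):
    """Return a list of rule violations for one entry of items[]."""
    errors = []
    kind = item.get("type")
    grams = item.get("grams", 0)
    volume_ml = item.get("volume_ml", 0)
    minutes = item.get("minutes", 0)

    if kind == "food":
        if not (grams > 0 and volume_ml == 0 and minutes == 0):
            errors.append("food needs grams > 0, volume_ml = 0, minutes = 0")
    elif kind == "drink":
        if not (volume_ml > 0 and grams == 0 and minutes == 0):
            errors.append("drink needs volume_ml > 0, grams = 0, minutes = 0")
    elif kind == "activity":
        if minutes <= 0:
            errors.append("activity needs minutes > 0")
        if any(item.get(macro, 0) != 0 for macro in MACROS):
            errors.append("activity needs all macros = 0")
    else:
        errors.append(f"unknown type: {kind!r}")
    return errors


def validate(output):
    """Validate a full output object; an empty items list is valid (refusal)."""
    errors = []
    for index, item in enumerate(output.get("items", [])):
        errors.extend(f"items[{index}]: {e}" for e in item_errors(item))
    return errors
```

An empty result from `validate` means the output respects the type rules; it does not check that the numbers match the DB block.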
Training
| Setting | Value |
|---|---|
| Method | Unsloth QLoRA on Lambda A100 SXM4 40GB |
| Iterations | 3000 |
| Batch size | 16 |
| LR schedule | linear warmup 50 steps → 1e-4, cosine decay to 1e-5 |
| Optimizer | AdamW (8-bit via bitsandbytes) |
| LoRA rank | 32 |
| LoRA scale | 20.0 |
| Target modules | q/k/v/o + gate/up/down proj |
| Trainable params | ~20M (3.5% of base) |
| Wall time | 27 min |
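The LR schedule row can be written out as a function of the step index. This sketch assumes the warmup starts from 0 and that the cosine phase spans steps 50 to 3000; the card does not state either detail, and the trainer's exact implementation may differ.

```python
import math

PEAK_LR = 1e-4       # reached at the end of warmup
FINAL_LR = 1e-5      # floor of the cosine decay
WARMUP_STEPS = 50
TOTAL_STEPS = 3000   # "Iterations" in the table


def learning_rate(step):
    """Linear warmup to PEAK_LR, then cosine decay to FINAL_LR."""
    if step < WARMUP_STEPS:
        # Assumption: warmup ramps linearly from 0.
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return FINAL_LR + (PEAK_LR - FINAL_LR) * cosine
```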
Eval — v6 adversarial Tier B/C suite (122 records, frozen)
Tier B (in-distribution + near-OOD): 68.9% pass
Tier C (garbage / sarcasm / OOD): 71.9% pass
Cross-version comparison (same eval suite):
| Model | Tier B pass | Tier C pass | Garbage refusal |
|---|---|---|---|
| base Qwen3-0.6B (no FT) | 8.9% | 15.6% | 25.0% |
| v6 Qwen3-0.6B (this) | 68.9% | 71.9% | 100.0% |
| v6 Gemma 3 270M | 44.4% | 21.9% | 25.0% |
| v6 Qwen3.5-2B | 57.8% | 65.6% | 87.5% |
| v5 Qwen3-1.7B (predecessor) | 76.7% | 37.5% | 25.0% |
vs v5: 7.8 pp drop on Tier B (76.7% → 68.9%), but +34.4 pp on Tier C garbage/sarcasm/OOD (37.5% → 71.9%), garbage refusal up from 25.0% to 100.0%, and a 60% smaller on-device footprint.
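The percentage-point deltas against v5 can be recomputed directly from the comparison table. The dictionary below only restates two table rows; the `delta_pp` helper is illustrative.

```python
# Pass rates in percent, copied from the cross-version table.
RESULTS = {
    "v6 Qwen3-0.6B": {"tier_b": 68.9, "tier_c": 71.9, "garbage_refusal": 100.0},
    "v5 Qwen3-1.7B": {"tier_b": 76.7, "tier_c": 37.5, "garbage_refusal": 25.0},
}


def delta_pp(new, old):
    """Percentage-point change per metric, rounded to one decimal."""
    return {metric: round(new[metric] - old[metric], 1) for metric in new}


deltas = delta_pp(RESULTS["v6 Qwen3-0.6B"], RESULTS["v5 Qwen3-1.7B"])
```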
Dataset
- 6190 records total (5473 train + 617 holdout + 100 optuna_eval)
- Sources: 131-entry Python curated catalog + 24,853 cleaned OFF entries + 90 negative examples (garbage/sarcasm/negation)
- Distribution: 49% food / 16% drink / 20% activity / 16% mixed-2-3 / 13% multi-3-5 / 3% multi-6+ / 1.5% negative
Predecessor lineage
- v1, v2: food-only / two-schema food+activity (1.7B)
- v3, v4: abandoned (architectural attempts)
- v5: first unified items[] schema (1.7B, val 0.040). Currently published as Juhuu/slowestlooser-v5-4bit; stays available as legacy.
- v6: first model with negative-example training; smaller footprint (0.6B); garbage refusal up from 25.0% (v5) to 100.0%. This model.
Known limitations (deferred to v7)
- "2 Eier", "3 Brötchen": count units not extracted (0% pass)
- 6+ ingredient lists collapse (0% pass)
- "intensives Joggen": intensity adverbs not modulating MET (0% pass)
- Swiss German activity dialect ("Wandere im Bärgli": 0% pass)
- Qwen3.5-2B fine-tune attempted but didn't lift over base (worth v7 retry on larger GPU)
Notes on the MLX bundle
- Fused fp16 → quantized to 4-bit MLX (group size 64, ~4.5 bits/weight)
- config.json was patched at conversion time to expose rope_theta at top level (Qwen3 under transformers 5.x uses a rope_parameters dict; mlx_lm 0.31.3 doesn't yet read that)
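A patch of that kind might look like the sketch below. It is not the conversion script used for this bundle: the function names are illustrative, and it assumes the value sits under a `rope_theta` key inside `rope_parameters`.

```python
import json
from pathlib import Path


def expose_rope_theta(config):
    """Return a copy of config with rope_theta hoisted to the top level.

    Assumption: the nested dict stores the value under "rope_theta".
    An existing top-level rope_theta is left untouched.
    """
    patched = dict(config)
    rope_params = patched.get("rope_parameters") or {}
    if "rope_theta" not in patched and "rope_theta" in rope_params:
        patched["rope_theta"] = rope_params["rope_theta"]
    return patched


def patch_config_file(path):
    """Rewrite a config.json in place with the hoisted key."""
    path = Path(path)
    config = json.loads(path.read_text())
    path.write_text(json.dumps(expose_rope_theta(config), indent=2))
```

Keeping the nested dict in place means newer loaders that read `rope_parameters` still work, while loaders that only look for a top-level `rope_theta` find it too.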