SlowestLooser FoodJSON Unified v5

QLoRA fine-tune of mlx-community/Qwen3-1.7B-4bit for the SlowestLooser on-device calorie-tracking app. Handles food, drink, and activity queries under one unified JSON schema.

What this model does

The model is a pure RAG → JSON transformer: given a user query plus a pre-resolved DB block, it copies values verbatim into the output schema. The app pre-computes grams / volume_ml / minutes / calories iOS-side and passes them in the prompt, so the model only copies numbers and never has to compute or invent them.

Output schema

{
  "name": "<user input verbatim>",
  "items": [
    {
      "type": "food" | "drink" | "activity",
      "name": "...",
      "grams": int, "volume_ml": int, "minutes": int,
      "calories": int,
      "protein": float, "carbs": float, "sugar": float,
      "fat": float, "saturated_fat": float, "salt": float
    }
  ]
}
  • food: grams > 0, volume_ml = 0, minutes = 0
  • drink: volume_ml > 0, grams = 0, minutes = 0
  • activity: minutes > 0, all macros = 0
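
The per-type rules above can be checked mechanically after parsing. The following is a minimal sketch, not code from the app: `check_item` and `MACROS` are hypothetical names, and it assumes the quantity fields arrive as numbers and default to 0 when absent.

```python
# Hypothetical validator for one entry of "items"; not part of the app.
MACROS = ("protein", "carbs", "sugar", "fat", "saturated_fat", "salt")


def check_item(item: dict) -> list[str]:
    """Return the rule violations for one item (empty list if it is valid)."""
    errors = []
    kind = item.get("type")
    # Assumption: missing quantity fields are treated as 0.
    grams = item.get("grams", 0)
    volume_ml = item.get("volume_ml", 0)
    minutes = item.get("minutes", 0)

    if kind == "food":
        if not (grams > 0 and volume_ml == 0 and minutes == 0):
            errors.append("food: expected grams > 0, volume_ml = 0, minutes = 0")
    elif kind == "drink":
        if not (volume_ml > 0 and grams == 0 and minutes == 0):
            errors.append("drink: expected volume_ml > 0, grams = 0, minutes = 0")
    elif kind == "activity":
        if minutes <= 0:
            errors.append("activity: expected minutes > 0")
        if any(item.get(m, 0) != 0 for m in MACROS):
            errors.append("activity: expected all macros = 0")
    else:
        errors.append(f"unknown type: {kind!r}")
    return errors


# Illustrative values only, not taken from the app's database.
print(check_item({"type": "food", "grams": 150, "volume_ml": 0, "minutes": 0}))
print(check_item({"type": "activity", "minutes": 30, "protein": 1.0}))
```

A check like this would sit after the JSON parse step, so a malformed item is rejected before it reaches the tracker.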

Training

Setting             Value
Method              QLoRA via mlx-lm on M4 Pro
Iterations          2000
Batch size          4
LR schedule         linear warmup 0 → 1e-4 over 50 iters, cosine decay → 1e-5 by iter 2000
Optimizer           AdamW
LoRA rank           32
LoRA scale          20.0
LoRA target layers  last 16 attention layers
Max seq length      2048
Trainable params    19.92M (1.158% of base)
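
The LR schedule row can be written out as a function of the iteration. This is a sketch in plain Python rather than the actual mlx-lm configuration: `learning_rate` and the constant names are hypothetical, and it assumes zero-based iterations with the cosine phase spanning iters 50 to 2000.

```python
import math

PEAK_LR = 1e-4       # reached at the end of warmup
FINAL_LR = 1e-5      # reached at the last iteration
WARMUP_ITERS = 50
TOTAL_ITERS = 2000


def learning_rate(it: int) -> float:
    """Linear warmup from 0 to PEAK_LR, then cosine decay down to FINAL_LR."""
    if it < WARMUP_ITERS:
        return PEAK_LR * it / WARMUP_ITERS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (it - WARMUP_ITERS) / (TOTAL_ITERS - WARMUP_ITERS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return FINAL_LR + (PEAK_LR - FINAL_LR) * cosine


for it in (0, 25, 50, 1000, 2000):
    print(it, learning_rate(it))
```

Under these assumptions the rate at iter 1475, where v4 diverged at a flat 2e-4, is already well below the 1e-4 peak.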

Loss

Iter   Val loss
1      2.518
500    0.041
1000   0.043
1500   0.047
2000   0.040

Eval (holdout, 475 records)

Metric        Value
parse_rate    100.0%
top1_acc      100.0%
quantity_acc  100.0%
multi_recall  100.0%
avg latency   1.63s
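
The card does not define the metrics. As an illustration of the simplest one, the sketch below assumes parse_rate is the fraction of generations that decode as a JSON object containing an "items" list. The function name and this definition are assumptions, not the evaluation script used for the numbers above.

```python
import json


def parse_rate(outputs: list[str]) -> float:
    """Fraction of raw model outputs that parse as JSON with an "items" list."""
    if not outputs:
        return 0.0
    ok = 0
    for text in outputs:
        try:
            obj = json.loads(text)
        except json.JSONDecodeError:
            continue  # not valid JSON: counts as a parse failure
        if isinstance(obj, dict) and isinstance(obj.get("items"), list):
            ok += 1
    return ok / len(outputs)


# Illustrative strings only, not real model generations.
samples = ['{"name": "apple", "items": []}', "not json"]
print(parse_rate(samples))
```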

Predecessors

  • v1 — food-only fine-tune (Qwen3-1.7B-4bit), 1000 iters, val 0.053. Superseded by v2.
  • v2 — combined food + activity with separate schemas (Juhuu/slowestlooser-v2-4bit), 1500 iters, val 0.046. Currently in iOS production.
  • v3 — abandoned (food-side fixes regressed activity quality at fixed LR).
  • v4 — same unified schema as v5 but flat LR 2e-4 — gradient divergence at iter 1475 (train loss 0.5 → 2.1 in 25 iters), best saved snapshot at iter 1250 unshippable (eval parse_rate 42.3%).
  • v5 — cosine LR + warmup fixed the divergence. Production candidate.

Caveats

  • Holdout is drawn from the same distribution as training, so 100% on holdout does NOT mean 100% in production. The real test is the iOS-side QualitySpec.all matrix sweep (60 hand-crafted prompts) on a real device.
  • The system prompt and RAG-block format must be byte-equal between training and runtime. Source of truth: prompt_v4.py::SYSTEM_V4 in the slowestlooser-finetune repo.
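
One way to guard the byte-equality requirement is to compare a digest of the UTF-8 bytes on both sides. This is a sketch under stated assumptions: `fingerprint` is a hypothetical helper, and the strings below are placeholders, not the contents of SYSTEM_V4.

```python
import hashlib


def fingerprint(text: str) -> str:
    """SHA-256 hex digest of the exact UTF-8 bytes of a prompt string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Placeholder strings: a single trailing space is enough to change the digest.
training_prompt = "You convert the DB block to JSON."
runtime_prompt = "You convert the DB block to JSON. "

if fingerprint(training_prompt) != fingerprint(runtime_prompt):
    print("prompt drift: training and runtime prompts are not byte-equal")
```

The same comparison could be applied to a RAG block rendered from a fixed reference input on both sides, so that format drift in the block builder is caught as well.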

Usage

The model is used by the SlowestLooser iOS app via mlx-swift-lm. The iOS-side pipeline:

user query → IngredientSplitter → per-part DB lookup (curated catalog ⊕
OFF VectorDB) → UnifiedRAGBlockBuilder → model.generate → JSON parse

Model tree for Juhuu/slowestlooser-v5-4bit: this model is an adapter fine-tuned from Qwen/Qwen3-1.7B.