SlowestLooser FoodJSON+Activity v2

HuggingFace: https://huggingface.co/Juhuu/slowestlooser-v2-4bit
Published: 2026-05-13
Status: Active. Supersedes v1 (food-only).

What this model does

QLoRA fine-tune of mlx-community/Qwen3-1.7B-4bit that handles both the SlowestLooser food-text path and the activity-text path. The iOS app routes by system prompt, and the model emits the matching JSON schema.

Food queries (system: foodAnalysisSystemNoTools): CompositeDTO JSON with items[]
Activity queries (system: activityAnalysisSystem with embedded 40-entry MET catalog): a single object {"name", "minutes", "calories"}; a multi-activity query is summed into one row
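
As a rough illustration of the activity schema, the Python sketch below checks that a response is a single object with exactly the three keys named above. The helper name `parse_activity_response`, the value-type checks, and the sample payload are assumptions for illustration; the card itself only specifies the key names.

```python
import json

ACTIVITY_KEYS = {"name", "minutes", "calories"}  # key names from the model card


def parse_activity_response(raw: str) -> dict:
    """Parse a model response and check that it is one activity object.

    Hypothetical helper: the type checks below are an assumption,
    the card only lists the three key names.
    """
    obj = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(obj, dict):
        raise ValueError("expected a single JSON object")
    if set(obj) != ACTIVITY_KEYS:
        raise ValueError(f"unexpected keys: {sorted(obj)}")
    if not isinstance(obj["name"], str):
        raise ValueError("name must be a string")
    for key in ("minutes", "calories"):
        value = obj[key]
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{key} must be a number")
    return obj


sample = '{"name": "Joggen", "minutes": 30, "calories": 245}'
print(parse_activity_response(sample))
```

A check of this kind is one way to arrive at a "Valid JSON" count; the actual eval harness may apply different criteria.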

Base model

mlx-community/Qwen3-1.7B-4bit (1.72B params, pre-quantized 4-bit MXFP)

Training (QLoRA via mlx-lm on M4 Pro)

Setting	Value
Method	QLoRA
Optimizer	AdamW
Iterations	1500 (vs v1's 1000)
Batch size	4
Learning rate	2e-4
LoRA rank	32
LoRA target layers	last 16 attention layers
Max seq length	2048
Trainable params	4.98M (0.29% of base)
Wall time	135 min
Peak memory	16.4 GB

Loss trajectory

Iter	Val loss	Train loss
1	2.892	—
500	0.068	0.073
1000	0.061	0.056
1500	0.046	0.047

Val loss was still improving at iter 1500 (0.046, versus 0.061 at iter 1000), so extending from v1's 1000 iterations was worth it. v1 plateaued at val=0.053 (iter 900).

Dataset

4802 records (4321 train + 481 holdout). The slice counts below cover the 4321 training records.

Slice	Records	Share
Food	3644	84%
Activity	677	16%

Food dataset (unchanged from v1):

38 curated DB entries oversampled to 20 variants each
400 simple-label OFF entries × 4 variants
30% multi-ingredient queries; 60% include ≥1 curated seed

Activity dataset (new in v2):

40-entry curated MET catalog (daily life → endurance sports; broad → specific)
15 variants per activity via GPT-4o-mini synthesis (durations, intensity modifiers, Swiss German + English synonyms)
10% multi-activity composites
Calorie targets computed deterministically: MET × 70kg × hours

See data/activity_catalog.json for the full 40-entry MET catalog used in both training data and the iOS system prompt.
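
The deterministic calorie target can be written as a one-line function. This is a minimal sketch of the formula stated above (MET × 70 kg × hours); the function and constant names are illustrative, and the MET values in the example calls are the ones implied by the eval tables in this card.

```python
REFERENCE_WEIGHT_KG = 70.0  # fixed body weight used for the training targets


def met_calories(met: float, minutes: float,
                 weight_kg: float = REFERENCE_WEIGHT_KG) -> float:
    """Calories = MET x body weight (kg) x duration (hours)."""
    return met * weight_kg * (minutes / 60.0)


print(met_calories(7.0, 30))   # Joggen, 30 min -> 245.0
print(met_calories(5.5, 120))  # Wandern, 2 h   -> 770.0
```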

Eval: Matrix sweep against iOS QualitySpec (real production prompts)

Real prompts mirrored from SlowestLooser/Services/QualitySpecs.swift::QualitySpec.all.

Food (39 prompts)

	Base Qwen3 1.7B	v1 FT	v2 FT
Valid JSON	1/39 (3%)	39/39 (100%)	39/39 (100%)
Avg latency	3.10s	1.69s	~2.4s

Activity (19 prompts; one empty-string prompt skipped)

	v2 FT
Valid JSON	19/19 (100%)
Avg latency	~0.7s (no DB prefetch needed)

Single-activity calorie sanity

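The v2 model emits deterministic calories that match the MET formula on 7 of the 9 rows below; the other two rows (45 Minuten Schwimmen, 2h Wandern) come out 8 to 9% low: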

Prompt	Output	Expected	Status
Joggen / 30 min Joggen	30min / 245kcal	7.0 × 70 × 0.5 = 245	exact
Velofahren	30min / 210kcal	6.0 × 70 × 0.5 = 210	exact
Schwimmen	30min / 280kcal	8.0 × 70 × 0.5 = 280	exact
HIIT	30min / 350kcal	10.0 × 70 × 0.5 = 350	exact
Sprinten	30min / 420kcal	12.0 × 70 × 0.5 = 420	exact
1 Stunde Velo	60min / 420kcal	6.0 × 70 × 1.0 = 420	exact
45 Minuten Schwimmen	45min / 385kcal	8.0 × 70 × 0.75 = 420	~ 8% under
2h Wandern	120min / 700kcal	5.5 × 70 × 2 = 770	~ 9% under
Spazieren	30min / 122kcal	3.5 × 70 × 0.5 = 122.5	exact (rounded down)
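
The status column can be reproduced with a few lines of Python. The sketch below recomputes the expected value for each row from the MET values shown in the table and prints the signed deviation of the model output; the names `ROWS` and `deviation_pct` are illustrative only.

```python
WEIGHT_KG = 70.0

# (prompt, minutes, model kcal, catalog MET), copied from the table above
ROWS = [
    ("Joggen", 30, 245, 7.0),
    ("Velofahren", 30, 210, 6.0),
    ("Schwimmen", 30, 280, 8.0),
    ("HIIT", 30, 350, 10.0),
    ("Sprinten", 30, 420, 12.0),
    ("1 Stunde Velo", 60, 420, 6.0),
    ("45 Minuten Schwimmen", 45, 385, 8.0),
    ("2h Wandern", 120, 700, 5.5),
    ("Spazieren", 30, 122, 3.5),
]


def deviation_pct(minutes: float, model_kcal: float, met: float) -> float:
    """Signed deviation of the model output from MET x 70 kg x hours, in percent."""
    expected = met * WEIGHT_KG * minutes / 60.0
    return (model_kcal - expected) / expected * 100.0


for prompt, minutes, kcal, met in ROWS:
    print(f"{prompt}: {deviation_pct(minutes, kcal, met):+.1f}%")
```

With these inputs the two non-exact rows come out at roughly -8.3% (45 Minuten Schwimmen) and -9.1% (2h Wandern), and Spazieren differs only by rounding.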

Multi-activity calorie sums

Less precise: the model interpolates between known activities.

Prompt	v2 Output	Catalog-strict sum
Joggen und Schwimmen	60min / 595kcal	245 + 280 = 525
30 min Velo dann 15 min Joggen	60min / 560kcal	210 + 122.5 = 332.5

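The first case is about 13% over the catalog-strict sum; the second is about 68% over and reports 60 min for a 45 min prompt. Acceptable for ship; could tighten in v3 with more multi-activity training samples.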
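
For reference, the catalog-strict sums in the table can be derived as below. This is a sketch under two assumptions: the MET values are the ones implied by the single-activity table, and an activity without an explicit duration defaults to 30 minutes (which is what the 245 + 280 sum implies). All names are illustrative.

```python
WEIGHT_KG = 70.0
MET = {"Joggen": 7.0, "Schwimmen": 8.0, "Velo": 6.0}  # implied by the tables in this card


def strict_sum(parts):
    """parts: list of (activity, minutes). Returns (total minutes, total kcal)."""
    total_min = sum(minutes for _, minutes in parts)
    total_kcal = sum(MET[name] * WEIGHT_KG * minutes / 60.0 for name, minutes in parts)
    return total_min, total_kcal


def over_pct(model_kcal: float, reference_kcal: float) -> float:
    """How far the model output lies above the reference, in percent."""
    return (model_kcal - reference_kcal) / reference_kcal * 100.0


cases = [
    ([("Joggen", 30), ("Schwimmen", 30)], 595),  # "Joggen und Schwimmen"
    ([("Velo", 30), ("Joggen", 15)], 560),       # "30 min Velo dann 15 min Joggen"
]
for parts, model_kcal in cases:
    minutes, kcal = strict_sum(parts)
    print(minutes, kcal, f"{over_pct(model_kcal, kcal):+.0f}%")
```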

Drift comparison (the v1 production motivation)

Activity calorie variance for "30 min Joggen" in v1.30.x production releases:

Release	Output	Source
v1.30.1	168	Base Qwen
v1.30.3	147	OpenAI (consistent across runs)
v1.30.4	140	Base Qwen
v1.30.5	110	Base Qwen — drift continued
v2 FT	245	Deterministic

v2 emits the canonical MET-formula answer (245) on every run.

Known issues

g_wasser filtering (carried over from v1): the water entry with 0 kcal gets filtered by load_vector_index, so the query "Wasser" matches "Birken Wasser" from BM25 instead. Relax the filter to cals < 0 in v3.
dl unit not parsed (carried over from v1): "1.5dl Milch" leaves the quantity embedded in the name. Add dl/cl to QuantityParser.
Multi-activity calorie sum is approximate (see table above): about 13% over on "Joggen und Schwimmen" and about 68% over on "30 min Velo dann 15 min Joggen", which also reports 60 min for a 45 min prompt. Acceptable for ship but could tighten with more multi-activity training data.
Slight under-estimates on two duration prompts: for 2h Wandern the model used MET=5 instead of catalog 5.5 (700 vs 770 kcal), and 45 Minuten Schwimmen came out at 385 vs 420 kcal.
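
To make the dl/cl issue concrete, here is a Python sketch of the parsing step. It is not the app's QuantityParser; the regex, the unit table, and the choice to normalise to millilitres are all assumptions made for illustration.

```python
import re

# Millilitres per unit; dl and cl are the additions proposed above.
UNIT_TO_ML = {"ml": 1.0, "cl": 10.0, "dl": 100.0, "l": 1000.0}

# Number (dot or comma decimal), optional space, unit, then the food name.
QUANTITY_RE = re.compile(
    r"^\s*(\d+(?:[.,]\d+)?)\s*(ml|cl|dl|l)\b\s*(.+)$",
    re.IGNORECASE,
)


def parse_quantity(text: str):
    """Split e.g. '1.5dl Milch' into (150.0, 'Milch').

    Returns (None, text) when no volume prefix is recognised.
    """
    match = QUANTITY_RE.match(text)
    if match is None:
        return None, text.strip()
    amount = float(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    return amount * UNIT_TO_ML[unit], match.group(3).strip()


print(parse_quantity("1.5dl Milch"))
print(parse_quantity("Wasser"))
```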

Notes on design

Design B chosen for activity MET routing: 40-entry catalog embedded in system prompt (compact Name=MET comma-separated, ~250 tokens). Full RAG (Design C) was avoided because a Swift-side activity matcher would have added infrastructure for a 40-entry static dataset.
MET catalog Python ↔ Swift byte-equal: prompt.py::SYSTEM_ACTIVITY and PromptTemplates.swift::activityAnalysisSystem are intentionally byte-equal. Training prompt = production prompt = no drift.
One combined model (vs two separate adapters): saves a download + a model load on iOS. The 1.7B has capacity for both schemas.
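
The compact catalog format and the byte-equality requirement can be sketched as follows. The exact separator, the number formatting, the "Activities: " prefix, and the hash-based comparison are assumptions; the card only states that the prompt uses comma-separated Name=MET pairs and that the Python and Swift strings are byte-equal. The Swift-side string here is a stand-in literal.

```python
import hashlib

# Three illustrative entries; MET values are those implied by the eval tables.
CATALOG = [("Joggen", 7.0), ("Velofahren", 6.0), ("Schwimmen", 8.0)]


def compact_catalog(entries) -> str:
    """Render entries as 'Name=MET, Name=MET, ...' (format is an assumption)."""
    return ", ".join(f"{name}={met:g}" for name, met in entries)


def prompt_digest(prompt: str) -> str:
    """SHA-256 over the UTF-8 bytes, usable to compare prompts across code bases."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


python_prompt = "Activities: " + compact_catalog(CATALOG)
# Stand-in for the string that would be extracted from the Swift source.
swift_prompt = "Activities: Joggen=7, Velofahren=6, Schwimmen=8"

# Any difference in whitespace or number formatting changes the digest.
assert prompt_digest(python_prompt) == prompt_digest(swift_prompt)
print(compact_catalog(CATALOG))
```

A digest comparison like this could run in CI so that an edit to only one of the two prompt sources is caught before training and production diverge.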

Commits

Activity training + model card scaffolding: b4ee122
40-entry catalog + Design B compact MET prompt: 8011d10 (Python) + a6a4eef (iOS)
v2 dataset + training: (commit hash after this docs commit)

Why v1 was retired

v1 was food-only. Users were typing "30 min joggen" and getting drifting calorie values (168/140/110 across releases). v2 trains one combined model on both schemas with deterministic calorie math baked in.
