KIM-Coach v3 — Gymnastics Coaching LLM

A fine-tuned Gemma 3 4B model that generates motor-instruction coaching cues from motion-analysis data.

What Changed in v3

| Version | Training Pairs | Key Improvement | Val Loss |
|---------|----------------|-----------------|----------|
| v1 | 1,538 | First fine-tune, assessment-style templates | 0.140 → 0.070 |
| v2 | 1,538 | Motor instruction templates (action verbs, feel cues) | 0.110 → 0.070 |
| v3 | 3,798 | Directional error taxonomy + output diversity | 0.110 → 0.067 |

v3 Improvements

  • Directional error taxonomy: 10 categories (insufficient_extension, over_flexion, timing_early/late, balance_loss, etc.) grounded in LucidAction penalties, USAG deductions, and real Habitude app data
  • 2-3 output variations per input: same divergence pattern gets different coaching language, breaking template memorization
  • 50 gold-standard cues as style anchors, hand-written according to the coaching framework
  • Novel cue generation: model composes cues it was never explicitly trained on
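As an illustration, the taxonomy can be modeled as a lookup from error category to candidate cue phrasings. This is a hedged sketch: only the five category names listed above appear on this card; the cue strings (and the idea of storing multiple variations per category, mirroring the 2-3 output variations per input) are invented for illustration.

```python
# Hypothetical sketch of the directional error taxonomy described above.
# Category names come from this card; all cue strings are invented examples.
ERROR_TAXONOMY = {
    "insufficient_extension": ["drive long through the hips", "finish tall"],
    "over_flexion": ["stay hollow, not piked", "brace the midline"],
    "timing_early": ["wait for the block before you snap"],
    "timing_late": ["snap sooner off the table"],
    "balance_loss": ["stack shoulders over hips on landing"],
}

def cues_for(category: str) -> list[str]:
    """Return candidate coaching cues for a detected error category."""
    return ERROR_TAXONOMY.get(category, [])
```

For example, `cues_for("timing_late")` returns the late-timing cue list, and an unknown category returns an empty list.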

Evidence of Generalization (v3)

v1/v2 produced verbatim copies of training data. v3 generates novel coaching cues:

  • Input: torso divergence during takeoff
  • Expected: "hips level, midline braced"
  • Predicted: "hips over hands, arched bridge during the takeoff — you should feel hips pushing forward"
  • Both are valid motor instructions β€” the model learned the pattern, not the template

Pipeline
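The card does not spell out the pipeline, but the components it names (KIM VQ-VAE codec, directional error taxonomy, fine-tuned Gemma) suggest a flow like the sketch below. Every function name and string format here is an assumption standing in for the real KIM code, not the actual implementation.

```python
def encode_motion(frames: list) -> str:
    """Stand-in for the KIM VQ-VAE codec: motion frames -> divergence pattern.
    The real codec emits learned codes; this placeholder string is hypothetical."""
    return "torso_divergence:takeoff"

def classify_error(pattern: str) -> str:
    """Stand-in for the directional error taxonomy lookup (names from this card)."""
    return "timing_late" if "takeoff" in pattern else "balance_loss"

def generate_cue(category: str) -> str:
    """Stand-in for the fine-tuned Gemma 3 4B generation step."""
    return f"[coaching cue for {category}]"

def coach(frames: list) -> str:
    """End-to-end sketch: motion data -> divergence -> error category -> cue."""
    return generate_cue(classify_error(encode_motion(frames)))
```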

Input Format
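The original input specification is not included in this card. Given the VQ-VAE codec and the error taxonomy, a plausible input record might look like the following; every field name is a hypothetical placeholder, not the documented schema.

```python
import json

# Hypothetical input record; field names are invented for illustration.
example_input = {
    "divergence_pattern": "torso_divergence",  # from the KIM VQ-VAE codec
    "phase": "takeoff",
    "error_category": "timing_late",           # from the directional error taxonomy
}
prompt = json.dumps(example_input)
```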

Output Format
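The output specification is likewise not included. Based on the example cues shown on this card, the model returns a single free-text motor cue, optionally followed by a "feel" cue. The helper below splits the two parts; treating "you should feel" as a delimiter is an assumption drawn from the single example above.

```python
def split_cue(cue: str) -> tuple[str, str]:
    """Split a generated cue into (action cue, feel cue).

    The 'you should feel' marker comes from the example on this card;
    using it as a delimiter is an assumption, not a documented format.
    """
    marker = "you should feel"
    if marker in cue:
        action, feel = cue.split(marker, 1)
        return action.strip(" -,"), feel.strip()
    return cue.strip(), ""
```

A cue with no feel component, such as "hips level, midline braced", comes back unchanged with an empty feel cue.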

Training Details

  • Base model: google/gemma-3-4b-it (4-bit quantized via Unsloth)
  • Method: LoRA (r=16, alpha=16, dropout=0)
  • Data: 3,427 train / 371 val pairs from KIM VQ-VAE codec + directional error taxonomy
  • Training: 3 epochs, batch size 8, lr 2e-4, A100 GPU, ~77 minutes
  • Best val loss: 0.067 at step 1200
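The reported best-loss step is consistent with the stated data size and batch size. Quick arithmetic, assuming batch size 8 per optimizer step with no gradient accumulation (an assumption; accumulation settings are not stated on this card):

```python
import math

train_pairs = 3427   # train split from Training Details
batch_size = 8
epochs = 3

steps_per_epoch = math.ceil(train_pairs / batch_size)  # 429
total_steps = steps_per_epoch * epochs                 # 1287
# Step 1200 (best val loss) falls inside the third and final epoch.
best_step_epoch = 1200 // steps_per_epoch + 1          # 3
```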

Part of KIM (Kinematic Instruction Model)
