Update model card for v3
README.md
CHANGED
|
@@ -1,62 +1,57 @@
|
|
| 1 |
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
base_model: google/gemma-3-4b-it
|
| 4 |
tags:
|
| 5 |
-
- coaching
|
| 6 |
- gymnastics
|
| 7 |
-
-
|
| 8 |
- lora
|
| 9 |
-
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
- custom
|
| 13 |
-
language:
|
| 14 |
-
- en
|
| 15 |
-
pipeline_tag: text-generation
|
| 16 |
---
|
| 17 |
|
| 18 |
-
# KIM-
|
| 19 |
|
| 20 |
-
Fine-tuned
|
| 21 |
|
| 22 |
-
##
|
| 23 |
|
| 24 |
-
|
| 25 |
-
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
## Part of the KIM Pipeline
|
| 31 |
|
| 32 |
-
|
| 33 |
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
|
| 38 |
-
##
|
| 39 |
|
| 40 |
-
```
|
| 41 |
-
### Instruction:
|
| 42 |
-
You are a gymnastics coach. Analyze the movement comparison data and provide specific coaching feedback.
|
| 43 |
|
| 44 |
-
### Input:
|
| 45 |
-
Element: vault_handspring
|
| 46 |
-
Overall divergence: 0.176
|
| 47 |
-
Per-part divergence: torso=0.223, arms=0.195, head=0.149, legs=0.140
|
| 48 |
-
Worst segments: legs frames 11-12 (0.905), head frames 3-4 (0.846)
|
| 49 |
|
| 50 |
-
##
|
| 51 |
-
```
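A minimal sketch of how the prompt layout in the removed block above could be assembled from divergence scores. The function name `build_prompt` and its argument shapes are hypothetical and not taken from the KIM codebase. The sketch stops at the input section because the heading that follows it is truncated in this diff.

```python
from typing import Dict, List, Tuple

# Instruction text copied from the removed README block.
INSTRUCTION = (
    "You are a gymnastics coach. Analyze the movement comparison data "
    "and provide specific coaching feedback."
)


def build_prompt(
    element: str,
    overall: float,
    per_part: Dict[str, float],
    worst_segments: List[Tuple[str, int, int, float]],
) -> str:
    """Format divergence data into the Instruction/Input layout (hypothetical helper)."""
    # Per-part scores keep the dict's insertion order.
    parts = ", ".join(f"{name}={value:.3f}" for name, value in per_part.items())
    # Each worst segment is (body part, first frame, last frame, divergence).
    worst = ", ".join(
        f"{part} frames {start}-{end} ({score:.3f})"
        for part, start, end, score in worst_segments
    )
    return "\n".join(
        [
            "### Instruction:",
            INSTRUCTION,
            "",
            "### Input:",
            f"Element: {element}",
            f"Overall divergence: {overall:.3f}",
            f"Per-part divergence: {parts}",
            f"Worst segments: {worst}",
        ]
    )


if __name__ == "__main__":
    print(
        build_prompt(
            "vault_handspring",
            0.176,
            {"torso": 0.223, "arms": 0.195, "head": 0.149, "legs": 0.140},
            [("legs", 11, 12, 0.905), ("head", 3, 4, 0.846)],
        )
    )
```

Run with the values from the README example, this reproduces the four input lines shown above.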
|
| 52 |
|
| 53 |
-
## Limitations
|
| 54 |
|
| 55 |
-
|
| 56 |
-
- Element IDs may be hallucinated
|
| 57 |
-
- Trained on synthetic data generated by the same pipeline, which creates a circular validation risk
|
| 58 |
-
- V1 proof-of-concept; not yet validated by qualified coaches
|
| 59 |
|
| 60 |
-
## Citation
|
| 61 |
|
| 62 |
-
|
| 1 |
---
|
| 2 |
tags:
|
|
|
|
| 3 |
- gymnastics
|
| 4 |
+
- coaching
|
| 5 |
+
- motion-analysis
|
| 6 |
+
- gemma3
|
| 7 |
- lora
|
| 8 |
+
- kim
|
| 9 |
+
license: apache-2.0
|
| 10 |
+
base_model: google/gemma-3-4b-it
|
| 11 |
---
|
| 12 |
|
| 13 |
+
# KIM-Coach v3: Gymnastics Coaching LLM
|
| 14 |
|
| 15 |
+
Fine-tuned Gemma 3 4B for generating motor-instruction coaching cues from motion analysis data.
|
| 16 |
|
| 17 |
+
## What Changed in v3
|
| 18 |
|
| 19 |
+
| Version | Training Pairs | Key Improvement | Val Loss |
|
| 20 |
+
|---------|---------------|-----------------|----------|
|
| 21 |
+
| v1 | 1,538 | First fine-tune, assessment-style templates | 0.140 → 0.070 |
|
| 22 |
+
| v2 | 1,538 | Motor instruction templates (action verbs, feel cues) | 0.110 → 0.070 |
|
| 23 |
+
| **v3** | **3,798** | **Directional error taxonomy + output diversity** | **0.110 → 0.067** |
|
|
|
|
|
|
|
| 24 |
|
| 25 |
+
### v3 Improvements
|
| 26 |
+
- **Directional error taxonomy**: 10 categories (insufficient_extension, over_flexion, timing_early/late, balance_loss, etc.) grounded in LucidAction penalties, USAG deductions, and real Habitude app data
|
| 27 |
+
- **2-3 output variations per input**: same divergence pattern gets different coaching language, breaking template memorization
|
| 28 |
+
- **50 gold-standard cues** as style anchors (hand-written by coaching framework)
|
| 29 |
+
- **Novel cue generation**: model composes cues it was never explicitly trained on
|
| 30 |
|
| 31 |
+
### Evidence of Generalization (v3)
|
| 32 |
+
v1/v2 produced verbatim copies of training data. v3 generates **novel coaching cues**:
|
| 33 |
+
- Input: torso divergence during takeoff
|
| 34 |
+
- Expected: "hips level, midline braced"
|
| 35 |
+
- Predicted: "hips over hands, arched bridge during the takeoff - you should feel hips pushing forward"
|
| 36 |
+
- Both are valid motor instructions, which suggests the model learned the *pattern* rather than the template
|
| 37 |
|
| 38 |
+
## Pipeline
|
| 39 |
|
|
|
|
|
|
|
|
|
|
| 40 |
|
| 41 |
|
| 42 |
+
## Input Format
|
|
|
|
| 43 |
|
|
|
|
| 44 |
|
| 45 |
+
## Output Format
|
|
|
|
|
|
|
|
|
|
| 46 |
|
|
|
|
| 47 |
|
| 48 |
+
## Training Details
|
| 49 |
+
- **Base model**: google/gemma-3-4b-it (4-bit quantized via Unsloth)
|
| 50 |
+
- **Method**: LoRA (r=16, alpha=16, dropout=0)
|
| 51 |
+
- **Data**: 3,427 train / 371 val pairs from KIM VQ-VAE codec + directional error taxonomy
|
| 52 |
+
- **Training**: 3 epochs, batch size 8, lr 2e-4, A100 GPU, ~77 minutes
|
| 53 |
+
- **Best val loss**: 0.067 at step 1200
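As a rough consistency check on the figures listed above, the step count can be worked out from the dataset size and batch size. This sketch assumes the batch size of 8 is the effective batch size (no gradient accumulation) and that the final partial batch of each epoch is kept. Neither is stated in the model card. The LoRA scale uses the standard `alpha / r` definition.

```python
import math

# Figures taken from the Training Details list.
TRAIN_PAIRS = 3427
VAL_PAIRS = 371
BATCH_SIZE = 8      # assumption: effective batch size, no gradient accumulation
EPOCHS = 3
BEST_STEP = 1200
LORA_R = 16
LORA_ALPHA = 16


def training_schedule(train_pairs: int, batch_size: int, epochs: int):
    """Return (steps per epoch, total steps), keeping the last partial batch."""
    steps_per_epoch = math.ceil(train_pairs / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


steps_per_epoch, total_steps = training_schedule(TRAIN_PAIRS, BATCH_SIZE, EPOCHS)
total_pairs = TRAIN_PAIRS + VAL_PAIRS        # should match the version table
val_fraction = VAL_PAIRS / total_pairs       # share of pairs held out
best_epoch = BEST_STEP / steps_per_epoch     # fractional epoch of best checkpoint
lora_scale = LORA_ALPHA / LORA_R             # standard LoRA update scaling

print(f"total pairs:      {total_pairs}")
print(f"val fraction:     {val_fraction:.3f}")
print(f"steps per epoch:  {steps_per_epoch}")
print(f"total steps:      {total_steps}")
print(f"best checkpoint:  epoch {best_epoch:.2f}")
print(f"LoRA scale:       {lora_scale}")
```

Under these assumptions the run has 1,287 optimizer steps, so the best checkpoint at step 1200 falls late in the third epoch, and the 3,427 + 371 split matches the 3,798 pairs listed in the version table.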
|
| 54 |
+
|
| 55 |
+
## Part of KIM (Kinematic Instruction Model)
|
| 56 |
+
- Codec: [antking1/KIM](https://huggingface.co/antking1/KIM)
|
| 57 |
+
- Coach: [antking1/KIM-coach](https://huggingface.co/antking1/KIM-coach) (this model)
|