antking1 committed on
Commit efe6eae · verified · 1 Parent(s): af90a04

Update model card for v3

Files changed (1):
  1. README.md +38 -43
README.md CHANGED
@@ -1,62 +1,57 @@
  ---
- license: apache-2.0
- base_model: google/gemma-3-4b-it
  tags:
- - coaching
  - gymnastics
- - movement-analysis
  - lora
- - unsloth
- - gemma
- datasets:
- - custom
- language:
- - en
- pipeline_tag: text-generation
  ---
-
- # KIM-coach: Gymnastics Coaching Language Model
-
- Fine-tuned **Gemma 3 4B** for generating coaching cues from movement divergence data.
-
- ## Model Details
-
- - **Base model**: google/gemma-3-4b-it (4-bit quantized via Unsloth)
- - **Fine-tuning**: LoRA (r=16), merged into base weights
- - **Training data**: 1,538 synthetic coaching pairs across 20 FineGym gymnastics classes
- - **Training**: 3 epochs, 525 steps, ~2h43m on A100
- - **Best val loss**: 0.140 (step 200)
-
- ## Part of the KIM Pipeline
-
- This model is the coaching language component of the **Kinematic Instruction Model (KIM)** pipeline:
-
- 1. **Tokenize** — VQ-VAE encodes skeletal motion into discrete tokens ([antking1/KIM](https://huggingface.co/antking1/KIM))
- 2. **Compare** — Token sequences are aligned and divergence is computed per body part
- 3. **Coach** — This model translates divergence data into natural language coaching cues
-
- ## Input Format
-
- ```
- ### Instruction:
- You are a gymnastics coach. Analyze the movement comparison data and provide specific coaching feedback.
-
- ### Input:
- Element: vault_handspring
- Overall divergence: 0.176
- Per-part divergence: torso=0.223, arms=0.195, head=0.149, legs=0.140
- Worst segments: legs frames 11-12 (0.905), head frames 3-4 (0.846)
-
- ### Response:
- ```
-
- ## Limitations
-
- - Coaching cues are currently **assessments** ("your arms need correction") rather than **motor instructions** ("squeeze your elbows to your ribs")
- - Element IDs may be hallucinated
- - Trained on synthetic data generated by the same pipeline — circular validation risk
- - V1 proof-of-concept; not yet validated by qualified coaches
-
- ## Citation
-
- Part of the Motis Research project — [motis.pro](https://motis.pro)
  ---
  tags:
  - gymnastics
+ - coaching
+ - motion-analysis
+ - gemma3
  - lora
+ - kim
+ license: apache-2.0
+ base_model: google/gemma-3-4b-it
  ---

+ # KIM-Coach v3 — Gymnastics Coaching LLM

+ Fine-tuned Gemma 3 4B for generating motor-instruction coaching cues from motion analysis data.

+ ## What Changed in v3

+ | Version | Training Pairs | Key Improvement | Val Loss |
+ |---------|----------------|-----------------|----------|
+ | v1 | 1,538 | First fine-tune, assessment-style templates | 0.140 → 0.070 |
+ | v2 | 1,538 | Motor-instruction templates (action verbs, feel cues) | 0.110 → 0.070 |
+ | **v3** | **3,798** | **Directional error taxonomy + output diversity** | **0.110 → 0.067** |

+ ### v3 Improvements
+ - **Directional error taxonomy**: 10 categories (insufficient_extension, over_flexion, timing_early/late, balance_loss, etc.) grounded in LucidAction penalties, USAG deductions, and real Habitude app data
+ - **2-3 output variations per input**: the same divergence pattern gets different coaching language, breaking template memorization
+ - **50 gold-standard cues** used as style anchors (hand-written following the coaching framework)
+ - **Novel cue generation**: the model composes cues it was never explicitly trained on
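As a rough illustration of how such a taxonomy can be consulted (the part/direction keys, the mapping table, and the `label_error` helper are invented for this sketch; only the category names come from the card):

```python
# Hypothetical sketch of a directional error taxonomy lookup.
# Only the category names (insufficient_extension, over_flexion,
# timing_early/late, balance_loss) come from the model card; the keys
# and mapping below are invented for illustration.

TAXONOMY = {
    ("knee", "under"): "insufficient_extension",  # knees not straightening enough
    ("knee", "over"): "over_flexion",             # knees tucking too deep
    ("phase", "early"): "timing_early",           # athlete moves before reference
    ("phase", "late"): "timing_late",             # athlete moves after reference
    ("com", "off"): "balance_loss",               # center of mass drifts off base
}

def label_error(part: str, direction: str) -> str:
    """Map a (body part, divergence direction) pair to an error category."""
    return TAXONOMY.get((part, direction), "unclassified")

print(label_error("knee", "under"))  # insufficient_extension
```

In a real taxonomy the direction would presumably be derived from signed divergence values rather than passed in as a string.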

+ ### Evidence of Generalization (v3)
+ v1/v2 produced verbatim copies of training data. v3 generates **novel coaching cues**:
+ - Input: torso divergence during takeoff
+ - Expected: "hips level, midline braced"
+ - Predicted: "hips over hands, arched bridge during the takeoff — you should feel hips pushing forward"
+ - Both are valid motor instructions — the model learned the *pattern*, not the template

+ ## Pipeline
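The compare step described in this card (token sequences aligned, divergence computed per body part) can be illustrated with a minimal sketch. It assumes equal-length, pre-aligned sequences and a plain mismatch-rate metric; the pipeline's actual alignment and distance computation are not shown here.

```python
# Sketch only: assumes equal-length, already-aligned VQ-VAE token
# sequences and a simple mismatch rate; the real pipeline may align
# and weight positions differently.

def part_divergence(ref_tokens: list[int], athlete_tokens: list[int]) -> float:
    """Fraction of positions where the athlete's motion token differs."""
    assert len(ref_tokens) == len(athlete_tokens), "sequences must be aligned"
    mismatches = sum(r != a for r, a in zip(ref_tokens, athlete_tokens))
    return mismatches / len(ref_tokens)

# Toy token sequences for one body part (e.g. torso), 8 frames:
ref     = [3, 3, 7, 7, 12, 12, 9, 9]
athlete = [3, 5, 7, 7, 12, 14, 14, 9]
print(round(part_divergence(ref, athlete), 3))  # 0.375
```

Running this per body part would yield the per-part numbers in the input format below (torso=0.223, arms=0.195, ...).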

+ ## Input Format
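Assuming v3 keeps the Alpaca-style Instruction/Input/Response template used in v1 (an assumption; the v3 section body is not visible in this diff), a prompt can be assembled like this:

```python
# Sketch assuming the v1 prompt template still applies in v3.
# The instruction text and field names are copied from the v1 card.

INSTRUCTION = ("You are a gymnastics coach. Analyze the movement comparison "
               "data and provide specific coaching feedback.")

def build_prompt(element: str, overall: float, per_part: dict[str, float],
                 worst: str) -> str:
    """Assemble the Instruction/Input/Response prompt string."""
    parts = ", ".join(f"{k}={v:.3f}" for k, v in per_part.items())
    return (f"### Instruction:\n{INSTRUCTION}\n\n"
            f"### Input:\nElement: {element}\n"
            f"Overall divergence: {overall:.3f}\n"
            f"Per-part divergence: {parts}\n"
            f"Worst segments: {worst}\n\n"
            f"### Response:\n")

prompt = build_prompt(
    "vault_handspring", 0.176,
    {"torso": 0.223, "arms": 0.195, "head": 0.149, "legs": 0.140},
    "legs frames 11-12 (0.905), head frames 3-4 (0.846)",
)
```

The resulting string would be tokenized and passed to the model, which completes the text after `### Response:`.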
 
43
 
 
44
 
45
+ ## Output Format
 
 
 
46
 
 
47
 
48
+ ## Training Details
49
+ - **Base model**: google/gemma-3-4b-it (4-bit quantized via Unsloth)
50
+ - **Method**: LoRA (r=16, alpha=16, dropout=0)
51
+ - **Data**: 3,427 train / 371 val pairs from KIM VQ-VAE codec + directional error taxonomy
52
+ - **Training**: 3 epochs, batch size 8, lr 2e-4, A100 GPU, ~77 minutes
53
+ - **Best val loss**: 0.067 at step 1200
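The reported numbers are internally consistent; a quick check, assuming no gradient accumulation (which the card does not state):

```python
import math

# Figures taken from the Training Details above.
train_pairs = 3427
batch_size = 8
epochs = 3

steps_per_epoch = math.ceil(train_pairs / batch_size)  # 429
total_steps = steps_per_epoch * epochs                 # 1287

# Best val loss at step 1200 therefore falls inside the third epoch.
print(steps_per_epoch, total_steps)  # 429 1287
```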
+
+ ## Part of KIM (Kinematic Instruction Model)
+ - Codec: [antking1/KIM](https://huggingface.co/antking1/KIM)
+ - Coach: [antking1/KIM-coach](https://huggingface.co/antking1/KIM-coach) (this model)