# Pose Encoder

Trains a shared adapter on top of a frozen [PoseFormerV2](https://github.com/QitaoZhao/PoseFormerV2) encoder for two tasks:
1. **Metric rating** – `Good` / `Okay` / `Needs work` per exercise metric
2. **Exercise classification** – which exercise is being performed

The trained adapter weights feed into the multimodal MotiVate pipeline.

## Pipeline

```
MediaPipe 2D
→ H36M remap + pad/crop + normalize
→ PoseFormerV2 (frozen, 27×544)
→ Shared Adapter (trainable, 27×256)
→ mean pool
→ Rating Head + Exercise Head

Loss = rating_loss + exercise_loss_weight × exercise_loss
```
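The adapter-plus-heads stage can be sketched in PyTorch. This is a minimal illustration, not the repo's actual code: the class name, the single-linear adapter, GELU, and the metric/class counts are assumptions; only the 544→256 width, the 27-frame mean pool, the two heads, `IGNORE_INDEX = -100`, and the weighted loss come from this README.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_METRICS = 4      # assumption: number of per-exercise metrics
NUM_RATINGS = 3      # Good(0) / Okay(1) / Needs work(2)
NUM_EXERCISES = 5    # assumption: size of EXERCISE_TO_ID
IGNORE_INDEX = -100  # unevaluable metrics are masked out of the loss

class SharedAdapterHeads(nn.Module):
    """Hypothetical shape of the trainable part on top of frozen PoseFormerV2."""
    def __init__(self, in_dim=544, adapter_dim=256):
        super().__init__()
        # trainable adapter over the frozen 27×544 encoder features
        self.adapter = nn.Sequential(nn.Linear(in_dim, adapter_dim), nn.GELU())
        self.rating_head = nn.Linear(adapter_dim, NUM_METRICS * NUM_RATINGS)
        self.exercise_head = nn.Linear(adapter_dim, NUM_EXERCISES)

    def forward(self, feats):              # feats: (B, 27, 544)
        h = self.adapter(feats)            # (B, 27, 256)
        pooled = h.mean(dim=1)             # mean pool over 27 frames -> (B, 256)
        rating_logits = self.rating_head(pooled).view(-1, NUM_METRICS, NUM_RATINGS)
        return rating_logits, self.exercise_head(pooled)

def combined_loss(rating_logits, exercise_logits, rating_y, exercise_y,
                  exercise_loss_weight=1.0):
    # per-metric cross-entropy; ignore_index drops metrics that could not be rated
    rating_loss = F.cross_entropy(rating_logits.reshape(-1, NUM_RATINGS),
                                  rating_y.reshape(-1),
                                  ignore_index=IGNORE_INDEX)
    exercise_loss = F.cross_entropy(exercise_logits, exercise_y)
    return rating_loss + exercise_loss_weight * exercise_loss
```

The key design point the diagram implies is that both heads share one pooled adapter embedding, so gradients from both tasks shape the same representation.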

## Data Sources

| Source | Path | Used for |
|---|---|---|
| `training.csv` / `validation.csv` | `train/unimodal/` | **Train/val split only** – provides `(dataset, clip_id)` pairs. No labels are read from these CSVs. |
| `pose_data.npz` | `processed/<dataset>/<clip_id>/mediapipe_result/` | **Model input** – MediaPipe 2D pose landmarks per frame. |
| `raw_gt.csv` | `processed/<dataset>/<clip_id>/pose/` | **Rating labels** – raw pose measurements. Fed to `compute_metrics()` at load time to produce `Good` / `Okay` / `Needs work` per metric. |
| `metadata.json` | `processed/<dataset>/` | **Exercise labels** – maps each `clip_id` to its exercise name (fallback: infer from `clip_id` prefix). |

Labels are **not** pre-computed in the CSVs. They are derived on the fly:
- **Rating targets**: `raw_gt.csv` → `compute_metrics(df, exercise)` → `evaluate_rating()` → one of `Good` (0) / `Okay` (1) / `Needs work` (2) per metric. Metrics that can't be evaluated get `IGNORE_INDEX = -100` and are excluded from the loss.
- **Exercise targets**: looked up from `metadata.json` and mapped to a class index via `EXERCISE_TO_ID`.
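The last step of the rating-target derivation could look roughly like this. The helper and dict below are illustrative, not the repo's actual code; only the three label names, the 0/1/2 indices, and `IGNORE_INDEX = -100` come from this README.

```python
# Hypothetical mapping from evaluate_rating() output strings to class indices.
RATING_TO_ID = {"Good": 0, "Okay": 1, "Needs work": 2}
IGNORE_INDEX = -100  # metrics that cannot be evaluated are excluded from the loss

def ratings_to_targets(ratings):
    """ratings: per-metric rating strings; None means the metric was unevaluable."""
    return [RATING_TO_ID.get(r, IGNORE_INDEX) for r in ratings]
```

With `-100` as the sentinel, the targets can be fed straight to a cross-entropy loss configured with `ignore_index=-100`, so unevaluable metrics contribute no gradient.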

## Setup

```bash
cd MotiVate/train
bash setup_poseformer.sh          # clone + download checkpoint
bash setup_poseformer.sh --force  # re-clone
```

## Training

```bash
uv run python train/pose_encoder/train_shared_adapter.py --config train/pose_encoder/config.json
```
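The referenced `config.json` is not shown in this README. A hypothetical minimal shape is sketched below, purely for illustration: only `exercise_loss_weight` and the checkpoint directory are named anywhere above; every other key and value here is an assumption about what a training config like this typically contains.

```json
{
  "exercise_loss_weight": 0.5,
  "epochs": 50,
  "batch_size": 32,
  "learning_rate": 1e-4,
  "checkpoint_dir": "checkpoints/shared_adapter"
}
```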

## Outputs

Saved to `checkpoints/shared_adapter/`:
- `best_shared_adapter.pt` – adapter weights for the multimodal pipeline
- `best_model.pt` / `last_model.pt` – full checkpoints

`val_score = 0.5 × (rating_acc + exercise_acc)` – used for scheduling, early stopping, and best checkpoint selection.
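That selection criterion is simple enough to state in code. The function matches the formula above; the selection loop and the example accuracies are hypothetical.

```python
def val_score(rating_acc: float, exercise_acc: float) -> float:
    # equal-weight average of the two task accuracies
    return 0.5 * (rating_acc + exercise_acc)

# hypothetical selection loop: track the best-scoring epoch
best = float("-inf")
for rating_acc, exercise_acc in [(0.80, 0.90), (0.85, 0.88)]:  # per-epoch val metrics
    score = val_score(rating_acc, exercise_acc)
    if score > best:
        best = score  # in training, this is where best_model.pt would be saved
```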