Spaces:
Running on Zero
Running on Zero
| title: Pozify | |
| emoji: "🏋️" | |
| colorFrom: green | |
| colorTo: blue | |
| sdk: gradio | |
| sdk_version: "6.17.3" | |
| python_version: "3.10" | |
| app_file: app.py | |
| fullWidth: true | |
| short_description: Small-model workout form review from short videos. | |
| tags: | |
| - gradio | |
| - computer-vision | |
| - pose-estimation | |
| - fitness | |
| - video-analysis | |
| - llama-cpp | |
| - track:backyard | |
| - sponsor:openbmb | |
| - sponsor:openai | |
| - sponsor:nvidia | |
| - sponsor:modal | |
| - achievement:offgrid | |
| - achievement:welltuned | |
| - achievement:offbrand | |
| - achievement:llama | |
| - achievement:sharing | |
| - achievement:fieldnotes | |
| # Pozify | |
| Pozify is a small-model workout form coach for people who want to train at home but still need clear, | |
| trustworthy feedback. A user uploads a short exercise video, adds basic training context, and gets a | |
| structured form-review report with rep counts, movement notes, annotated video, and a grounded coach | |
| summary. | |
| Pozify is built for users who avoid gyms because they are far away, too crowded, intimidating, or too | |
| expensive to replace with a private trainer. It gives them a second set of eyes without pretending to | |
| be a clinician or a full personal coach. | |
| Pozify is not a medical device. It does not diagnose injuries, claim injury prevention, or replace a | |
| qualified trainer, clinician, or physical therapist. | |
|  | |
| ## What Pozify Delivers | |
| For each uploaded workout clip, Pozify produces: | |
| - detected exercise and confidence | |
| - rep-by-rep analysis | |
| - valid variation markers versus real form issues | |
| - annotated output video and issue clips | |
| - grounded coach summary with fixes and a next-session plan | |
| - verifier-backed confidence and safety notes | |
| The supported exercise labels are `squat`, `push_up`, `shoulder_press`, and `unknown`. The `unknown` | |
| label is intentional: Pozify should reject unsupported or unclear clips instead of forcing every | |
| video into one of the supported movements. | |
|  | |
| ## Product Flow | |
| Pozify is not a generic chatbot and not a vague video captioner. It is a grounded movement-analysis | |
| pipeline: | |
| ```mermaid | |
| flowchart TD | |
| A["video + user profile"] --> B["video QC"] | |
| B --> C["pose extraction"] | |
| C --> D["pose cleaning"] | |
| D --> E["exercise router"] | |
| E --> F["exercise-specific rep counter"] | |
| F --> G["per-rep analysis"] | |
| G --> H["variation detection"] | |
| H --> I["issue markers"] | |
| I --> J["annotated video renderer"] | |
| J --> K["grounded coach summary"] | |
| K --> L["verifier"] | |
| L --> M["final report"] | |
| classDef vision fill:#1e3a5f,stroke:#60a5fa,color:#fff | |
| classDef movement fill:#1e3f2e,stroke:#4ade80,color:#fff | |
| classDef output fill:#3b2f1e,stroke:#fbbf24,color:#fff | |
| class B,C,D vision | |
| class E,F,G,H,I movement | |
| class J,K,L,M output | |
| ``` | |
| The main product decision is simple: structured evidence first, language second. The language model | |
| does not inspect the raw video directly and invent advice. It explains the structured findings that | |
| the pipeline has already produced. | |
| ## Small-Model Strategy | |
| Pozify is built around the belief that small, task-specific models can be the right default for many | |
| real products. A small model does not need to act like a large general assistant if the product gives | |
| it a narrow job, clean inputs, and a verifier. | |
| That is the strategy here: | |
| - use pose and deterministic logic to extract evidence before generation | |
| - train small models on the exact task they must perform | |
| - keep each model boundary inspectable | |
| - retrieve exercise-specific knowledge cards instead of relying on generic memory | |
| - evaluate outputs against product contracts, not only fluency | |
| - fall back conservatively when model output is unavailable or ungrounded | |
| For narrow tasks such as exercise routing or structured JSON-to-coaching-summary generation, a | |
| fine-tuned small model can match or beat a much larger generic model on the product's actual quality | |
| bar. The advantage comes from being optimized for the exact schema, vocabulary, examples, and failure | |
| modes Pozify cares about. | |
| ## Models We Use | |
| Every runtime model used by Pozify fits under the Build Small `32B` parameter cap. | |
| | Component | Model or method | Role | | |
| | --------------------------- | ------------------------------------------------ | --------------------------------------------------------------------- | | |
| | Pose extraction | MediaPipe Pose Landmarker Lite | Extracts body landmarks from video frames. | | |
| | Exercise router | Custom PyTorch BiLSTM over 30-frame pose windows | Classifies `squat`, `push_up`, `shoulder_press`, or `unknown`. | | |
| | Router baseline | scikit-learn `HistGradientBoostingClassifier` | Reference and fallback artifact for router experiments. | | |
| | Rep counting | Exercise-specific state machines | Counts reps from movement signals without an LLM. | | |
| | Issue markers | Transparent rules over per-rep metrics | Separates valid variations from likely form issues. | | |
| | Coach-summary base | `nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16` | Base model for coach-summary LoRA SFT. | | |
| | Coach summary | `build-small-hackathon/pozify-coach-summary1` | Fine-tuned model for grounded structured coaching output. | | |
| | Local/off-grid summary path | Nemotron GGUF through `llama.cpp` | Optional local runtime path for small-model inference. | | |
| | Verifier | Deterministic grounding and safety checks | Blocks unsupported issues, diagnosis language, and ungrounded claims. | | |
| The trained exercise router is intentionally tiny: | |
| | Artifact | Count | | |
| | ------------------------------- | --------: | | |
| | BiLSTM router trainable params | 182,796 | | |
| | Router input features per frame | 237 | | |
| | Window length | 30 frames | | |
| | Output classes | 4 | | |
| ## Why It Fits Build Small | |
| Pozify matches `Backyard AI` because it solves an everyday problem with a personal, practical tool: | |
| affordable at-home workout feedback without needing a gym or private coach. | |
| It also matches the broader Build Small philosophy: | |
| - local-first and modular architecture | |
| - transparent model boundaries instead of one giant opaque model | |
| - per-component models under the `32B` limit | |
| - task-specific fine-tuning instead of relying only on a large general model | |
| - useful day-to-day workflow, not only a technical demo | |
| ## Hackathon Snapshot | |
| - Track: `Backyard AI` | |
| - Submission format: `Gradio Space` | |
| - Core user impact: affordable at-home workout feedback from short videos | |
| - Hugging Face Space: [build-small-hackathon/Pozify](https://huggingface.co/spaces/build-small-hackathon/Pozify) | |
| - Team repo: [tihado/Pozify](https://github.com/tihado/Pozify) | |
| - Router model repo: [build-small-hackathon/pozify-exercise-router](https://huggingface.co/build-small-hackathon/pozify-exercise-router) | |
| - Default coach-summary model: [build-small-hackathon/pozify-coach-summary-nemotron](https://huggingface.co/build-small-hackathon/pozify-coach-summary-nemotron) | |
| - Demo video: [Posify at build-small-hackathon 2026](https://www.youtube.com/watch?v=43Lx-Ia2B5U) | |
| - Social post: [Posify on Linkedin](https://www.linkedin.com/posts/buildsmallhackathon-huggingface-gradio-share-7472342687812186112-36ON/) | |
| - Blog post: [Pozify: Small-Model Workout Form Review from Short Videos](https://tihado.com/blogs/introducing-pozify-build-small-hackathon/) | |
| Primary sponsor tools used in this build: | |
| - `Hugging Face Spaces` for the app surface | |
| - `Hugging Face Inference` and local runtimes for small-model inference | |
| - `Modal` for training, evaluation, merging, and publishing workflows | |
| - `OpenAI Codex` for implementation support and iteration speed | |
| ## How We Used Codex | |
| We used OpenAI Codex as a repo-aware coding agent during the build. The useful pattern was not to | |
| ask for generic code, but to ask Codex to inspect the current project, propose the smallest useful | |
| change, edit the right files, run relevant checks, and summarize what changed. | |
| Codex was most helpful for: | |
| - turning product ideas into scoped engineering tasks | |
| - moving across Python pipeline code, web UI files, scripts, configs, tests, and docs | |
| - researching changing platform constraints and connecting them back to this repo | |
| - reviewing diffs for regressions, missing tests, grounding problems, and unsafe wording | |
| - keeping hackathon notes, training reports, and handoff docs in sync with implementation | |
| The detailed team note is in [docs/50-codex-development-workflow.md](docs/50-codex-development-workflow.md). | |
| For the official product docs, start with [OpenAI Codex docs](https://developers.openai.com/codex), | |
| [Codex quickstart](https://developers.openai.com/codex/quickstart), and | |
| [Codex prompting](https://developers.openai.com/codex/prompting). | |
| ## Docs | |
| The README is intentionally product-focused. Command-heavy setup, runtime, training, environment, and | |
| verification details live in the technical docs: | |
| - [Technical setup and runtime](docs/02-technical-setup.md) | |
| - [Hackathon overview report](docs/10-overview-build-small-hackathon-report.md) | |
| - [Router training report](docs/20-router-training-report.md) | |
| - [Router Hugging Face release guide](docs/21-router-huggingface-release.md) | |
| - [Coach summary Modal training](docs/30-coach-modal-training.md) | |
| - [Coach training report](docs/31-coach-training-report.md) | |
| - [Data collection guide](docs/40-data-custom-collection-guide.md) | |
| - [Docs index](docs/01-docs-index.md) | |
| ## Contributors | |
| - 🚀 [@nvti](https://github.com/nvti) | |
| - 🌿 [@honghanhh](https://github.com/honghanhh) | |
| - 🔧 [@NLag](https://github.com/NLag) | |
| - ✨ [pnhneee](https://github.com/ctpnheee) | |