small-functional-movement-screening / RECON.md

BladeSzaSza

fix: define REPO_NAME in hf_upload.sh (ensure_blade_space referenced it)

4948993 verified 20 days ago

	# RECON.md

	Phase 0 reconnaissance findings — model verification, Gradio APIs, access status.
	Updated: June 4, 2026.

	## Gradio
	- Version: TBD (will verify on first `pip install gradio`)
	- gr.Blocks: expected ✓ (used in app.py skeleton)
	- gr.Video: expected ✓
	- gr.Walkthrough / gr.Step: TBD (verify in Phase 2)
	- gr.Navbar: TBD (verify in Phase 2)
	- UI approach: gr.Blocks + custom CSS/theme (escalate to Server only if needed)
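The TBD items above could be checked with a small probe once Gradio is installed. This is a minimal sketch, not the project's own tooling: `probe_components` is a hypothetical helper, and the attribute names are taken from the list above with no claim that the installed version provides them.

```python
import importlib
from importlib import metadata


def probe_components(module_name, names):
    """Return (version, {name: bool}); (None, {}) if the module is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None, {}
    try:
        # Distribution metadata is the most reliable source for the version.
        version = metadata.version(module_name)
    except metadata.PackageNotFoundError:
        version = getattr(module, "__version__", None)
    # hasattr only shows the name exists, not that it behaves as expected.
    return version, {name: hasattr(module, name) for name in names}


if __name__ == "__main__":
    version, found = probe_components(
        "gradio", ["Blocks", "Video", "Walkthrough", "Step", "Navbar"]
    )
    print("gradio version:", version)
    for name, ok in found.items():
        print(f"gr.{name}: {'present' if ok else 'missing'}")
```

A missing attribute here only means the name is absent in the installed release; the Phase 2 verification would still need to exercise the component in a real layout.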

	## Python
	- Python 3.13.9 (local dev)
	- pytest 9.0.2, numpy, opencv-python installed

	## Model Verification

	| Model | Params | License | GGUF | ZeroGPU | Status |
	|---|---|---|---|---|---|
	| YOLO26l-Pose (primary) | 0.026B | AGPL-3.0 | n/a | ✓ (6.5ms T4) | ready |
	| YOLO26x-Pose (HQ alt) | 0.058B | AGPL-3.0 | n/a | ✓ (12.2ms T4) | ready |
	| SAM 3.1 base (sam2.1_hiera_base_plus) | ~0.85B | SAM License | n/a | ✓ | access accepted |
	| SAM 3D Body (facebook/sam-3d-body-dinov3) | 0.84B (DINOv3-H+) | SAM License | n/a | ✓ | INTEGRATED |
	| Sapiens2 Pose (noahcao/sapiens-pose-coco) | ~0.6B | CC-BY-NC-4.0 | n/a | ✓ | access accepted |
	| ST-GCN (pyskl) | ~0.03B | Apache-2.0 | n/a | ✓ | ready |
	| Qwen3-VL-8B-Instruct | 8B | Apache-2.0 | ✓ | llama.cpp | ready |
	| Qwen3-VL-Embedding-8B | 8B | Apache-2.0 | ✓ | llama.cpp | ready |

	## Param Sum
	~18.40B (sum of the Params column above), well under the 32B limit.
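The budget check can be reproduced by summing the Params column as listed. A minimal sketch: the dictionary keys are shortened labels, the approximate ("~") values are treated as exact, and both the summed and per-model readings of the limit are printed since the rule is still an open question.

```python
# Parameter counts in billions, copied from the Model Verification table.
PARAMS_B = {
    "YOLO26l-Pose": 0.026,
    "YOLO26x-Pose": 0.058,
    "SAM 3.1 base": 0.85,
    "SAM 3D Body": 0.84,
    "Sapiens2 Pose": 0.6,
    "ST-GCN": 0.03,
    "Qwen3-VL-8B-Instruct": 8.0,
    "Qwen3-VL-Embedding-8B": 8.0,
}
LIMIT_B = 32.0  # hackathon limit as stated in this file

total_b = sum(PARAMS_B.values())      # "summed" interpretation
largest_b = max(PARAMS_B.values())    # "per-model" interpretation

print(f"summed: {total_b:.2f}B (limit {LIMIT_B:.0f}B)")
print(f"largest single model: {largest_b:.2f}B")
```

With the values as listed, the sum comes to about 18.40B, so either interpretation of the limit is satisfied.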

	## Gated Access Status (as of Jun 4, 2026)
	- [x] SAM 3.1 (facebookresearch/sam3) — accepted
	- [x] SAM 3D Body (facebook/sam-3d-body-dinov3) — ACCEPTED (confirmed Jun 4)
	- [x] Sapiens2 Pose (noahcao/sapiens-pose-coco) — accepted

	## Open Questions
	- [ ] Confirm in the Discord AMA whether "≤32B" applies to the summed total or to each model
	- [ ] AGPL-3.0 YOLO OK for hackathon submission? (Likely yes for non-commercial demo)

	## llama.cpp Build Plan
	- CPU-only build first (avoids libcudart.so issues on Spaces)
	- Fallback: transformers + spaces.GPU for VLM inference
	- GGUF quantized Qwen3-VL-8B at Q4_K_M (~4.5GB)
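The preference order in this plan (llama.cpp first, transformers as fallback) could be expressed as a small availability check. This is a sketch under assumptions: `pick_vlm_backend` is a hypothetical helper, and `llama_cpp` is the import name of the llama-cpp-python bindings, which this plan does not name explicitly.

```python
from importlib.util import find_spec

# Preference order from the build plan. The module names are assumptions:
# "llama_cpp" is the import name of the llama-cpp-python bindings.
BACKENDS = [("llama.cpp", "llama_cpp"), ("transformers", "transformers")]


def pick_vlm_backend(is_available=lambda mod: find_spec(mod) is not None):
    """Return the label of the first importable backend, or None if neither is."""
    for label, module_name in BACKENDS:
        if is_available(module_name):
            return label
    return None


if __name__ == "__main__":
    print("VLM backend:", pick_vlm_backend())
```

The check only confirms that a package is importable; a CPU-only llama.cpp build that loads but fails at runtime would still need its own error handling.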

	## Key Decisions
	- Primary pose: YOLO26l-Pose (fastest, well-tested)
	- Fallback pose: Sapiens2 (more keypoints, slower)
	- 3D body: INTEGRATED — uses `setup_sam_3d_body()` from `notebook.utils`, outputs MHR joints
	- API: `estimator.process_one_image(rgb_image)` — single RGB np.ndarray
	- Model variants: DINOv3-H+ (840M) default, ViT-H (631M) smaller
	- Temporal smoothing via EMA (alpha=0.3) to reduce single-frame jitter
	- `config.enable_3d=False` by default; flip to `True` once the checkpoint is verified on the Space
	- VLM: Qwen3-VL-8B via llama.cpp (Judge + Classifier)
	- Embeddings: Qwen3-VL-Embedding-8B via llama.cpp (Retrieval)
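The EMA smoothing decision above can be illustrated with a short NumPy sketch. Assumptions: `ema_smooth` is a hypothetical helper, joints arrive as an array of shape (frames, joints, dims), and `alpha` weights the newest frame (some conventions weight the previous estimate instead, so the project's actual convention may differ).

```python
import numpy as np


def ema_smooth(frames, alpha=0.3):
    """Exponentially smooth per-frame joint positions.

    frames: array-like of shape (T, J, D). Assumes alpha weights the new
    frame: s[t] = alpha * x[t] + (1 - alpha) * s[t - 1], with s[0] = x[0].
    """
    frames = np.asarray(frames, dtype=float)
    if len(frames) == 0:
        return frames
    smoothed = np.empty_like(frames)
    smoothed[0] = frames[0]  # seed with the first observation
    for t in range(1, len(frames)):
        smoothed[t] = alpha * frames[t] + (1.0 - alpha) * smoothed[t - 1]
    return smoothed


if __name__ == "__main__":
    # One joint, one coordinate, jumping from 0 to 1 and staying there.
    track = np.array([[[0.0]], [[1.0]], [[1.0]]])
    print(ema_smooth(track)[:, 0, 0])
```

Under this convention a lower alpha gives heavier smoothing and more lag; alpha=0.3 lets a step change reach about half its size after two frames.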