---
license: openrail++
language:
- en
base_model:
- camenduru/openpose
pipeline_tag: keypoint-detection
tags:
- skeleton
---
# Lightweight Human Pose Estimation

Real-time multi-person 2D pose estimation using a MobileNet backbone with Part Affinity Fields (PAFs). Includes a TSA security screening demo that detects persons in a defined zone and checks whether their hands are raised.

## Overview

The model detects **18 keypoints** per person:

| Index | Keypoint | Index | Keypoint |
|-------|----------|-------|----------|
| 0 | nose | 9 | r_knee |
| 1 | neck | 10 | r_ank |
| 2 | r_sho | 11 | l_hip |
| 3 | r_elb | 12 | l_knee |
| 4 | r_wri | 13 | l_ank |
| 5 | l_sho | 14 | r_eye |
| 6 | l_elb | 15 | l_eye |
| 7 | l_wri | 16 | r_ear |
| 8 | r_hip | 17 | l_ear |

Keypoints are grouped into person instances using PAF vectors (19 connection pairs), allowing robust multi-person detection.
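
For reference, the 18 indices above can be captured as a simple lookup. This list is illustrative (it mirrors the table rather than being imported from `modules/pose.py`):

```python
# Index-to-name mapping for the 18 keypoints, matching the table above.
KPT_NAMES = [
    'nose', 'neck',
    'r_sho', 'r_elb', 'r_wri',
    'l_sho', 'l_elb', 'l_wri',
    'r_hip', 'r_knee', 'r_ank',
    'l_hip', 'l_knee', 'l_ank',
    'r_eye', 'l_eye', 'r_ear', 'l_ear',
]

def keypoint_index(name):
    """Return the heatmap channel index for a named keypoint."""
    return KPT_NAMES.index(name)
```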

## Project Structure

```
pose/
├── demo.py                         # Inference demo with TSA screening logic
├── train.py                        # Training script
├── val.py                          # COCO validation script
├── requirements.txt
├── models/
│   ├── with_mobilenet.py           # PoseEstimationWithMobileNet architecture
│   └── checkpoint_iter_370000.pth  # Pretrained checkpoint
├── modules/
│   ├── keypoints.py                # Keypoint extraction and grouping (PAF)
│   ├── pose.py                     # Pose class with 18 keypoints, tracking
│   ├── conv.py                     # Conv building blocks
│   ├── loss.py                     # L2 loss
│   ├── load_state.py               # Checkpoint loading utilities
│   ├── get_parameters.py           # Parameter group helpers
│   └── one_euro_filter.py          # Temporal smoothing filter
├── datasets/
│   ├── coco.py                     # COCO train/val dataset loaders
│   └── transformations.py          # Data augmentation pipeline
├── scripts/
│   ├── prepare_train_labels.py     # Convert COCO JSON to internal format
│   ├── make_val_subset.py          # Create validation subset
│   └── convert_to_onnx.py          # Export model to ONNX
└── TRAIN-ON-CUSTOM-DATASET.md      # Guide for custom dataset training
```

## Installation

```bash
pip install -r requirements.txt
```

Requirements: `torch>=0.4.1`, `torchvision>=0.2.1`, `opencv-python>=3.4.0.14`, `numpy>=1.14.0`, `pycocotools==2.0.0`, and `shapely` (for the demo's zone detection).

## Running the Demo

The demo runs inference on images or video and overlays detected poses. It includes the TSA screening logic: it checks whether a person's hips are inside a configurable polygon zone and whether their wrists are raised above their shoulders.
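
The two checks can be sketched in plain Python. This is a simplified illustration, not the code from `demo.py` (which uses `shapely` for the zone test); note that in image coordinates y grows downward, so "above" means a smaller y value:

```python
def point_in_polygon(pt, polygon):
    """Ray-casting point-in-polygon test (demo.py uses shapely for this)."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where the polygon edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def hands_raised(kpts):
    """True if both wrists are above their shoulders.

    kpts maps keypoint index -> (x, y) in image coordinates, where y grows
    downward. Indices: 2/5 = right/left shoulder, 4/7 = right/left wrist.
    """
    for sho, wri in ((2, 4), (5, 7)):
        if sho not in kpts or wri not in kpts:
            return False          # missing keypoint: cannot confirm
        if kpts[wri][1] >= kpts[sho][1]:
            return False          # wrist at or below shoulder height
    return True
```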

**On a video file:**
```bash
python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --video path/to/video.mp4
```

**On images:**
```bash
python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --images path/to/image.jpg
```

**CPU-only:**
```bash
python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --video path/to/video.mp4 --cpu
```

| Argument | Default | Description |
|----------|---------|-------------|
| `--checkpoint-path` | required | Path to `.pth` checkpoint |
| `--height-size` | 256 | Network input height |
| `--video` | – | Path to a video file or a webcam id |
| `--images` | – | One or more image paths |
| `--cpu` | false | Run on CPU |
| `--track` | 1 | Enable pose ID tracking across frames |
| `--smooth` | 1 | Apply One Euro filter smoothing to keypoints |

Output is saved to `test.mp4`. The screening zone polygon is defined in `demo.py` and can be adjusted for your camera setup.
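
The `--smooth` option applies a One Euro filter per keypoint coordinate. A minimal sketch of the idea (the repo's version lives in `modules/one_euro_filter.py`; the parameter defaults here are illustrative):

```python
import math

class OneEuroFilter:
    """Minimal One Euro filter: a low-pass filter whose cutoff adapts to the
    signal's speed -- strong smoothing when keypoints are still, low lag when
    they move fast. Defaults here are illustrative, not the repo's values."""

    def __init__(self, freq=15.0, mincutoff=1.0, beta=0.05, dcutoff=1.0):
        self.freq = freq            # expected sampling rate (frames/s)
        self.mincutoff = mincutoff  # baseline cutoff frequency (Hz)
        self.beta = beta            # speed coefficient
        self.dcutoff = dcutoff      # cutoff for the derivative estimate
        self.x_prev = None
        self.dx_prev = 0.0

    def _alpha(self, cutoff):
        # Exponential-smoothing factor for a given cutoff frequency
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau * self.freq)

    def __call__(self, x):
        if self.x_prev is None:
            self.x_prev = x
            return x
        # Estimate and smooth the signal's speed
        dx = (x - self.x_prev) * self.freq
        a_d = self._alpha(self.dcutoff)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        # Speed-adaptive cutoff: faster motion -> higher cutoff -> less lag
        cutoff = self.mincutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff)
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.x_prev, self.dx_prev = x_hat, dx_hat
        return x_hat
```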

## Validation

Evaluate against COCO keypoint annotations:

```bash
python val.py \
  --labels path/to/val_labels.json \
  --images-folder path/to/val2017 \
  --checkpoint-path models/checkpoint_iter_370000.pth \
  --output-name detections.json
```

## Training

**1. Prepare labels** (converts the COCO annotation JSON to the internal format):

```bash
python scripts/prepare_train_labels.py --labels path/to/annotations.json
```

**2. Create a validation subset** (optional):

```bash
python scripts/make_val_subset.py --labels path/to/val_labels.json
```

**3. Start training:**

```bash
python train.py \
  --prepared-train-labels prepared_train_labels.pkl \
  --train-images-folder path/to/train2017 \
  --val-labels path/to/val_labels.json \
  --val-images-folder path/to/val2017 \
  --checkpoint-path models/checkpoint_iter_370000.pth \
  --experiment-name my_experiment
```

Key training arguments:

| Argument | Default | Description |
|----------|---------|-------------|
| `--num-refinement-stages` | 1 | Number of PAF refinement stages |
| `--base-lr` | 4e-5 | Initial learning rate |
| `--batch-size` | 80 | Batch size |
| `--from-mobilenet` | false | Initialize from MobileNet weights only |
| `--weights-only` | false | Load weights but reset optimizer/scheduler |
| `--checkpoint-after` | 5000 | Save a checkpoint every N iterations |
| `--val-after` | 5000 | Run validation every N iterations |

Checkpoints are saved to `<experiment-name>_checkpoints/`.

## Export to ONNX

```bash
python scripts/convert_to_onnx.py \
  --checkpoint-path models/checkpoint_iter_370000.pth \
  --output-name human-pose-estimation.onnx
```

The exported model expects input shape `[1, 3, 256, 456]` and produces four outputs: heatmaps and PAFs for each refinement stage.
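
The output shapes follow from the input size. Assuming a stride-8 backbone, one background heatmap channel on top of the 18 keypoints, and 19 two-channel PAF connections (the values this repo uses), they can be derived without loading the model:

```python
def output_shapes(height=256, width=456, stride=8, num_keypoints=18,
                  num_connections=19):
    """Expected heatmap/PAF shapes for one stage of the exported model.

    Assumes a stride-8 backbone, one extra background heatmap channel, and
    two PAF channels (x and y components) per limb connection.
    """
    h, w = height // stride, width // stride
    heatmaps = (1, num_keypoints + 1, h, w)      # 18 keypoints + background
    pafs = (1, 2 * num_connections, h, w)        # x/y vector field per limb
    return heatmaps, pafs
```

For the default `[1, 3, 256, 456]` input this gives 32×57 feature maps with 19 heatmap and 38 PAF channels per stage.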

## Custom Dataset Training

See [TRAIN-ON-CUSTOM-DATASET.md](TRAIN-ON-CUSTOM-DATASET.md) for a full walkthrough covering:

- Dataset annotation format (COCO JSON)
- How to define keypoint pairs (`BODY_PARTS_KPT_IDS`) and PAF channel indices (`BODY_PARTS_PAF_IDS`)
- Code modifications required for a different number of keypoints or skeleton topology
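
To illustrate how the two tables relate (the real values live in `modules/keypoints.py`): each entry of `BODY_PARTS_KPT_IDS` names a connected keypoint pair, and the matching entry of `BODY_PARTS_PAF_IDS` gives the two PAF channels (x then y component) encoding that limb. A reduced, hypothetical right-arm-only skeleton might look like:

```python
# Hypothetical 3-connection skeleton for illustration only;
# the repo's real tables define 19 connections over 18 keypoints.
BODY_PARTS_KPT_IDS = [
    (1, 2),  # neck  -> r_sho
    (2, 3),  # r_sho -> r_elb
    (3, 4),  # r_elb -> r_wri
]
# Two consecutive PAF channels (x then y component) per connection.
BODY_PARTS_PAF_IDS = [(2 * i, 2 * i + 1)
                      for i in range(len(BODY_PARTS_KPT_IDS))]
```

The key invariant is that the two lists stay index-aligned and that the PAF channel count matches the network's output width (2 channels per connection).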