---
license: openrail++
language:
- en
base_model:
- camenduru/openpose
pipeline_tag: keypoint-detection
tags:
- skeleton
---
# Lightweight Human Pose Estimation
Real-time multi-person 2D pose estimation using a MobileNet backbone with Part Affinity Fields (PAF). Includes a TSA security screening demo that detects persons in a defined zone and checks whether their hands are raised.
## Overview
The model detects **18 keypoints** per person:
| Index | Keypoint | Index | Keypoint |
|-------|----------|-------|----------|
| 0 | nose | 9 | r_knee |
| 1 | neck | 10 | r_ank |
| 2 | r_sho | 11 | l_hip |
| 3 | r_elb | 12 | l_knee |
| 4 | r_wri | 13 | l_ank |
| 5 | l_sho | 14 | r_eye |
| 6 | l_elb | 15 | l_eye |
| 7 | l_wri | 16 | r_ear |
| 8 | r_hip | 17 | l_ear |
Keypoints are grouped into person instances using PAF vectors (19 connection pairs), allowing robust multi-person detection.
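The index-to-name mapping above can be written as a plain lookup list. This is an illustrative sketch following the table (the repo keeps an equivalent list in `modules/pose.py`; the exact variable name there may differ):

```python
# Keypoint channel index -> name, in the order given by the table above.
KPT_NAMES = [
    'nose', 'neck',
    'r_sho', 'r_elb', 'r_wri',
    'l_sho', 'l_elb', 'l_wri',
    'r_hip', 'r_knee', 'r_ank',
    'l_hip', 'l_knee', 'l_ank',
    'r_eye', 'l_eye', 'r_ear', 'l_ear',
]

def kpt_index(name: str) -> int:
    """Return the heatmap channel index for a named keypoint."""
    return KPT_NAMES.index(name)
```

Code that inspects specific joints (e.g. the wrist checks in the demo) can use `kpt_index('r_wri')` instead of a bare `4`.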
## Project Structure
```
pose/
├── demo.py                         # Inference demo with TSA screening logic
├── train.py                        # Training script
├── val.py                          # COCO validation script
├── requirements.txt
├── models/
│   ├── with_mobilenet.py           # PoseEstimationWithMobileNet architecture
│   └── checkpoint_iter_370000.pth  # Pretrained checkpoint
├── modules/
│   ├── keypoints.py                # Keypoint extraction and grouping (PAF)
│   ├── pose.py                     # Pose class with 18 keypoints, tracking
│   ├── conv.py                     # Conv building blocks
│   ├── loss.py                     # L2 loss
│   ├── load_state.py               # Checkpoint loading utilities
│   ├── get_parameters.py           # Parameter group helpers
│   └── one_euro_filter.py          # Temporal smoothing filter
├── datasets/
│   ├── coco.py                     # COCO train/val dataset loaders
│   └── transformations.py          # Data augmentation pipeline
├── scripts/
│   ├── prepare_train_labels.py     # Convert COCO JSON to internal format
│   ├── make_val_subset.py          # Create validation subset
│   └── convert_to_onnx.py          # Export model to ONNX
└── TRAIN-ON-CUSTOM-DATASET.md # Guide for custom dataset training
```
## Installation
```bash
pip install -r requirements.txt
```
Requirements: `torch>=0.4.1`, `torchvision>=0.2.1`, `opencv-python>=3.4.0.14`, `numpy>=1.14.0`, `pycocotools==2.0.0`, `shapely` (for demo zone detection).
## Running the Demo
The demo runs inference on images or video and overlays detected poses. It includes TSA screening logic: it checks whether a person's hips are inside a configurable polygon zone and whether their wrists are raised above their shoulders.
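The two screening checks can be sketched in a few lines. This is an illustrative, dependency-free version (the demo itself uses `shapely` for the zone test); keypoints are assumed to be `(x, y)` pixel coordinates with `-1` marking undetected joints, and y grows downward in image coordinates, so "above" means a smaller y:

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting point-in-polygon test (the demo uses shapely for this)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def hands_raised(kpts):
    """True if both wrists are above their shoulders (smaller y = higher)."""
    r_sho, r_wri = kpts[2], kpts[4]
    l_sho, l_wri = kpts[5], kpts[7]
    return r_wri[1] < r_sho[1] and l_wri[1] < l_sho[1]

def in_zone(kpts, zone):
    """True if a detected hip keypoint (indices 8, 11) lies inside the zone polygon."""
    return any(point_in_polygon(kpts[i][0], kpts[i][1], zone)
               for i in (8, 11) if kpts[i][0] >= 0)
```

The indices 2/4, 5/7, and 8/11 follow the keypoint table in the Overview section.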
**On a video file:**
```bash
python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --video path/to/video.mp4
```
**On images:**
```bash
python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --images path/to/image.jpg
```
**CPU-only:**
```bash
python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --video path/to/video.mp4 --cpu
```
| Argument | Default | Description |
|----------|---------|-------------|
| `--checkpoint-path` | required | Path to `.pth` checkpoint |
| `--height-size` | 256 | Network input height |
| `--video` | β€” | Path to video file or webcam id |
| `--images` | β€” | One or more image paths |
| `--cpu` | false | Run on CPU |
| `--track` | 1 | Enable pose ID tracking across frames |
| `--smooth` | 1 | Apply One Euro filter smoothing to keypoints |
Output is saved to `test.mp4`. The screening zone polygon is defined in `demo.py` and can be adjusted for your camera setup.
## Validation
Evaluate against COCO keypoints annotations:
```bash
python val.py \
--labels path/to/val_labels.json \
--images-folder path/to/val2017 \
--checkpoint-path models/checkpoint_iter_370000.pth \
--output-name detections.json
```
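Since evaluation runs through `pycocotools`, the `detections.json` file must follow the standard COCO keypoint results format: one entry per detected person, with 17 COCO keypoints flattened into `(x, y, score)` triplets. A minimal sketch of one entry, with illustrative values:

```python
import json

# One COCO keypoint result: 17 keypoints, each an (x, y, score) triplet,
# flattened into 51 numbers. category_id 1 is "person" in COCO.
detection = {
    "image_id": 397133,             # illustrative COCO image id
    "category_id": 1,
    "keypoints": [0.0] * (17 * 3),  # x1, y1, s1, x2, y2, s2, ...
    "score": 0.9,
}

# The output file is a JSON array of such entries.
serialized = json.dumps([detection])
```

Note that COCO uses 17 keypoints (it has no neck), so the model's 18-point skeleton is mapped down to the COCO set during evaluation.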
## Training
**1. Prepare labels** (converts COCO annotation JSON to the internal format):
```bash
python scripts/prepare_train_labels.py --labels path/to/annotations.json
```
**2. Create a validation subset** (optional):
```bash
python scripts/make_val_subset.py --labels path/to/val_labels.json
```
**3. Start training:**
```bash
python train.py \
--prepared-train-labels prepared_train_labels.pkl \
--train-images-folder path/to/train2017 \
--val-labels path/to/val_labels.json \
--val-images-folder path/to/val2017 \
--checkpoint-path models/checkpoint_iter_370000.pth \
--experiment-name my_experiment
```
Key training arguments:
| Argument | Default | Description |
|----------|---------|-------------|
| `--num-refinement-stages` | 1 | Number of PAF refinement stages |
| `--base-lr` | 4e-5 | Initial learning rate |
| `--batch-size` | 80 | Batch size |
| `--from-mobilenet` | false | Initialize from MobileNet weights only |
| `--weights-only` | false | Load weights but reset optimizer/scheduler |
| `--checkpoint-after` | 5000 | Save checkpoint every N iterations |
| `--val-after` | 5000 | Run validation every N iterations |
Checkpoints are saved to `<experiment-name>_checkpoints/`.
## Export to ONNX
```bash
python scripts/convert_to_onnx.py \
--checkpoint-path models/checkpoint_iter_370000.pth \
--output-name human-pose-estimation.onnx
```
The exported model expects input shape `[1, 3, 256, 456]` and produces four outputs: heatmaps and PAFs for each refinement stage.
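Preparing an input tensor for the exported model amounts to normalizing and transposing a resized BGR frame. A minimal NumPy sketch, assuming the frame is already resized to 256×456; the normalization constants (`mean=128`, `scale=1/256`) are common for this model family but should be checked against the preprocessing in `demo.py`:

```python
import numpy as np

def to_onnx_input(frame):
    """HxWx3 uint8 image (already 256x456) -> [1, 3, 256, 456] float32 tensor."""
    x = (frame.astype(np.float32) - 128.0) / 256.0  # normalize to roughly [-0.5, 0.5]
    x = x.transpose(2, 0, 1)                        # HWC -> CHW
    return x[np.newaxis, ...]                       # add batch dim -> NCHW

frame = np.zeros((256, 456, 3), dtype=np.uint8)
inp = to_onnx_input(frame)
```

The resulting array can be fed directly to an ONNX Runtime session as the model's single input.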
## Custom Dataset Training
See [TRAIN-ON-CUSTOM-DATASET.md](TRAIN-ON-CUSTOM-DATASET.md) for a full walkthrough covering:
- Dataset annotation format (COCO JSON)
- How to define keypoint pairs (`BODY_PARTS_KPT_IDS`) and PAF channel indices (`BODY_PARTS_PAF_IDS`)
- Code modifications required for a different number of keypoints or skeleton topology