
Lightweight Human Pose Estimation

Real-time multi-person 2D pose estimation using a MobileNet backbone with Part Affinity Fields (PAF). Includes a TSA security screening demo that detects persons in a defined zone and checks whether their hands are raised.

Overview

The model detects 18 keypoints per person:

Index  Keypoint   Index  Keypoint
0      nose       9      r_knee
1      neck       10     r_ank
2      r_sho      11     l_hip
3      r_elb      12     l_knee
4      r_wri      13     l_ank
5      l_sho      14     r_eye
6      l_elb      15     l_eye
7      l_wri      16     r_ear
8      r_hip      17     l_ear
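
In code, these indices correspond to an index-to-name list. The sketch below mirrors the keypoint naming used by the `Pose` class in modules/pose.py (verify the exact list against your checkout):

```python
# Keypoint index constants matching the table above (index -> name).
KPT_NAMES = ['nose', 'neck', 'r_sho', 'r_elb', 'r_wri', 'l_sho', 'l_elb',
             'l_wri', 'r_hip', 'r_knee', 'r_ank', 'l_hip', 'l_knee', 'l_ank',
             'r_eye', 'l_eye', 'r_ear', 'l_ear']
```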

Keypoints are grouped into person instances using PAF vectors (19 connection pairs), allowing robust multi-person detection.
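
As a rough illustration of how a PAF scores one candidate connection (a simplified sketch, not the exact logic in modules/keypoints.py): the field is sampled at points along the segment between two candidate keypoints and dotted with the segment's unit vector, so well-aligned limbs score near 1.

```python
import numpy as np

def paf_score(paf_x, paf_y, kpt_a, kpt_b, num_samples=10):
    """Average alignment between the PAF and the unit vector a -> b.

    paf_x, paf_y: 2D arrays holding the x/y components of one PAF channel.
    kpt_a, kpt_b: (x, y) candidate keypoint coordinates.
    """
    a = np.asarray(kpt_a, dtype=float)
    b = np.asarray(kpt_b, dtype=float)
    vec = b - a
    norm = np.linalg.norm(vec)
    if norm < 1e-6:
        return 0.0
    unit = vec / norm
    scores = []
    for t in np.linspace(0.0, 1.0, num_samples):
        # Sample the field at an integer pixel along the segment.
        x, y = (a + t * vec).round().astype(int)
        scores.append(paf_x[y, x] * unit[0] + paf_y[y, x] * unit[1])
    return float(np.mean(scores))
```

A greedy assignment over these scores then links keypoint candidates into person instances.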

Project Structure

pose/
├── demo.py                        # Inference demo with TSA screening logic
├── train.py                       # Training script
├── val.py                         # COCO validation script
├── requirements.txt
├── models/
│   ├── with_mobilenet.py          # PoseEstimationWithMobileNet architecture
│   └── checkpoint_iter_370000.pth # Pretrained checkpoint
├── modules/
│   ├── keypoints.py               # Keypoint extraction and grouping (PAF)
│   ├── pose.py                    # Pose class with 18 keypoints, tracking
│   ├── conv.py                    # Conv building blocks
│   ├── loss.py                    # L2 loss
│   ├── load_state.py              # Checkpoint loading utilities
│   ├── get_parameters.py          # Parameter group helpers
│   └── one_euro_filter.py         # Temporal smoothing filter
├── datasets/
│   ├── coco.py                    # COCO train/val dataset loaders
│   └── transformations.py         # Data augmentation pipeline
├── scripts/
│   ├── prepare_train_labels.py    # Convert COCO JSON to internal format
│   ├── make_val_subset.py         # Create validation subset
│   └── convert_to_onnx.py         # Export model to ONNX
└── TRAIN-ON-CUSTOM-DATASET.md     # Guide for custom dataset training

Installation

pip install -r requirements.txt

Requirements: torch>=0.4.1, torchvision>=0.2.1, opencv-python>=3.4.0.14, numpy>=1.14.0, pycocotools==2.0.0, shapely (for demo zone detection).

Running the Demo

The demo runs inference on images or video and overlays detected poses. It includes TSA screening logic: it checks whether a person's hips are inside a configurable polygon zone and whether their wrists are raised above their shoulders.
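
The two checks can be sketched as below. This is illustrative, not the demo's exact code: the polygon coordinates and the both-hands requirement are assumptions, keypoint indices follow the table in the Overview, and undetected keypoints are assumed to be (-1, -1).

```python
from shapely.geometry import Point, Polygon

# Hypothetical screening zone in pixel coordinates; adjust per camera setup.
SCREENING_ZONE = Polygon([(100, 200), (500, 200), (500, 470), (100, 470)])

def in_zone(keypoints, zone=SCREENING_ZONE):
    """Person is in the zone if either detected hip lies inside the polygon."""
    for idx in (8, 11):  # r_hip, l_hip
        x, y = keypoints[idx]
        if x >= 0 and zone.contains(Point(x, y)):
            return True
    return False

def hands_raised(keypoints):
    """Both wrists above their shoulders (smaller y = higher in the image)."""
    r_ok = keypoints[4][1] >= 0 and keypoints[4][1] < keypoints[2][1]
    l_ok = keypoints[7][1] >= 0 and keypoints[7][1] < keypoints[5][1]
    return r_ok and l_ok
```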

On a video file:

python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --video path/to/video.mp4

On images:

python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --images path/to/image.jpg

CPU-only:

python demo.py --checkpoint-path models/checkpoint_iter_370000.pth --video path/to/video.mp4 --cpu

Argument           Default   Description
--checkpoint-path  required  Path to .pth checkpoint
--height-size      256       Network input height
--video            none      Path to video file or webcam id
--images           none      One or more image paths
--cpu              false     Run on CPU
--track            1         Enable pose ID tracking across frames
--smooth           1         Apply One Euro filter smoothing to keypoints

Output is saved to test.mp4. The screening zone polygon is defined in demo.py and can be adjusted for your camera setup.

Validation

Evaluate against COCO keypoints annotations:

python val.py \
  --labels path/to/val_labels.json \
  --images-folder path/to/val2017 \
  --checkpoint-path models/checkpoint_iter_370000.pth \
  --output-name detections.json
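
The AP/AR figures this evaluation produces are based on Object Keypoint Similarity (OKS), which can be sketched as follows. The per-keypoint kappa constants here are illustrative; the official values are defined in pycocotools.

```python
import numpy as np

def oks(gt, dt, area, kappas):
    """Object Keypoint Similarity between ground-truth and detected keypoints.

    gt, dt: (N, 2) arrays of (x, y) keypoints; area: object segment area;
    kappas: (N,) per-keypoint tolerance constants.
    """
    d2 = np.sum((np.asarray(gt, dtype=float) - np.asarray(dt, dtype=float)) ** 2,
                axis=1)
    # Gaussian falloff of each keypoint's distance, scaled by object size.
    return float(np.mean(np.exp(-d2 / (2.0 * area * kappas ** 2))))
```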

Training

1. Prepare labels (converts COCO annotation JSON to the internal format):

python scripts/prepare_train_labels.py --labels path/to/annotations.json

2. Create a validation subset (optional):

python scripts/make_val_subset.py --labels path/to/val_labels.json

3. Start training:

python train.py \
  --prepared-train-labels prepared_train_labels.pkl \
  --train-images-folder path/to/train2017 \
  --val-labels path/to/val_labels.json \
  --val-images-folder path/to/val2017 \
  --checkpoint-path models/checkpoint_iter_370000.pth \
  --experiment-name my_experiment

Key training arguments:

Argument                 Default  Description
--num-refinement-stages  1        Number of PAF refinement stages
--base-lr                4e-5     Initial learning rate
--batch-size             80       Batch size
--from-mobilenet         false    Initialize from MobileNet weights only
--weights-only           false    Load weights but reset optimizer/scheduler
--checkpoint-after       5000     Save checkpoint every N iterations
--val-after              5000     Run validation every N iterations

Checkpoints are saved to <experiment-name>_checkpoints/.

Export to ONNX

python scripts/convert_to_onnx.py \
  --checkpoint-path models/checkpoint_iter_370000.pth \
  --output-name human-pose-estimation.onnx

The exported model expects input shape [1, 3, 256, 456] and produces four outputs: heatmaps and PAFs for each refinement stage.
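
Shaping a frame into that input can be sketched as below. The mean/scale constants (128, 1/256) mirror this model family's usual preprocessing but are assumptions here; check demo.py for the exact values.

```python
import numpy as np

def preprocess(frame_hwc):
    """Convert one HWC BGR frame into the exported model's [1, 3, H, W] input."""
    img = (frame_hwc.astype(np.float32) - 128.0) / 256.0
    img = np.transpose(img, (2, 0, 1))  # HWC -> CHW
    return img[np.newaxis]              # add batch axis -> [1, C, H, W]
```

The resulting array can be fed to an ONNX runtime session; each of the four outputs is a stage's heatmap or PAF tensor.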

Custom Dataset Training

See TRAIN-ON-CUSTOM-DATASET.md for a full walkthrough covering:

  • Dataset annotation format (COCO JSON)
  • How to define keypoint pairs (BODY_PARTS_KPT_IDS) and PAF channel indices (BODY_PARTS_PAF_IDS)
  • Code modifications required for a different number of keypoints or skeleton topology