# Gesture Recognition

## Introduction
In this project, we present a skeleton-based pipeline for gesture recognition. The pipeline has three stages. The first stage is a hand detection module that outputs bounding boxes of human hands in video frames. The second stage employs a pose estimation module to generate keypoints of the detected hands. Finally, the third stage uses a skeleton-based gesture recognition module to classify hand actions from the predicted hand skeletons. The three-stage pipeline is lightweight and achieves real-time performance on CPU devices. This README provides the models and the inference demo for the project; training data preparation and training scripts are described in TRAINING.md.
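The data flow between the three stages can be sketched as follows. This is a minimal illustration, not the project's actual code: the stage callables, type aliases, and the 21-keypoint hand skeleton are stand-ins for the real detection, pose estimation, and recognition models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Placeholder types for illustration only (the real project presumably
# wraps trained detection/pose/recognition models).
BBox = Tuple[float, float, float, float]      # (x1, y1, x2, y2)
Keypoint = Tuple[float, float, float]         # (x, y, score)


@dataclass
class GesturePipeline:
    detect_hands: Callable[[object], List[BBox]]                # stage 1
    estimate_pose: Callable[[object, BBox], List[Keypoint]]     # stage 2
    classify_gesture: Callable[[List[List[Keypoint]]], str]     # stage 3

    def __call__(self, frames: Sequence[object]) -> List[str]:
        labels = []
        for frame in frames:
            # Stage 1: locate hands in the frame.
            boxes = self.detect_hands(frame)
            # Stage 2: extract a skeleton for each detected hand.
            skeletons = [self.estimate_pose(frame, box) for box in boxes]
            # Stage 3: classify the action from the skeletons alone;
            # this stage never sees raw pixels, which keeps it lightweight.
            labels.append(self.classify_gesture(skeletons))
        return labels
```

Because the recognition stage consumes only keypoints, the heavy image processing is confined to the first two stages, which is what makes the overall pipeline fast enough for real-time CPU inference.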
## Hand detection stage

Hand detection results on the OneHand10K validation set
| Config | Input Size | bbox mAP | bbox mAP 50 | bbox mAP 75 | ckpt | log |
|---|---|---|---|---|---|---|
| rtmdet_nano | 320x320 | 0.8100 | 0.9870 | 0.9190 | ckpt | log |
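The "bbox mAP 50" and "bbox mAP 75" columns report average precision at intersection-over-union (IoU) thresholds of 0.50 and 0.75: a predicted box counts as correct only if its IoU with a ground-truth box exceeds the threshold. A minimal IoU computation (for illustration; not part of the project code):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two boxes of equal size overlapping by half of their width have an IoU of 1/3, so such a prediction would pass neither the 0.50 nor the 0.75 threshold; stricter thresholds reward tighter localization, which is why mAP 75 is lower than mAP 50 in the table above.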
## Pose estimation stage

Pose estimation results on the COCO-WholeBody-Hand validation set
## Gesture recognition stage

Skeleton-based gesture recognition results on the Jester validation set
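A skeleton-based recognizer typically normalizes the keypoints before classification so the model is invariant to where the hand appears in the frame and how far it is from the camera. The sketch below shows one common normalization scheme (centering on the wrist and scaling to unit size); the project's actual preprocessing may differ, and the wrist-at-index-0 convention is an assumption.

```python
def normalize_skeleton(keypoints):
    """Center a hand skeleton on its wrist and scale it to unit size.

    keypoints: list of (x, y) pairs in pixel coordinates; index 0 is
    assumed (hypothetically) to be the wrist keypoint.
    """
    cx, cy = keypoints[0]
    # Translate so the wrist sits at the origin.
    centered = [(x - cx, y - cy) for x, y in keypoints]
    # Scale by the largest coordinate magnitude so values lie in [-1, 1].
    scale = max(max(abs(x), abs(y)) for x, y in centered) or 1.0
    return [(x / scale, y / scale) for x, y in centered]
```

After this step, the same gesture produces nearly identical input whether the hand fills the frame or occupies a small corner of it, which simplifies the recognition model's job.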