RTMPose-Hand — LiteRT (on-device 21-keypoint hand pose, fully-GPU)

RTMPose hand pose estimation (mmpose; CSPNeXt backbone with an RTMCC/SimCC head), converted to LiteRT and running entirely on the CompiledModel GPU backend (ML Drift) on Android. It predicts the 21 standard hand keypoints (wrist + 4 joints × 5 fingers) for a single centered hand.

On-device (Pixel 8a, Tensor G3 — verified)


Metric	Value
nodes on GPU	333 / 333 LITERT_CL (full residency)
inference	~4 ms (256×256)
size	28 MB (fp16)
accuracy	device-vs-PyTorch SimCC corr 0.999, 21/21 keypoints

image[1,3,256,256] (ImageNet 0-255) →[GPU: CSPNeXt + RTMCC]→ simcc_x[1,21,512], simcc_y[1,21,512]

How it converts (litert-torch)

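The conversion uses the same recipe as the rest of the RTMPose family and needs two graph rewrites. Both are numerically exact, and there is no PixelShuffle to handle because this model has no neck:

ScaleNorm (RMS) → SafeRMSNorm: scales x down by S=64 before squaring, which fixes the fp16 overflow that otherwise zeroes the entire head output.
GAU activation-by-activation BMM (act@act) → broadcast-multiply + reduce-sum.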
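
The two rewrites listed above can be illustrated with a minimal NumPy sketch. This is not the converter's actual code: the function names are hypothetical, the norm is written as x / rms(x) · g (the usual ScaleNorm form, assumed here), and only S=64 comes from the text. It shows why squaring x / S keeps fp16 intermediates finite, and why a batched matmul equals a broadcast multiply followed by a sum over the shared axis, with the intermediate staying at 4D.

```python
import numpy as np

S = 64.0  # down-scale factor named in the text


def scale_norm_naive(x, g=1.0):
    """x / rms(x) * g, squaring x directly in its own dtype."""
    with np.errstate(over="ignore", invalid="ignore"):
        ms = np.mean(x * x, axis=-1, keepdims=True)  # fp16: overflows above ~255
        return x / np.sqrt(ms) * g


def scale_norm_safe(x, g=1.0, s=S):
    """Same value, since rms(x / s) == rms(x) / s, but squares x / s instead of x."""
    xs = x / x.dtype.type(s)
    ms = np.mean(xs * xs, axis=-1, keepdims=True)
    return xs / np.sqrt(ms) * g


def bmm_as_mul_sum(a, b):
    """Batched matmul without a BMM op.

    a: [B, I, K], b: [B, K, J] -> [B, I, J]
    The broadcast product has shape [B, I, K, J] (4D); summing over K gives a @ b.
    """
    return (a[:, :, :, None] * b[:, None, :, :]).sum(axis=2)
```

With an fp16 input such as `np.array([[300, -400, 500]], dtype=np.float16)`, the naive version squares to infinity and returns all zeros, while the safe version stays close to the float64 reference. The cost of the matmul rewrite is the larger [B, I, K, J] intermediate.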

Result: no banned ops, all tensors ≤4D, tflite-vs-torch correlation 1.0, device-vs-torch correlation 0.999.
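
The card does not say how "corr" is computed. One plausible reading is a Pearson correlation between the flattened SimCC outputs of two runtimes; the sketch below assumes that, and `pearson_corr` is a hypothetical helper rather than part of any tool mentioned here.

```python
import math


def pearson_corr(a, b):
    """Pearson correlation of two equal-length sequences of floats."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_b)
```

In use, the two sequences would be the same output tensor (for example simcc_x) flattened to a list, taken once from the PyTorch model and once from the converted model on the same input.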

Preprocessing

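Center-crop the image to a square, resize it to 256×256, normalize with the ImageNet mean/std on the 0-255 scale, and lay it out as NCHW. The model is top-down and expects one centered hand. To decode, take the argmax of each SimCC vector and divide by the split ratio (2) to get the pixel coordinate in the 256×256 input.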
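
The preprocessing and decoding steps can be sketched in NumPy as follows. Several details are assumptions not stated in this card: RGB channel order, the mean/std constants (the ImageNet values commonly used in mmpose configs), and nearest-neighbour resizing, which is used only to keep the example dependency-free. All function names are hypothetical.

```python
import numpy as np

# Assumed: ImageNet statistics on the 0-255 scale, RGB order.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)
INPUT_SIZE = 256   # model input is 256x256
SIMCC_SPLIT = 2.0  # 512 SimCC bins per axis / 2 = 256 pixels


def center_crop_square(img):
    """Crop the largest centered square from an HxWx3 image; return (crop, (x0, y0))."""
    h, w = img.shape[:2]
    side = min(h, w)
    y0, x0 = (h - side) // 2, (w - side) // 2
    return img[y0:y0 + side, x0:x0 + side], (x0, y0)


def resize_nearest(img, size=INPUT_SIZE):
    """Nearest-neighbour resize of a square image (assumption; an app may use bilinear)."""
    idx = np.arange(size) * img.shape[0] // size
    return img[idx][:, idx]


def preprocess(img_rgb_u8):
    """HxWx3 uint8 image -> ([1, 3, 256, 256] float32 tensor, crop origin, crop side)."""
    crop, origin = center_crop_square(img_rgb_u8)
    x = resize_nearest(crop).astype(np.float32)
    x = (x - MEAN) / STD
    return x.transpose(2, 0, 1)[None], origin, crop.shape[0]  # HWC -> NCHW


def decode_simcc(simcc_x, simcc_y, origin=(0, 0), side=INPUT_SIZE):
    """simcc_x, simcc_y: [1, 21, 512] -> [21, 2] keypoints in source-image pixels."""
    px = simcc_x[0].argmax(axis=-1) / SIMCC_SPLIT  # x in the 256x256 input
    py = simcc_y[0].argmax(axis=-1) / SIMCC_SPLIT
    scale = side / INPUT_SIZE  # undo the resize, then undo the crop offset
    return np.stack([px * scale + origin[0], py * scale + origin[1]], axis=-1)
```

Passing the `origin` and `side` returned by `preprocess` into `decode_simcc` maps the keypoints back to the original frame; leaving them at their defaults returns coordinates in the 256×256 model input.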

License

Apache-2.0. Upstream: open-mmlab/mmpose RTMPose-Hand.
