lightweight-OpenPose — LiteRT (TFLite) GPU, FP16

On-device LiteRT (.tflite) conversion of lightweight-OpenPose for human pose estimation. The model is a MobileNet-based heatmap network; it outputs keypoint heatmaps only and the keypoint decode (argmax) is done in app code.

The model runs fully on the LiteRT CompiledModel GPU accelerator (ML Drift): every op is GPU-native, no CPU fallback. Converted with litert-torch with no patches.

Why heatmaps-only: MoveNet's official .tflite bakes the keypoint decode into the graph (GATHER_ND), which the GPU delegate can't run — so it only partially offloads to the GPU. Keeping the graph pure-conv and decoding in app code keeps it 100% on the GPU.

Files

| File | Precision | Size |
| --- | --- | --- |
| `pose_256_fp16.tflite` | fp16 weights | ~8.3 MB |
| `pose_256.tflite` | fp32 | ~16.4 MB |

I/O

Input: [1, 256, 256, 3] float32, NHWC, RGB, normalized (px - 128) / 256.
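Output: [1, 32, 32, 19] float32, NHWC, keypoint heatmaps (18 body keypoints + background). For each of the 18 keypoint channels, take the argmax over the 32 x 32 grid and divide the resulting (column, row) cell indices by 32 to get normalized keypoint locations, then connect the keypoints into a skeleton. A plain argmax yields one peak per channel, so this decode handles a single person.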
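
The input normalization above can be sketched in plain Kotlin as follows. This is a minimal sketch, not the sample app's code: the function name `normalizeArgb` and the packed-ARGB `IntArray` input (the layout Android's `Bitmap.getPixels` produces) are assumptions made for illustration, and the image is assumed to be already resized to 256 x 256.

```kotlin
/** Side length of the square model input, from the I/O description. */
const val INPUT_SIZE = 256

/**
 * Converts packed ARGB pixels (0xAARRGGBB) into a flat NHWC float array
 * in RGB order, normalized as (px - 128) / 256.
 * Assumes the image has already been resized to size x size.
 */
fun normalizeArgb(pixels: IntArray, size: Int = INPUT_SIZE): FloatArray {
    require(pixels.size == size * size) {
        "expected ${size * size} pixels, got ${pixels.size}"
    }
    val out = FloatArray(pixels.size * 3)
    var o = 0
    for (p in pixels) {
        out[o++] = (((p shr 16) and 0xFF) - 128) / 256f  // R
        out[o++] = (((p shr 8) and 0xFF) - 128) / 256f   // G
        out[o++] = ((p and 0xFF) - 128) / 256f           // B
    }
    return out
}
```

The returned array has 256 * 256 * 3 elements and can be passed as `rgbNormalized` to the input buffer.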

Keypoint order (18): nose, neck, r-shoulder, r-elbow, r-wrist, l-shoulder, l-elbow, l-wrist, r-hip, r-knee, r-ankle, l-hip, l-knee, l-ankle, r-eye, l-eye, r-ear, l-ear.
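
The argmax decode can be sketched as below. This is an illustrative single-person decode, not the sample app's implementation: `Keypoint`, `KEYPOINT_NAMES`, and `decodeKeypoints` are hypothetical names, and the code assumes that channels 0 to 17 follow the keypoint order above with the background as the last channel. It uses the center of the winning grid cell, so coordinates fall inside [0, 1].

```kotlin
/** One decoded keypoint: normalized coordinates in [0, 1] plus the heatmap peak value. */
data class Keypoint(val name: String, val x: Float, val y: Float, val score: Float)

val KEYPOINT_NAMES = listOf(
    "nose", "neck", "r-shoulder", "r-elbow", "r-wrist", "l-shoulder", "l-elbow", "l-wrist",
    "r-hip", "r-knee", "r-ankle", "l-hip", "l-knee", "l-ankle", "r-eye", "l-eye", "r-ear", "l-ear"
)

/**
 * Per-channel argmax over a flat NHWC heatmap tensor.
 * Assumption: channels 0..17 follow KEYPOINT_NAMES; the last channel is background.
 */
fun decodeKeypoints(
    heatmaps: FloatArray,
    gridH: Int = 32,
    gridW: Int = 32,
    channels: Int = 19
): List<Keypoint> {
    require(heatmaps.size == gridH * gridW * channels) {
        "unexpected heatmap size ${heatmaps.size}"
    }
    return KEYPOINT_NAMES.mapIndexed { c, name ->
        var bestIdx = 0
        var bestVal = Float.NEGATIVE_INFINITY
        for (cell in 0 until gridH * gridW) {
            // NHWC layout: the channel is the fastest-varying index.
            val v = heatmaps[cell * channels + c]
            if (v > bestVal) { bestVal = v; bestIdx = cell }
        }
        // Cell center, divided by the grid size, gives a normalized location.
        val x = (bestIdx % gridW + 0.5f) / gridW
        val y = (bestIdx / gridW + 0.5f) / gridH
        Keypoint(name, x, y, bestVal)
    }
}
```

The `score` field is the raw peak value. An app would typically hide keypoints whose score falls below a threshold it tunes itself.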

Ops

CONV_2D x41, DEPTHWISE_CONV_2D x14, TRANSPOSE x14, EXP x6, SUB x6,
GREATER_EQUAL x6, SELECT x6, ADD x6, PAD x3, CONCATENATION x1

(The ELU activations lower to EXP/SUB/GREATER_EQUAL/SELECT, all GPU-supported.) No GATHER_ND, no Flex/Custom.
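
To illustrate the lowering, here is ELU written as the four primitive ops, one per line. It is a sketch of the math only, assuming alpha = 1 (the PyTorch default). The exact operand order and constants in the converted graph are not stated here, and `eluLowered` is a hypothetical name.

```kotlin
import kotlin.math.exp

/** ELU with alpha = 1, expressed as EXP, SUB, GREATER_EQUAL, and SELECT. */
fun eluLowered(x: FloatArray): FloatArray {
    val expX = FloatArray(x.size) { exp(x[it]) }              // EXP
    val negative = FloatArray(x.size) { expX[it] - 1f }       // SUB
    val isNonNegative = BooleanArray(x.size) { x[it] >= 0f }  // GREATER_EQUAL
    // SELECT: keep x where the mask is true, otherwise exp(x) - 1.
    return FloatArray(x.size) { if (isNonNegative[it]) x[it] else negative[it] }
}
```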

On-device (Pixel 8a, verified)

The fp16 model compiles to 158 / 158 nodes on the LiteRT GPU delegate (LITERT_CL) — full GPU residency, no CPU fallback.

Usage (Android, LiteRT CompiledModel)

val model = CompiledModel.create(
    context.assets, "pose_256_fp16.tflite",
    CompiledModel.Options(Accelerator.GPU), null
)
val inputs = model.createInputBuffers()
val outputs = model.createOutputBuffers()
inputs[0].writeFloat(rgbNormalized)    // [1,256,256,3], (px-128)/256
model.run(inputs, outputs)
val heatmaps = outputs[0].readFloat()  // [1,32,32,19] -> argmax per keypoint in app code
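
Once keypoints are decoded, connecting them into a skeleton could look like the sketch below. The limb list is an assumption based on the usual OpenPose-style body layout and the keypoint order given earlier; it is not taken from the model files. `Segment`, `SKELETON_EDGES`, `skeletonSegments`, and the `minScore` default are hypothetical. The mapping also assumes the image was stretched to the view without letterboxing.

```kotlin
/** A line segment in pixel coordinates, ready for an overlay to draw. */
data class Segment(val x0: Float, val y0: Float, val x1: Float, val y1: Float)

// Assumed limb list: pairs of keypoint indices in the 18-keypoint order.
val SKELETON_EDGES = listOf(
    1 to 2, 2 to 3, 3 to 4,        // neck -> right arm
    1 to 5, 5 to 6, 6 to 7,        // neck -> left arm
    1 to 8, 8 to 9, 9 to 10,       // neck -> right leg
    1 to 11, 11 to 12, 12 to 13,   // neck -> left leg
    1 to 0, 0 to 14, 14 to 16, 0 to 15, 15 to 17  // neck -> head
)

/**
 * Builds drawable segments from normalized keypoints.
 * xs/ys are normalized [0, 1] coordinates and scores are heatmap peaks,
 * all indexed by keypoint. An edge is skipped when either endpoint
 * scores below minScore (a threshold to tune per app).
 */
fun skeletonSegments(
    xs: FloatArray, ys: FloatArray, scores: FloatArray,
    viewWidth: Int, viewHeight: Int, minScore: Float = 0.1f
): List<Segment> =
    SKELETON_EDGES
        .filter { (a, b) -> scores[a] >= minScore && scores[b] >= minScore }
        .map { (a, b) ->
            Segment(xs[a] * viewWidth, ys[a] * viewHeight, xs[b] * viewWidth, ys[b] * viewHeight)
        }
```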

A complete Android sample (camera + gallery, skeleton overlay) is available in google-ai-edge/litert-samples.

License & attribution

License: Apache-2.0. Weights/model from Daniil-Osokin/lightweight-human-pose-estimation.pytorch. Based on "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose" (Osokin, 2018). Format conversion only; all credit to the original authors.
