RTMPose-Face (WFLW) - LiteRT (on-device 98-point face alignment, fully-GPU)
RTMPose (mmpose) face alignment, trained on WFLW, converted to
LiteRT and running fully on the CompiledModel GPU (ML Drift) on Android. It predicts 98 dense facial landmarks
(contour, eyebrows, eyes, nose, mouth, pupils), the dense complement to a 5-point face detector.
On-device (Pixel 8a, Tensor G3, verified)
| metric | value |
|---|---|
| nodes on GPU | 333 / 333 LITERT_CL (full residency) |
| inference | ~4 ms (256×256) |
| size | 33.6 MB (fp16) |
| accuracy | device-vs-PyTorch SimCC corr 0.9995, 98 landmarks |
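The accuracy row reports a correlation between device and PyTorch SimCC outputs, but does not say how it is computed. A minimal sketch of one plausible way to obtain such a figure, assuming it is the Pearson correlation over the flattened output tensors (the function name `output_correlation` is hypothetical):

```python
import numpy as np

def output_correlation(reference, candidate):
    """Pearson correlation between two output tensors, compared element-wise.

    Assumption: the tensors are flattened before comparison; the model card
    does not define its "corr" metric.
    """
    a = np.asarray(reference, np.float64).ravel()
    b = np.asarray(candidate, np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("outputs must have the same number of elements")
    # np.corrcoef returns the 2x2 correlation matrix; take the off-diagonal term.
    return float(np.corrcoef(a, b)[0, 1])
```

In practice `reference` would be the PyTorch `simcc_x` (or `simcc_y`) for a test image and `candidate` the matching tensor read back from the device.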
face[1,3,256,256] (mmpose mean/std) → [GPU: RTMPose-m] → simcc_x[1,98,512], simcc_y[1,98,512]
output[0] = simcc_x, output[1] = simcc_y; each landmark coordinate = argmax over its 1D SimCC vector, divided by 2 (bins = pixels × 2).
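The decoding rule above can be sketched in NumPy. This is an illustration under stated assumptions, not the author's code: the helper name `decode_simcc` is hypothetical, the function accepts either shaped arrays or flat buffers (as a device read-back would typically give), and the per-landmark score (the smaller of the two peak heights) is an assumption rather than something the model card specifies.

```python
import numpy as np

NUM_LANDMARKS = 98   # WFLW landmark count
NUM_BINS = 512       # SimCC bins per axis (256 px x 2)
BINS_PER_PIXEL = 2.0

def decode_simcc(simcc_x, simcc_y):
    """Return (98, 2) pixel coordinates in the 256x256 crop and (98,) scores."""
    # Reshape so flat [98*512] buffers and [1,98,512] arrays are both accepted.
    sx = np.asarray(simcc_x, np.float32).reshape(NUM_LANDMARKS, NUM_BINS)
    sy = np.asarray(simcc_y, np.float32).reshape(NUM_LANDMARKS, NUM_BINS)
    # Each landmark's coordinate is the index of its strongest bin, in pixels.
    xs = sx.argmax(axis=-1) / BINS_PER_PIXEL
    ys = sy.argmax(axis=-1) / BINS_PER_PIXEL
    # Assumed confidence: the weaker of the x and y peak values.
    scores = np.minimum(sx.max(axis=-1), sy.max(axis=-1))
    return np.stack([xs, ys], axis=-1), scores
```

Because the bins are twice as dense as the pixels, the decoded coordinates have half-pixel resolution.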
Minimal usage
Android (Kotlin, CompiledModel GPU)
val model = CompiledModel.create(context.assets, "rtm_face_fp16.tflite",
    CompiledModel.Options(Accelerator.GPU), null)
val inputs = model.createInputBuffers()
val outputs = model.createOutputBuffers()
inputs[0].writeFloat(chw) // [1,3,256,256] mmpose mean/std (0-255 RGB), NCHW
model.run(inputs, outputs)
val simccX = outputs[0].readFloat() // [1,98,512]
val simccY = outputs[1].readFloat() // [1,98,512]; keypoint = argmax / 2
Python (desktop verification)
import numpy as np
from PIL import Image
from ai_edge_litert.interpreter import Interpreter
# mmpose normalization constants (RGB, 0-255 scale)
MEAN = np.array([123.675, 116.28, 103.53], np.float32)
STD = np.array([58.395, 57.12, 57.375], np.float32)
img = Image.open("face.jpg").convert("RGB").resize((256, 256))  # centered subject crop
x = ((np.asarray(img, np.float32) - MEAN) / STD).transpose(2, 0, 1)[None]  # [1,3,256,256], NCHW
it = Interpreter(model_path="rtm_face_fp16.tflite")
it.allocate_tensors()
it.set_tensor(it.get_input_details()[0]["index"], x)
it.invoke()
od = it.get_output_details()  # output 0 = simcc_x, 1 = simcc_y
sx = it.get_tensor(od[0]["index"])[0]  # simcc_x [98,512]
sy = it.get_tensor(od[1]["index"])[0]  # simcc_y [98,512]
kx, ky = sx.argmax(-1) / 2.0, sy.argmax(-1) / 2.0  # 98 keypoints, px in 256x256
for i, (a, b) in enumerate(zip(kx, ky)):
    print(f"kp{i}: ({a:.1f}, {b:.1f})")
How it converts (litert-torch): the RTMPose recipe, unchanged
Same model family as the human-pose RTMPose; only the config and checkpoint change to WFLW. The two on-device-only
Mali fixes transfer without modification: ScaleNorm → SafeRMSNorm and GAU act@act BMM →
broadcast-reduce. Banned ops: none; all tensors ≤4D; tflite-vs-torch corr 1.0; device-vs-torch corr 0.9995.
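To illustrate what replacing a batched matmul with a broadcast-reduce means, here is a minimal NumPy sketch. It is not the actual litert-torch rewrite, which the source does not show; the function name `bmm_broadcast_reduce` and the `[B,N,D] @ [B,D,M]` shapes are assumptions chosen so the intermediate tensor stays 4D.

```python
import numpy as np

def bmm_broadcast_reduce(a, b):
    """Compute a @ b for a:[B,N,D] and b:[B,D,M] using only multiply and sum."""
    # [B,N,D,1] * [B,1,D,M] broadcasts to [B,N,D,M];
    # summing over the shared D axis leaves [B,N,M].
    return (a[:, :, :, None] * b[:, None, :, :]).sum(axis=2)
```

The trade-off is memory: the intermediate holds B·N·D·M elements, which is acceptable only when the attention dimensions are small.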
Preprocessing
Crop to a centered face, resize to 256×256, normalize with the mmpose mean/std (RGB, 0-255 scale), and lay out as NCHW.
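A minimal sketch of the normalization step, plus the inverse mapping from crop coordinates back to the source image. Both helper names (`normalize_crop`, `to_image_coords`) and the `(left, top, right, bottom)` box convention are assumptions for illustration; the crop is expected to be already resized to 256×256.

```python
import numpy as np

MEAN = np.array([123.675, 116.28, 103.53], np.float32)  # mmpose mean, RGB
STD = np.array([58.395, 57.12, 57.375], np.float32)     # mmpose std, RGB
INPUT_SIZE = 256

def normalize_crop(rgb_crop):
    """HWC RGB crop (256x256, values 0-255) -> normalized [1,3,256,256] float32."""
    x = (np.asarray(rgb_crop, np.float32) - MEAN) / STD
    # HWC -> CHW, then add the batch axis; make it contiguous for the runtime.
    return np.ascontiguousarray(x.transpose(2, 0, 1)[None])

def to_image_coords(keypoints, box):
    """Map (N, 2) keypoints in crop pixels back to the original image.

    `box` is the (left, top, right, bottom) region that was cropped and resized.
    """
    left, top, right, bottom = box
    scale = np.array([(right - left) / INPUT_SIZE,
                      (bottom - top) / INPUT_SIZE], np.float32)
    offset = np.array([left, top], np.float32)
    return np.asarray(keypoints, np.float32) * scale + offset
```

For example, with a 512×512 crop box starting at (100, 50), the crop center (128, 128) maps back to (356, 306) in the original image.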
License
Apache-2.0. Upstream: open-mmlab/mmpose; dataset WFLW.
