YOLOX-Tiny - LiteRT (CompiledModel GPU)
Megvii YOLOX-Tiny (COCO, Apache-2.0) re-authored to a GPU-native LiteRT .tflite via the
official litert_torch path (no onnx2tf). FP16, 10.4 MB, input 416×416.
Verified on a Pixel 8a: the whole graph runs on the GPU delegate (full LITERT_CL residency, zero CPU fallback) and the GPU output matches the CPU/PyTorch reference (correlation ≥ 0.999).
Why this is GPU-clean
YOLOX is a pure CNN, but its Focus stem (stride-2 space-to-depth slicing) lowers to
GATHER_ND, which the GPU delegate rejects. Here the Focus and its following 3×3 conv are folded
into a single, numerically-exact 6×6 stride-2 conv, so the graph has zero GATHER/GATHER_ND/
TopK/Cast ops and no >4D tensors. Activations (SiLU) lower to LOGISTIC+MUL.
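
To see why such a fold can be exact, here is a minimal pure-Python sketch (not the conversion script itself). It assumes the standard YOLOX Focus patch order (top-left, bottom-left, top-right, bottom-right) and a zero-padded 3×3 stride-1 conv with padding 1; BatchNorm and the activation are left out, and the helper names `focus`, `conv2d`, and `fold_focus_into_conv` are hypothetical. Tap (ky, kx) of the 3×3 kernel on Focus patch (dy, dx) reads the input pixel at offset (2·ky+dy, 2·kx+dx), so the 36 combinations tile a 6×6 kernel applied with stride 2 and padding 2.

```python
import random

# (dy, dx) of each Focus patch, in the assumed YOLOX concat order:
# top-left, bottom-left, top-right, bottom-right.
OFFSETS = [(0, 0), (1, 0), (0, 1), (1, 1)]

def focus(x):
    """Space-to-depth by 2. x is [C][H][W]; returns [4C][H/2][W/2]."""
    out = []
    for dy, dx in OFFSETS:
        for ch in x:
            out.append([row[dx::2] for row in ch[dy::2]])
    return out

def conv2d(x, w, stride, pad):
    """Naive zero-padded convolution. x: [C][H][W], w: [O][C][K][K]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K = len(w[0][0])
    Ho = (H + 2 * pad - K) // stride + 1
    Wo = (W + 2 * pad - K) // stride + 1
    out = []
    for wo in w:
        plane = []
        for oy in range(Ho):
            row = []
            for ox in range(Wo):
                acc = 0.0
                for c in range(C):
                    for ky in range(K):
                        iy = oy * stride + ky - pad
                        if not 0 <= iy < H:
                            continue  # zero padding contributes nothing
                        for kx in range(K):
                            ix = ox * stride + kx - pad
                            if 0 <= ix < W:
                                acc += wo[c][ky][kx] * x[c][iy][ix]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

def fold_focus_into_conv(w3, in_ch):
    """Rearrange 3x3 weights over 4*in_ch Focus channels into 6x6 weights over in_ch."""
    w6 = [[[[0.0] * 6 for _ in range(6)] for _ in range(in_ch)] for _ in w3]
    for o, wo in enumerate(w3):
        for p, (dy, dx) in enumerate(OFFSETS):
            for c in range(in_ch):
                for ky in range(3):
                    for kx in range(3):
                        w6[o][c][2 * ky + dy][2 * kx + dx] = wo[p * in_ch + c][ky][kx]
    return w6

# Tiny random check: 3 input channels, 8x8 image, 2 output channels.
random.seed(0)
x = [[[random.uniform(-1, 1) for _ in range(8)] for _ in range(8)] for _ in range(3)]
w3 = [[[[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
       for _ in range(12)] for _ in range(2)]

ref = conv2d(focus(x), w3, stride=1, pad=1)                        # Focus + 3x3 conv
folded = conv2d(x, fold_focus_into_conv(w3, 3), stride=2, pad=2)   # single 6x6 conv
max_err = max(abs(a - b)
              for pa, pb in zip(ref, folded)
              for ra, rb in zip(pa, pb)
              for a, b in zip(ra, rb))
print("max abs difference:", max_err)
```

The two paths multiply the same weight and pixel pairs; only the summation order differs, so any difference is floating-point rounding.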
I/O
- Input `images` [1, 416, 416, 3]: NHWC, BGR, 0-255, no normalization (YOLOX letterbox: uniform-scale to fit, pad bottom/right with gray 114).
- Output [1, 3549, 85]: raw heads, anchor-major. `85 = 4 box (cx, cy, w, h, grid units) + 1 obj + 80 class`. obj/class are already sigmoid'd; boxes are not decoded.
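
The letterbox step described for the input can be sketched in plain Python as below. This is an illustrative assumption, not the sample app's code: the function name `letterbox` is hypothetical, the image is a nested list of BGR pixels, and nearest-neighbor sampling stands in for the bilinear resize a real pipeline would normally get from an image library.

```python
def letterbox(img, size=416, pad_value=114):
    """Scale img (H x W x 3 nested lists, BGR, 0-255) uniformly to fit a
    size x size canvas, anchored top-left, padding bottom/right with gray.

    Returns (canvas, ratio); ratio is needed later to map boxes back.
    """
    h, w = len(img), len(img[0])
    ratio = min(size / h, size / w)
    new_h, new_w = int(h * ratio), int(w * ratio)

    # Start from an all-gray canvas, then copy the resized image into the corner.
    canvas = [[[pad_value] * 3 for _ in range(size)] for _ in range(size)]
    for y in range(new_h):
        sy = min(h - 1, int(y / ratio))      # nearest-neighbor source row
        for x in range(new_w):
            sx = min(w - 1, int(x / ratio))  # nearest-neighbor source column
            canvas[y][x] = list(img[sy][sx])
    return canvas, ratio
```

The returned `ratio` is the value the decode step divides by to map boxes back to the original image.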
Host-side decode (kept out of the graph for GPU-cleanliness)
For anchor i at grid (gx,gy) with stride ∈ {8,16,32}:
cx=(raw_cx+gx)*stride, cy=(raw_cy+gy)*stride, w=exp(raw_w)*stride, h=exp(raw_h)*stride;
score = obj * max_class; then per-class NMS. Divide boxes by the letterbox ratio to map back.
Reference Kotlin + Python decode in the sample below.
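
Independent of that reference code, the formulas above can be sketched in plain Python as follows. Several things here are assumptions: anchors are taken to be ordered by stride 8, 16, 32 with row-major cells inside each level (gx varying fastest), which gives 52² + 26² + 13² = 3549 rows; the thresholds 0.25 and 0.45 are placeholder values; and the names `make_grids`, `decode`, `iou`, and `nms` are hypothetical.

```python
import math

def make_grids(size=416, strides=(8, 16, 32)):
    """Return one (gx, gy, stride) tuple per output row, in the assumed order."""
    grids = []
    for s in strides:
        n = size // s
        for gy in range(n):
            for gx in range(n):
                grids.append((gx, gy, s))
    return grids

def decode(raw, ratio, score_thr=0.25, size=416):
    """raw: 3549 rows of 85 floats. Returns (x1, y1, x2, y2, score, cls)
    tuples in original-image pixels."""
    dets = []
    for row, (gx, gy, s) in zip(raw, make_grids(size)):
        cls = max(range(80), key=lambda k: row[5 + k])
        score = row[4] * row[5 + cls]        # obj * max_class
        if score < score_thr:
            continue
        cx = (row[0] + gx) * s
        cy = (row[1] + gy) * s
        w = math.exp(row[2]) * s
        h = math.exp(row[3]) * s
        # Convert center/size to corners and undo the letterbox scaling.
        dets.append(((cx - w / 2) / ratio, (cy - h / 2) / ratio,
                     (cx + w / 2) / ratio, (cy + h / 2) / ratio, score, cls))
    return dets

def iou(a, b):
    """Intersection over union of two corner-format boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(dets, iou_thr=0.45):
    """Greedy per-class NMS: a box is only suppressed by a kept box of the same class."""
    keep = []
    for d in sorted(dets, key=lambda d: -d[4]):
        if all(d[5] != k[5] or iou(d, k) <= iou_thr for k in keep):
            keep.append(d)
    return keep
```

A typical call would be `nms(decode(output_rows, ratio))`, where `ratio` is the letterbox scale used during preprocessing.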
Performance
COCO val2017 AP 32.8 (FP32 reference). Real-time on Pixel 8a GPU.
Training data & PII
Trained by Megvii on COCO 2017 (train2017), a public academic object-detection dataset (Creative Commons). COCO images contain people as one of the 80 object categories; no names, identities, or other personal attributes are modeled or output. The model emits only class id + box. No additional or private data was used. Weights are the official Megvii release; only the op graph was re-authored for GPU (weights unchanged).
Sample app + conversion script
Android sample (CompiledModel GPU, Kotlin decode + NMS) and the litert_torch conversion script:
https://github.com/google-ai-edge/litert-samples (compiled_model_api/object_detection)