---
license: mit
library_name: acaua
pipeline_tag: keypoint-detection
tags:
- pose-estimation
- keypoint-detection
- vision
- acaua
- native-pytorch-port
datasets:
- coco
---

# UniFormer-S COCO top-down pose – acaua mirror (pure-PyTorch port)

Pure-PyTorch port of **UniFormer-S** (top-down COCO 17-keypoint pose at
256x192), hosted under `CondadosAI/` for use with the
[acaua](https://github.com/CondadosAI/acaua) computer vision library.

The architecture has been re-implemented in pure PyTorch under
`acaua.adapters.uniformer.pose` – no `mmcv`, no `mmengine`, no
`mmpose`, no `trust_remote_code`, no `timm` runtime dependency. The
backbone reuses `UniFormer2DDense` (already shipped via PR #9 for the
Stage 1.5 dense-prediction work); the pose head is a fresh port of
mmpose's `TopDownSimpleHead`. Decoding follows the upstream
`post_process='default'` path (argmax plus a 0.25-pixel shift, not the
DARK unbiased decoder). Inverse warping uses the shared
`acaua.pose.topdown_utils` module (introduced in PR #8 ahead of this
stage).

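The default decode path can be sketched as follows. This is a minimal NumPy illustration of argmax-plus-quarter-pixel-shift decoding, not acaua's actual code; the function name and `(K, H, W)` array layout are assumptions:

```python
import numpy as np

def decode_heatmaps(heatmaps: np.ndarray):
    """Decode (K, H, W) heatmaps to (K, 2) xy coords and (K,) scores.

    Sketch of the upstream 'default' post-process: per-keypoint argmax,
    then a 0.25-pixel shift toward the higher-valued neighbour. No DARK
    refinement and no inverse warp back to image space.
    """
    K, H, W = heatmaps.shape
    coords = np.zeros((K, 2), dtype=np.float32)
    scores = np.zeros(K, dtype=np.float32)
    for k in range(K):
        hm = heatmaps[k]
        y, x = divmod(int(np.argmax(hm)), W)  # peak location
        scores[k] = hm[y, x]
        px, py = float(x), float(y)
        # quarter-pixel shift toward the larger neighbouring value
        if 0 < x < W - 1:
            px += 0.25 * np.sign(hm[y, x + 1] - hm[y, x - 1])
        if 0 < y < H - 1:
            py += 0.25 * np.sign(hm[y + 1, x] - hm[y - 1, x])
        coords[k] = (px, py)
    return coords, scores
```

The shift halves the worst-case quantisation error of a bare argmax without the distribution-aware modulation that DARK adds.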
## Provenance

| | |
|---|---|
| Upstream code | [`Sense-X/UniFormer`](https://github.com/Sense-X/UniFormer) @ `main` (Apache-2.0); files derived: `pose_estimation/mmpose/models/backbones/uniformer.py` (backbone, identical to the detection variant up to module-class identity) + `pose_estimation/mmpose/models/keypoint_heads/top_down_simple_head.py` (head) |
| Upstream weights | Google Drive file id `162R0JuTpf3gpLe1IK6oxRoQK7JSj4ylx`, filename `top_down_256x192_global_small.pth` (101 MB) |
| Upstream SHA256 | `d77059e3e9322c0e20dc89dc0cf2a583ffe2ced7d3e9b350233738add570bc30` |
| Upstream report | AP 74.0 / AP@50 90.3 / AP@75 82.2 on COCO val2017, 256x192 input, single-scale |
| Architecture | UniFormer-S backbone (`hybrid=False, windows=False`, depth=[3,4,8,3], embed_dims=[64,128,320,512], head_dim=64) + TopDownSimpleHead (3x ConvTranspose2d stride-2 + BN+ReLU upsampling stages, then a 1x1 conv to 17 channels) |
| Total params | 25.23M (backbone 21.04M + head 4.19M) |
| Mirrored on | 2026-04-25 |
| Mirrored by | [CondadosAI/acaua](https://github.com/CondadosAI/acaua) |

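The head shape in the table above can be sketched in plain PyTorch. This is an illustrative reconstruction, not the ported module: the deconv kernel size of 4 and the 256 intermediate channels are assumed SimpleBaseline-style defaults, not confirmed from the port.

```python
import torch
import torch.nn as nn

class TopDownSimpleHeadSketch(nn.Module):
    """Illustrative pose head: three stride-2 deconv + BN + ReLU stages,
    then a 1x1 conv producing 17 keypoint heatmap channels."""

    def __init__(self, in_channels=512, deconv_channels=256, num_keypoints=17):
        super().__init__()
        layers, c = [], in_channels
        for _ in range(3):  # three 2x upsampling stages => 8x total
            layers += [
                nn.ConvTranspose2d(c, deconv_channels, kernel_size=4,
                                   stride=2, padding=1, bias=False),
                nn.BatchNorm2d(deconv_channels),
                nn.ReLU(inplace=True),
            ]
            c = deconv_channels
        self.deconv_layers = nn.Sequential(*layers)
        self.final_layer = nn.Conv2d(deconv_channels, num_keypoints, kernel_size=1)

    def forward(self, x):
        return self.final_layer(self.deconv_layers(x))

# A 256x192 crop yields an 8x6 stride-32 feature map from the backbone's
# last stage; the head emits 64x48 heatmaps, one per COCO keypoint.
heatmaps = TopDownSimpleHeadSketch()(torch.randn(1, 512, 8, 6))
print(heatmaps.shape)  # torch.Size([1, 17, 64, 48])
```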
## Usage via acaua

```python
import acaua

# MIT-declared weights -> explicit opt-in (same posture as the RTMPose
# and UniFormer image/video classification adapters). The bundled
# RTMDet-tiny detector is loaded automatically from
# CondadosAI/rtmdet_t_coco.
model = acaua.Model.from_pretrained(
    "CondadosAI/uniformer_s_coco_pose", allow_non_apache=True
)

result = model.predict("image.jpg")
print(result.keypoints.shape)        # (N_persons, 17, 2)
print(result.keypoint_scores.shape)  # (N_persons, 17)

# COCO skeleton edges are surfaced on the adapter:
import supervision as sv
sv.EdgeAnnotator(edges=model.skeleton).annotate(scene, result.to_supervision())
```

## Files in this mirror

- `model.safetensors` – full pose model weights (backbone + head, 352
  tensors), loaded with `load_state_dict(strict=True)` at adapter init
  time.
- `config.json` – `acaua_task=pose`, COCO-17 `keypoint_names` +
  `skeleton`, `detector_repo_id=CondadosAI/rtmdet_t_coco`. The adapter
  surfaces these as `model.keypoint_names` / `model.skeleton`.
- `NOTICE` – attribution chain (code and weights).
- `LICENSE` – Apache-2.0.

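The `strict=True` posture means a truncated or mismatched mirror fails loudly at adapter init rather than silently at inference. A small self-contained illustration of that behavior using plain `torch` (a stand-in module, not acaua's loading code):

```python
import torch.nn as nn

# Stand-in module; the real adapter builds the full backbone + head.
model = nn.Linear(2, 3)

# An exact state dict loads cleanly under strict=True ...
model.load_state_dict(model.state_dict(), strict=True)

# ... while any missing or unexpected key raises immediately.
bad = dict(model.state_dict())
bad["unexpected.weight"] = bad["weight"].clone()
try:
    model.load_state_dict(bad, strict=True)
except RuntimeError:
    print("strict load rejected the mismatched state dict")
```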
## License and attribution

The adapter code is redistributed under Apache-2.0. The underlying
weights carry upstream's MIT declaration (compatible). The acaua
UniFormer-pose adapter is a derivative work of the upstream PyTorch
implementation – see [`NOTICE`](./NOTICE) for the attribution chain.

## Citation

```bibtex
@inproceedings{li2022uniformer,
  title     = {UniFormer: Unifying Convolution and Self-attention for Visual Recognition},
  author    = {Li, Kunchang and Wang, Yali and Zhang, Junhao and Gao, Peng and Song, Guanglu and Liu, Yu and Li, Hongsheng and Qiao, Yu},
  booktitle = {ICLR},
  year      = {2022},
}
```