docs: acaua mirror model card with code+weights provenance
README.md
ADDED
@@ -0,0 +1,93 @@
---
license: mit
library_name: acaua
pipeline_tag: keypoint-detection
tags:
- pose-estimation
- keypoint-detection
- vision
- acaua
- native-pytorch-port
datasets:
- coco
---

# UniFormer-S COCO top-down pose: acaua mirror (pure-PyTorch port)

Pure-PyTorch port of **UniFormer-S** (top-down COCO 17-keypoint pose at
256x192), hosted under `CondadosAI/` for use with the
[acaua](https://github.com/CondadosAI/acaua) computer vision library.

The architecture has been re-implemented in pure PyTorch under
`acaua.adapters.uniformer.pose`, with no `mmcv`, `mmengine`, `mmpose`,
or `timm` runtime dependency and no `trust_remote_code`. The
backbone reuses `UniFormer2DDense` (already shipped via PR #9 for the
Stage 1.5 dense-prediction work); the pose head is a fresh port of
mmpose's `TopDownSimpleHead`. Decoding follows the upstream
`post_process='default'` path (argmax plus a 0.25-pixel shift, NOT the
DARK unbiased decoder). The inverse warp uses the shared
`acaua.pose.topdown_utils` module (introduced in PR #8 ahead of this
stage).
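
The "argmax plus 0.25-pixel shift" decode mentioned above can be sketched as follows. This is a minimal, dependency-free illustration, not the acaua implementation: the function name `decode_heatmap` is hypothetical, the interior-only guard is an assumption about how border peaks are handled, and the result stays in heatmap pixel units (mapping back to image coordinates is the job of the inverse warp).

```python
from typing import Sequence, Tuple


def _sign(value: float) -> int:
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def decode_heatmap(heatmap: Sequence[Sequence[float]]) -> Tuple[float, float, float]:
    """Decode one keypoint heatmap into (x, y, score) in heatmap pixels.

    Hypothetical helper for illustration only.
    """
    height, width = len(heatmap), len(heatmap[0])

    # Step 1: argmax. Keep the first cell that holds the maximum value.
    best_x, best_y, best = 0, 0, heatmap[0][0]
    for y in range(height):
        for x in range(width):
            if heatmap[y][x] > best:
                best_x, best_y, best = x, y, heatmap[y][x]

    # Step 2: shift a quarter pixel toward the larger neighbour on each axis.
    # Assumption: peaks on or next to the border are left unshifted.
    x, y = float(best_x), float(best_y)
    if 1 < best_x < width - 1 and 1 < best_y < height - 1:
        dx = heatmap[best_y][best_x + 1] - heatmap[best_y][best_x - 1]
        dy = heatmap[best_y + 1][best_x] - heatmap[best_y - 1][best_x]
        x += 0.25 * _sign(dx)
        y += 0.25 * _sign(dy)
    return x, y, best


# Toy heatmap (64 rows x 48 columns, a size assumed here for illustration)
# with a peak at x=10, y=20 whose right and upper neighbours are stronger.
toy = [[0.0] * 48 for _ in range(64)]
toy[20][10] = 1.0
toy[20][11], toy[20][9] = 0.6, 0.2   # right > left  -> x shifts by +0.25
toy[19][10], toy[21][10] = 0.5, 0.1  # above > below -> y shifts by -0.25
print(decode_heatmap(toy))  # (10.25, 19.75, 1.0)
```

The shift uses only the sign of the neighbour difference, so the sub-pixel correction is always exactly a quarter pixel; the DARK decoder named above instead refines the location from the heatmap's local curvature.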

## Provenance

| Field | Value |
|---|---|
| Upstream code | [`Sense-X/UniFormer`](https://github.com/Sense-X/UniFormer) @ `main` (Apache-2.0); files derived: `pose_estimation/mmpose/models/backbones/uniformer.py` (backbone, identical to detection variant up to module-class identity) + `pose_estimation/mmpose/models/keypoint_heads/top_down_simple_head.py` (head) |
| Upstream weights | Google Drive file id `162R0JuTpf3gpLe1IK6oxRoQK7JSj4ylx`, filename `top_down_256x192_global_small.pth` (101 MB) |
| Upstream SHA256 | `d77059e3e9322c0e20dc89dc0cf2a583ffe2ced7d3e9b350233738add570bc30` |
| Upstream report | AP 74.0 / AP@50 90.3 / AP@75 82.2 on COCO val 2017, 256x192, single-scale |
| Architecture | UniFormer-S backbone (`hybrid=False, windows=False`, depth=[3,4,8,3], embed_dims=[64,128,320,512], head_dim=64) + TopDownSimpleHead (3x ConvTranspose2d-stride-2 + BN+ReLU upsample, 1x1 conv to 17 channels) |
| Total params | 25.23M (backbone 21.04M + head 4.19M) |
| Mirrored on | 2026-04-25 |
| Mirrored by | [CondadosAI/acaua](https://github.com/CondadosAI/acaua) |
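
To check a downloaded copy of the upstream checkpoint against the SHA256 recorded above, something like the following sketch could be used. It relies only on the standard library; the helper names `sha256_of_file` and `matches_upstream` are illustrative and not part of acaua. Note that the digest refers to the upstream `.pth` file, not to the converted `model.safetensors` in this mirror.

```python
import hashlib

# Digest copied from the "Upstream SHA256" row of the provenance table.
EXPECTED_SHA256 = "d77059e3e9322c0e20dc89dc0cf2a583ffe2ced7d3e9b350233738add570bc30"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_upstream(path: str) -> bool:
    """True if the file at path has the digest recorded for the upstream weights."""
    return sha256_of_file(path) == EXPECTED_SHA256
```

For example, `matches_upstream("top_down_256x192_global_small.pth")` should return `True` for an intact download of the upstream file. Reading in chunks keeps memory use flat regardless of file size.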

## Usage via acaua

```python
import acaua
import cv2
import supervision as sv

# MIT-declared weights -> explicit opt-in (same posture as RTMPose and the
# UniFormer image / video classification models). The bundled RTMDet-tiny
# detector is loaded automatically from CondadosAI/rtmdet_t_coco.
model = acaua.Model.from_pretrained(
    "CondadosAI/uniformer_s_coco_pose", allow_non_apache=True
)

result = model.predict("image.jpg")
print(result.keypoints.shape)        # (N_persons, 17, 2)
print(result.keypoint_scores.shape)  # (N_persons, 17)

# COCO skeleton edges are surfaced on the adapter. `scene` is the input
# image as a NumPy array (loaded here with OpenCV).
scene = cv2.imread("image.jpg")
annotated = sv.EdgeAnnotator(edges=model.skeleton).annotate(
    scene, result.to_supervision()
)
```

## Files in this mirror

- `model.safetensors`: full pose model weights (backbone + head, 352
  tensors). Loaded with `load_state_dict(strict=True)` at adapter
  init time.
- `config.json`: `acaua_task=pose`, COCO-17 `keypoint_names` +
  `skeleton`, `detector_repo_id=CondadosAI/rtmdet_t_coco`. The adapter
  surfaces these as `model.keypoint_names` / `model.skeleton`.
- `NOTICE`: attribution chain (code AND weights).
- `LICENSE`: Apache-2.0.
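
As a rough illustration of what the `config.json` fields imply, the sketch below sanity-checks a parsed config. It rests on assumptions: the four key names come from the list above, but the exact JSON layout is not documented here, so the code assumes `keypoint_names` is a list of 17 strings and `skeleton` is a list of 0-based index pairs. The function `check_pose_config` is hypothetical and not part of acaua.

```python
from typing import Any, Dict

# Key names taken from the file description above; the layout is assumed.
EXPECTED_KEYS = ("acaua_task", "keypoint_names", "skeleton", "detector_repo_id")
NUM_COCO_KEYPOINTS = 17


def check_pose_config(config: Dict[str, Any]) -> None:
    """Raise KeyError or ValueError if the config looks inconsistent."""
    missing = [key for key in EXPECTED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {missing}")

    names = config["keypoint_names"]
    if len(names) != NUM_COCO_KEYPOINTS:
        raise ValueError(f"expected {NUM_COCO_KEYPOINTS} keypoints, got {len(names)}")

    # Assumption: each skeleton edge is a pair of 0-based keypoint indices.
    for start, end in config["skeleton"]:
        if not (0 <= start < len(names) and 0 <= end < len(names)):
            raise ValueError(f"skeleton edge ({start}, {end}) is out of range")
```

A typical call would be `check_pose_config(json.load(open("config.json")))` after `import json`. If the file stores 1-based edges, as the original COCO annotations do, the range check would need adjusting.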

## License and attribution

The adapter code is redistributed under Apache-2.0. The underlying
weights carry upstream's MIT declaration (compatible with Apache-2.0).
The acaua UniFormer-pose adapter is a derivative work of the upstream
PyTorch implementation; see [`NOTICE`](./NOTICE) for the attribution
chain.

## Citation

```bibtex
@inproceedings{li2022uniformer,
  title = {UniFormer: Unifying Convolution and Self-attention for Visual Recognition},
  author = {Li, Kunchang and Wang, Yali and Zhang, Junhao and Gao, Peng and Song, Guanglu and Liu, Yu and Li, Hongsheng and Qiao, Yu},
  booktitle = {ICLR},
  year = {2022},
}
```