---
language: en
license: mit
tags:
- vision
- image-segmentation
- semantic-segmentation
- human-parsing
- body-parts
- pytorch
- onnx
datasets:
- pascal-person-part
pipeline_tag: image-segmentation
---
# SCHP: Self-Correction Human Parsing (Pascal Person Part, 7 classes)
**SCHP** (Self-Correction for Human Parsing) is a state-of-the-art human parsing model based on a ResNet-101 backbone.
This checkpoint is trained on the **Pascal Person Part** dataset and packaged for the 🤗 Transformers `AutoModel` API.
> Original repository: [PeikeLi/Self-Correction-Human-Parsing](https://github.com/PeikeLi/Self-Correction-Human-Parsing)
| Source image | Segmentation result |
|:---:|:---:|
|  |  |
**Use cases:**
- **Body part segmentation**: segment coarse body regions (head, torso, arms, legs) for pose-aware applications
- **Avatar rigging**: generate body part masks as a preprocessing step for AR/VR avatars
- **Medical / ergonomics**: coarse body region detection for posture analysis or wearable device placement
- **Body proportion estimation**: measure relative areas of body segments in 2D images
## Dataset: Pascal Person Part
Pascal Person Part is a single-person human parsing dataset with 3,000+ images focused on **body part segmentation**.
- **mIoU on Pascal Person Part validation: 71.46%**
- 7 coarse labels covering body regions
## Labels
| ID | Label |
|----|-------|
| 0 | Background |
| 1 | Head |
| 2 | Torso |
| 3 | Upper Arms |
| 4 | Lower Arms |
| 5 | Upper Legs |
| 6 | Lower Legs |
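As a small illustration of how these IDs can be used, the sketch below turns a label map (an `(H, W)` array holding the IDs from the table) into a colour image and a per-part boolean mask. The palette is hypothetical and chosen only for this example; it does not come from the original repository.

```python
import numpy as np

# Label IDs and names copied from the table above.
ID2LABEL = {
    0: "Background", 1: "Head", 2: "Torso", 3: "Upper Arms",
    4: "Lower Arms", 5: "Upper Legs", 6: "Lower Legs",
}

# Hypothetical RGB palette (one row per label ID); the colours are arbitrary.
PALETTE = np.array([
    [0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0],
    [0, 0, 128], [128, 0, 128], [0, 128, 128],
], dtype=np.uint8)


def colorize(seg_map):
    """Map an (H, W) array of label IDs to an (H, W, 3) RGB image."""
    return PALETTE[seg_map]


def part_mask(seg_map, label_id):
    """Return a boolean (H, W) mask that is True where the label matches."""
    return seg_map == label_id


# Tiny synthetic label map standing in for a real prediction.
seg_map = np.array([[0, 1, 1],
                    [2, 2, 0]])
rgb = colorize(seg_map)
head = part_mask(seg_map, 1)
print(rgb.shape, int(head.sum()))
```

Indexing the palette with the label map relies on NumPy integer-array indexing, so no Python loop over pixels is needed.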
## Usage: PyTorch
```python
from transformers import AutoImageProcessor, AutoModelForSemanticSegmentation
from PIL import Image
import torch
model = AutoModelForSemanticSegmentation.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)
processor = AutoImageProcessor.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)
image = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.logits:         (1, 7, 512, 512) raw logits
# outputs.parsing_logits: (1, 7, 512, 512) refined parsing logits
# outputs.edge_logits:    (1, 1, 512, 512) edge prediction logits
seg_map = outputs.logits.argmax(dim=1).squeeze().numpy()  # (H, W), values in [0, 6]
```
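The shapes in the comments above indicate a 512 × 512 prediction grid, so the label map may not match the original photo size. The following is a minimal sketch of nearest-neighbour upscaling for a label map, assuming the processor simply resizes the whole image to 512 × 512; if it pads or crops instead, the inverse mapping would differ. The helper name `resize_labels_nearest` is hypothetical.

```python
import numpy as np


def resize_labels_nearest(seg_map, out_h, out_w):
    """Resize an (H, W) label map with nearest-neighbour sampling.

    Nearest-neighbour is used because interpolating label IDs would
    produce meaningless in-between values.
    """
    in_h, in_w = seg_map.shape
    # For each output row/column, pick the source row/column it falls in.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return seg_map[rows[:, None], cols[None, :]]


# Synthetic 2x2 label map scaled up to 4x4.
small = np.array([[0, 1],
                  [2, 3]])
big = resize_labels_nearest(small, 4, 4)
print(big)
```

For a real image, `out_h` and `out_w` would be `image.height` and `image.width` from the PIL image loaded above.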
Each pixel in `seg_map` is a label ID. To map IDs back to names:
```python
id2label = model.config.id2label
print(id2label[1])  # -> "Head"
```
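For the body proportion use case, one option is to count pixels per label and express each part as a share of all non-background pixels. This is a sketch under that assumption, not a method taken from the original repository; the function name `part_area_fractions` is hypothetical and the label dictionary is copied from the Labels table.

```python
import numpy as np

ID2LABEL = {
    0: "Background", 1: "Head", 2: "Torso", 3: "Upper Arms",
    4: "Lower Arms", 5: "Upper Legs", 6: "Lower Legs",
}


def part_area_fractions(seg_map, id2label, background_id=0):
    """Return {label name: share of person pixels} for each body part."""
    counts = np.bincount(seg_map.ravel(), minlength=len(id2label))
    person_pixels = counts.sum() - counts[background_id]
    if person_pixels == 0:
        return {}  # no person detected, nothing to measure
    return {
        id2label[i]: counts[i] / person_pixels
        for i in range(len(id2label))
        if i != background_id
    }


# Synthetic label map: 2 background, 2 head and 4 torso pixels.
seg_map = np.array([[0, 1, 1, 2],
                    [0, 2, 2, 2]])
fractions = part_area_fractions(seg_map, ID2LABEL)
print(fractions["Head"], fractions["Torso"])
```

These are 2D image-area ratios, so they depend on pose and camera angle and should not be read as physical body measurements.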
## Usage: ONNX Runtime
Optimized ONNX files are available in the `onnx/` folder of this repo:
| File | Size | Notes |
|------|------|-------|
| `onnx/schp-pascal-7.onnx` + `.onnx.data` | ~257 MB | FP32, dynamic batch |
| `onnx/schp-pascal-7-int8-static.onnx` | ~66 MB | INT8 static, 99.77% pixel agreement |
```python
import onnxruntime as ort
import numpy as np
from huggingface_hub import hf_hub_download
from transformers import AutoImageProcessor
from PIL import Image
model_path = hf_hub_download("pirocheto/schp-pascal-7", "onnx/schp-pascal-7-int8-static.onnx")
processor = AutoImageProcessor.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)
sess_opts = ort.SessionOptions()
sess_opts.intra_op_num_threads = 8
sess = ort.InferenceSession(model_path, sess_opts, providers=["CPUExecutionProvider"])
image = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="np")
logits = sess.run(["logits"], {"pixel_values": inputs["pixel_values"]})[0]
seg_map = logits.argmax(axis=1).squeeze() # (H, W)
```
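ONNX Runtime returns the logits as a plain NumPy array, so a per-pixel confidence can be derived without PyTorch. The sketch below applies a numerically stable softmax over the class axis; random values stand in for the real `logits`, and the resulting scores are uncalibrated, so treat them as a relative indicator only.

```python
import numpy as np


def softmax(logits, axis=1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# Random stand-in for the (1, 7, H, W) logits returned by sess.run(...).
rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 7, 4, 4)).astype(np.float32)

probs = softmax(logits, axis=1)           # (1, 7, H, W), sums to 1 over classes
confidence = probs.max(axis=1).squeeze(0)  # (H, W), probability of the winning class
seg_map = logits.argmax(axis=1).squeeze(0)  # (H, W), same labels as before
print(confidence.shape, seg_map.shape)
```

Pixels with low confidence could be masked out or flagged for review, for example with `confidence < 0.5`; that threshold is an arbitrary illustration.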
## Performance
Benchmarked on CPU (16-core, 8 ORT threads, `intra_op_num_threads=8`):
| Backend | Latency | Speedup | Size |
|---------|---------|---------|------|
| PyTorch FP32 | ~424 ms | 1× | 255 MB |
| ONNX FP32 | ~296 ms | 1.44× | 256 MB |
| ONNX INT8 static | ~218 ms | **1.94×** | **66 MB** |
INT8 static quantization achieves **99.77% pixel-level agreement** with the FP32 model.
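The card does not spell out how the agreement figure was measured. One plausible definition, sketched here as an assumption, is the fraction of pixels where the two models predict the same label. The function name `pixel_agreement` and the tiny arrays are illustrative only.

```python
import numpy as np


def pixel_agreement(seg_a, seg_b):
    """Fraction of pixels where two label maps have the same label ID."""
    if seg_a.shape != seg_b.shape:
        raise ValueError("label maps must have the same shape")
    return float((seg_a == seg_b).mean())


# Synthetic example: the two maps differ in one of four pixels.
fp32_map = np.array([[0, 1],
                     [2, 2]])
int8_map = np.array([[0, 1],
                     [2, 3]])
agreement = pixel_agreement(fp32_map, int8_map)
print(f"{agreement:.2%}")  # -> 75.00%
```

In practice the two maps would come from running the FP32 and INT8 models on the same images and averaging the result over a validation set.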
## Model Details
| Property | Value |
|----------|-------|
| Architecture | ResNet-101 + SCHP self-correction |
| Input size | 512 × 512 |
| Output | 3 heads: logits, parsing_logits, edge_logits |
| num_labels | 7 |
| Dataset | Pascal Person Part |
| Original mIoU | 71.46% |