SCHP: Self-Correction Human Parsing (Pascal Person Part, 7 classes)

SCHP (Self-Correction for Human Parsing) is a state-of-the-art human parsing model based on a ResNet-101 backbone. This checkpoint is trained on the Pascal Person Part dataset and packaged for the 🤗 Transformers AutoModel API.

Original repository: PeikeLi/Self-Correction-Human-Parsing

[Figure: source image (demo) and segmentation result (demo-pascal)]

Use cases:

  • Body part segmentation: segment coarse body regions (head, torso, arms, legs) for pose-aware applications
  • Avatar rigging: generate body part masks as a preprocessing step for AR/VR avatars
  • Medical / ergonomics: coarse body region detection for posture analysis or wearable device placement
  • Body proportion estimation: measure relative areas of body segments in 2D images
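
As a rough illustration of the body proportion use case, the sketch below computes each body part's share of the non-background pixels. It assumes a `seg_map` array of label IDs in [0, 6] like the one produced in the usage sections; the `area_fractions` helper and the toy array are hypothetical and not part of this repo.

```python
import numpy as np

# Label names copied from the Labels table of this model card.
LABELS = ["Background", "Head", "Torso", "Upper Arms",
          "Lower Arms", "Upper Legs", "Lower Legs"]

def area_fractions(seg_map: np.ndarray) -> dict:
    """Return each body part's share of the non-background pixels."""
    counts = np.bincount(seg_map.ravel(), minlength=len(LABELS))
    foreground = counts[1:].sum()
    if foreground == 0:
        # No person pixels at all: report zero for every part.
        return {name: 0.0 for name in LABELS[1:]}
    return {name: float(counts[i] / foreground)
            for i, name in enumerate(LABELS) if i > 0}

# Toy 2x4 map standing in for a real (H, W) prediction.
toy = np.array([[0, 1, 2, 2],
                [0, 5, 2, 2]])
fractions = area_fractions(toy)
```

These are 2D pixel ratios only; they depend on pose and camera angle and are not physical measurements.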

Dataset โ€” Pascal Person Part

Pascal Person Part is a single-person human parsing dataset of 3,000+ images, focused on body part segmentation.

  • mIoU on Pascal Person Part validation: 71.46%
  • 7 coarse labels: background plus six body regions
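
For reference, mIoU averages the per-class intersection-over-union. The following is a minimal sketch of the metric on label maps, not the evaluation script behind the number above. The `mean_iou` helper and the toy arrays are illustrative, and classes absent from both prediction and ground truth are skipped, which is one common convention.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int = 7) -> float:
    """Mean intersection-over-union over classes present in pred or target."""
    # Confusion matrix: rows are ground-truth labels, columns are predictions.
    idx = target.ravel() * num_classes + pred.ravel()
    conf = np.bincount(idx, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # ignore classes that never occur
    return float((intersection[present] / union[present]).mean())

target = np.array([[0, 0, 1, 1],
                   [2, 2, 1, 1]])
pred = np.array([[0, 0, 1, 2],
                 [2, 2, 1, 1]])
score = mean_iou(pred, target)
```

A dataset-level score is usually computed by summing the confusion matrix over all images before taking the ratio, rather than averaging per-image scores.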

Labels

| ID | Label |
|---|---|
| 0 | Background |
| 1 | Head |
| 2 | Torso |
| 3 | Upper Arms |
| 4 | Lower Arms |
| 5 | Upper Legs |
| 6 | Lower Legs |
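
If an application only needs the coarser grouping mentioned in the use cases (head, torso, arms, legs), the seven IDs above can be merged with a lookup table. This is a sketch: the merged IDs and the `MERGE_LUT` and `merge_parts` names are hypothetical choices, not part of the model config.

```python
import numpy as np

# Index = original label ID from the table above, value = merged ID.
# Merged IDs (assumed here): 0 background, 1 head, 2 torso, 3 arms, 4 legs.
MERGE_LUT = np.array([0, 1, 2, 3, 3, 4, 4], dtype=np.uint8)

def merge_parts(seg_map: np.ndarray) -> np.ndarray:
    """Map 7-class label IDs to 5 merged classes via array indexing."""
    return MERGE_LUT[seg_map]

toy = np.array([[0, 3, 4],
                [5, 6, 1]])
merged = merge_parts(toy)
```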

Usage: PyTorch

from transformers import AutoImageProcessor, AutoModelForSemanticSegmentation
from PIL import Image
import torch

model = AutoModelForSemanticSegmentation.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)
processor = AutoImageProcessor.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)

image = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.logits         - (1, 7, 512, 512) raw logits
# outputs.parsing_logits - (1, 7, 512, 512) refined parsing logits
# outputs.edge_logits    - (1, 1, 512, 512) edge prediction logits
seg_map = outputs.logits.argmax(dim=1).squeeze().numpy()  # (H, W), values in [0, 6]

Each pixel in seg_map is a label ID. To map IDs back to names:

id2label = model.config.id2label
print(id2label[1])  # → "Head"
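
To inspect a prediction visually, the label IDs can be mapped to colors. This is a minimal sketch, assuming `seg_map` is an (H, W) integer array as above. The palette values are arbitrary picks for illustration, not the colors used in the demo image, and `colorize` is a helper defined here.

```python
import numpy as np

# One RGB color per label ID (0..6); arbitrary illustrative values.
PALETTE = np.array([
    [0, 0, 0],       # 0 Background
    [255, 0, 0],     # 1 Head
    [0, 255, 0],     # 2 Torso
    [0, 0, 255],     # 3 Upper Arms
    [255, 255, 0],   # 4 Lower Arms
    [255, 0, 255],   # 5 Upper Legs
    [0, 255, 255],   # 6 Lower Legs
], dtype=np.uint8)

def colorize(seg_map: np.ndarray) -> np.ndarray:
    """Turn an (H, W) label map into an (H, W, 3) uint8 RGB array."""
    return PALETTE[seg_map]

toy = np.array([[0, 1],
                [6, 2]])
rgb = colorize(toy)
```

The resulting array can be passed to `Image.fromarray(rgb)` from PIL to save or display it.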

Usage: ONNX Runtime

Optimized ONNX files are available in the onnx/ folder of this repo:

| File | Size | Notes |
|---|---|---|
| onnx/schp-pascal-7.onnx + .onnx.data | ~257 MB | FP32, dynamic batch |
| onnx/schp-pascal-7-int8-static.onnx | ~66 MB | INT8 static, 99.77% pixel agreement |

import onnxruntime as ort
import numpy as np
from huggingface_hub import hf_hub_download
from transformers import AutoImageProcessor
from PIL import Image

model_path = hf_hub_download("pirocheto/schp-pascal-7", "onnx/schp-pascal-7-int8-static.onnx")
processor  = AutoImageProcessor.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)

sess_opts = ort.SessionOptions()
sess_opts.intra_op_num_threads = 8
sess = ort.InferenceSession(model_path, sess_opts, providers=["CPUExecutionProvider"])

image  = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="np")
logits = sess.run(["logits"], {"pixel_values": inputs["pixel_values"]})[0]
seg_map = logits.argmax(axis=1).squeeze()  # (H, W)
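
Because the ONNX output is a plain NumPy array, post-processing can stay in NumPy. As a sketch, per-pixel class probabilities and a confidence map can be derived with a numerically stable softmax. Random values stand in for the `logits` array above, and `softmax` is a helper defined here rather than something shipped with the model.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = 1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)  # avoid overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# Stand-in for the (1, 7, 512, 512) logits returned by sess.run(...).
rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 7, 8, 8)).astype(np.float32)

probs = softmax(logits, axis=1)            # (1, 7, H, W), sums to 1 per pixel
seg_map = probs.argmax(axis=1).squeeze()   # (H, W) label IDs
confidence = probs.max(axis=1).squeeze()   # (H, W) probability of chosen label
```

Low-confidence pixels tend to cluster around part boundaries, so thresholding `confidence` is one way to flag uncertain regions.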

Performance

Benchmarked on a 16-core CPU with 8 ONNX Runtime threads (intra_op_num_threads=8):

| Backend | Latency | Speedup | Size |
|---|---|---|---|
| PyTorch FP32 | ~424 ms | 1× | 255 MB |
| ONNX FP32 | ~296 ms | 1.44× | 256 MB |
| ONNX INT8 static | ~218 ms | 1.94× | 66 MB |
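
The benchmark script itself is not included here. One generic way to take comparable latency measurements is sketched below: a few warm-up calls followed by the median of several timed runs. `measure_latency_ms` and the dummy workload are hypothetical; in practice the callable would wrap `model(**inputs)` or `sess.run(...)`.

```python
import statistics
import time

def measure_latency_ms(fn, warmup: int = 3, runs: int = 20) -> float:
    """Median wall-clock latency of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm caches and lazy initialisation before timing
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)

def dummy_workload():
    # Placeholder for a real inference call.
    sum(i * i for i in range(10_000))

latency = measure_latency_ms(dummy_workload, warmup=1, runs=5)
```

The median is used because it is less sensitive to occasional slow runs than the mean.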

INT8 static quantization achieves 99.77% pixel-level agreement with the FP32 model.
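
Pixel-level agreement is read here as the share of pixels where two models predict the same label. That reading is an assumption, since the exact evaluation protocol is not described. A minimal sketch with toy arrays, where `pixel_agreement` is a helper defined for illustration:

```python
import numpy as np

def pixel_agreement(seg_a: np.ndarray, seg_b: np.ndarray) -> float:
    """Fraction of pixels with identical label IDs in both maps."""
    if seg_a.shape != seg_b.shape:
        raise ValueError("segmentation maps must have the same shape")
    return float((seg_a == seg_b).mean())

# Toy maps standing in for FP32 and INT8 predictions on the same image.
fp32_map = np.array([[0, 1, 1, 2],
                     [0, 3, 3, 2]])
int8_map = np.array([[0, 1, 1, 2],
                     [0, 3, 4, 2]])
agreement = pixel_agreement(fp32_map, int8_map)  # 7 of 8 pixels match
```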

Model Details

| Property | Value |
|---|---|
| Architecture | ResNet-101 + SCHP self-correction |
| Input size | 512 × 512 |
| Output | 3 heads: logits, parsing_logits, edge_logits |
| num_labels | 7 |
| Dataset | Pascal Person Part |
| Original mIoU | 71.46% |

Model size: 66.8M params (Safetensors, tensor type F32)