---
language: en
license: mit
tags:
- vision
- image-segmentation
- semantic-segmentation
- human-parsing
- body-parts
- pytorch
- onnx
datasets:
- pascal-person-part
pipeline_tag: image-segmentation
---

# SCHP — Self-Correction Human Parsing (Pascal Person Part, 7 classes)

**SCHP** (Self-Correction for Human Parsing) is a state-of-the-art human parsing model based on a ResNet-101 backbone. This checkpoint is trained on the **Pascal Person Part** dataset and packaged for the 🤗 Transformers `AutoModel` API.

> Original repository: [PeikeLi/Self-Correction-Human-Parsing](https://github.com/PeikeLi/Self-Correction-Human-Parsing)

| Source image | Segmentation result |
|:---:|:---:|
| ![demo](./assets/demo.jpg) | ![demo-pascal](./assets/demo_pascal.png) |

**Use cases:**

- 🏃 **Body part segmentation** — segment coarse body regions (head, torso, arms, legs) for pose-aware applications
- 🎮 **Avatar rigging** — generate body part masks as a preprocessing step for AR/VR avatars
- 🏥 **Medical / ergonomics** — coarse body region detection for posture analysis or wearable device placement
- 📐 **Body proportion estimation** — measure relative areas of body segments in 2D images

## Dataset — Pascal Person Part

Pascal Person Part is a single-person human parsing dataset with 3,000+ images focused on **body part segmentation**.

- **mIoU on Pascal Person Part validation: 71.46%**
- 7 coarse labels covering body regions

## Labels

| ID | Label |
|----|-------|
| 0 | Background |
| 1 | Head |
| 2 | Torso |
| 3 | Upper Arms |
| 4 | Lower Arms |
| 5 | Upper Legs |
| 6 | Lower Legs |

## Usage — PyTorch

```python
from transformers import AutoImageProcessor, AutoModelForSemanticSegmentation
from PIL import Image
import torch

model = AutoModelForSemanticSegmentation.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)
processor = AutoImageProcessor.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)

image = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.logits         — (1, 7, 512, 512) raw logits
# outputs.parsing_logits — (1, 7, 512, 512) refined parsing logits
# outputs.edge_logits    — (1, 1, 512, 512) edge prediction logits
seg_map = outputs.logits.argmax(dim=1).squeeze().numpy()  # (H, W), values in [0, 6]
```

Each pixel in `seg_map` is a label ID.
To map IDs back to names:

```python
id2label = model.config.id2label
print(id2label[1])  # → "Head"
```

## Usage — ONNX Runtime

Optimized ONNX files are available in the `onnx/` folder of this repo:

| File | Size | Notes |
|------|------|-------|
| `onnx/schp-pascal-7.onnx` + `.onnx.data` | ~257 MB | FP32, dynamic batch |
| `onnx/schp-pascal-7-int8-static.onnx` | ~66 MB | INT8 static, 99.77% pixel agreement |

```python
import onnxruntime as ort
import numpy as np
from huggingface_hub import hf_hub_download
from transformers import AutoImageProcessor
from PIL import Image

model_path = hf_hub_download("pirocheto/schp-pascal-7", "onnx/schp-pascal-7-int8-static.onnx")
processor = AutoImageProcessor.from_pretrained("pirocheto/schp-pascal-7", trust_remote_code=True)

sess_opts = ort.SessionOptions()
sess_opts.intra_op_num_threads = 8
sess = ort.InferenceSession(model_path, sess_opts, providers=["CPUExecutionProvider"])

image = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="np")
logits = sess.run(["logits"], {"pixel_values": inputs["pixel_values"]})[0]
seg_map = logits.argmax(axis=1).squeeze()  # (H, W)
```

## Performance

Benchmarked on CPU (16-core, 8 ORT threads, `intra_op_num_threads=8`):

| Backend | Latency | Speedup | Size |
|---------|---------|---------|------|
| PyTorch FP32 | ~424 ms | 1× | 255 MB |
| ONNX FP32 | ~296 ms | 1.44× | 256 MB |
| ONNX INT8 static | ~218 ms | **1.94×** | **66 MB** |

INT8 static quantization achieves **99.77% pixel-level agreement** with the FP32 model.

## Model Details

| Property | Value |
|----------|-------|
| Architecture | ResNet-101 + SCHP self-correction |
| Input size | 512 × 512 |
| Output | 3 heads: logits, parsing_logits, edge_logits |
| num_labels | 7 |
| Dataset | Pascal Person Part |
| Original mIoU | 71.46% |
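
## Post-processing sketch

The "body proportion estimation" use case listed above boils down to counting pixels per label in `seg_map`. The following is a minimal sketch of that idea, not part of this repository: `body_part_fractions` and `ID2LABEL` are hypothetical names introduced for illustration, and the label names are copied from the Labels table (in practice `model.config.id2label` could be passed instead). It assumes `seg_map` is a 2D integer array of label IDs, as produced by either usage example.

```python
import numpy as np

# Hypothetical helper mapping, copied from the Labels table of this card.
ID2LABEL = {
    0: "Background",
    1: "Head",
    2: "Torso",
    3: "Upper Arms",
    4: "Lower Arms",
    5: "Upper Legs",
    6: "Lower Legs",
}


def body_part_fractions(seg_map, id2label=ID2LABEL):
    """Return each body part's share of the non-background pixels."""
    labels = np.asarray(seg_map).astype(np.int64).ravel()
    num_labels = len(id2label)
    # counts[i] is the number of pixels predicted as label i.
    counts = np.bincount(labels, minlength=num_labels)[:num_labels]
    person_pixels = counts[1:].sum()  # everything except ID 0 (Background)
    if person_pixels == 0:
        # No person detected: report zero for every body part.
        return {id2label[i]: 0.0 for i in range(1, num_labels)}
    return {id2label[i]: float(counts[i] / person_pixels) for i in range(1, num_labels)}


# Toy 3x3 map standing in for a real (H, W) prediction.
toy_seg_map = np.array([[0, 1, 1],
                        [2, 2, 2],
                        [0, 5, 6]])
for name, share in body_part_fractions(toy_seg_map).items():
    print(f"{name}: {share:.2f}")
```

The fractions are measured on the model's output grid, so they describe 2D image area at the prediction resolution rather than physical body proportions.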