---
license: mit
library_name: py-feat
tags:
  - face-action-units
  - facs
  - facial-landmarks
  - regression
  - pls
  - mediapipe
  - py-feat
---

# BS → AU PLS (v2)

> Predicts 20 FACS Action Unit intensities from 52 MediaPipe blendshapes via Cheong-style PLS regression. Lets MPDetector output AU columns comparable to Detector's xgb output.

Linear PLS regression mapping 52 MediaPipe blendshapes to 20 FACS Action Unit
intensities. Used to give MPDetector an AU output stream comparable to
Detector's xgb AU model output.

## Training data
- 350,568 frames from ~10,000 CelebV-HQ celebrity videos
- Paired blendshapes (MPDetector mp_blendshapes head) + AU intensities
  (Detector with img2pose face + xgb AU on the same frames)
- Pose-filtered to |yaw| <= 40°, |pitch| <= 30° -> 347,897 retained
- 9,994 unique videos after filtering
- See /Storage/Projects/mp_blendshapes for the underlying training pipeline
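
The pose filter described above can be sketched in NumPy as follows. The array names (`yaw`, `pitch`) and the toy values are assumptions for illustration; the data layout of the actual training pipeline is not described in this card.

```python
import numpy as np

# Hypothetical per-frame head pose in degrees (toy values, not the real dataset).
yaw = np.array([-55.0, -40.0, 0.0, 12.5, 40.0, 41.0])
pitch = np.array([5.0, -30.0, 31.0, 10.0, 30.0, 0.0])

MAX_ABS_YAW = 40.0    # thresholds taken from the list above
MAX_ABS_PITCH = 30.0

# Keep frames whose pose lies inside both bounds (inclusive on each side).
keep = (np.abs(yaw) <= MAX_ABS_YAW) & (np.abs(pitch) <= MAX_ABS_PITCH)
kept_idx = np.flatnonzero(keep)
print(kept_idx, f"{keep.mean():.1%} retained")
```

On the counts reported above, the filter keeps 347,897 of 350,568 frames, about 99.2%.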

## Method
- PLSRegression(n_components=20, scale=True), Cheong / Py-Feat style
- 20 components = full rank (capped at min(n_features=52, n_targets=20))
- Linear features only: pairwise BS interactions were tested in nested CV
  (2026-05-05) and lowered out-of-sample R² (bs_only=0.236 vs. bs_pairs=0.214,
  with 4-6x higher standard deviation across folds)
- No pose covariates: kept pose-agnostic since MP blendshapes are
  designed to be pose-canonical
- No clipping at training (clip to [0,1] at inference if desired)

## Performance (3-fold GroupKFold by video_id)
- Overall R² = 0.236 +/- 0.008  (variance-weighted across 20 AUs)
- Overall MAE = 0.171
- Strong on AU06/12/43 (R² ~0.50)
- Moderate on AU01/02/09 (R² ~0.29)
- Weak on AU11/15/28 (R² < 0.10); these AUs are rare or visually subtle

## Citation context
- Cheong et al. 2023 (Py-Feat AU visualization model, tutorial 06 by E. Jolly):
  affine-aligned 68 dlib landmarks -> 20 AUs via PLS on EmotioNet/DISFA/BP4D
  (~13K class-balanced rows). Our model applies the same recipe to MP blendshapes
  on in-the-wild celebrity video with over 10x more rows (~348K frames).

## Inference
The saved coef + intercept absorb PLSRegression's scale=True standardization,
so inference is a single matmul:

    import numpy as np

    au = blendshapes @ coef + intercept   # (n, 52) @ (52, 20) + (20,) = (n, 20)
    au = np.clip(au, 0.0, 1.0)            # optional
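
To illustrate why the standardization can be absorbed, here is a sketch of the algebra for a generic linear model trained on standardized inputs and targets. All statistics and the weight matrix `W` are random placeholders with hypothetical names, not values from the released model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training statistics and standardized-space weights (random values).
x_mean, x_std = rng.uniform(0.0, 0.5, 52), rng.uniform(0.05, 0.3, 52)
y_mean, y_std = rng.uniform(0.0, 0.5, 20), rng.uniform(0.05, 0.3, 20)
W = rng.normal(size=(52, 20))   # maps standardized inputs to standardized targets


def predict_standardized(x):
    """Two-step prediction: standardize, apply W, then undo the target scaling."""
    return ((x - x_mean) / x_std) @ W * y_std + y_mean


# Fold both scalings into one affine map expressed in raw units.
coef = W * y_std[None, :] / x_std[:, None]   # (52, 20)
intercept = y_mean - x_mean @ coef           # (20,)

x = rng.uniform(0.0, 1.0, size=(4, 52))
folded = x @ coef + intercept
print(np.allclose(folded, predict_standardized(x)))
```

Dividing each row of `W` by the input scale and multiplying each column by the target scale moves the normalization into the weights, and the leftover mean terms collapse into the intercept.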

## File format
NPZ with:
  - coef           (52, 20) float32  — linear weights, rows match bs_columns
  - intercept      (20,)    float32  — bias, matches au_columns
  - bs_columns     (52,)    str      — input feature order
  - au_columns     (20,)    str      — output AU order
  - model_card     ()       str      — this markdown
  - training_metadata ()    str      — JSON dict with training context

Loader: np.load("bs_to_au_pls_v2.npz"). NumPy is the only dependency.
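
The sketch below writes a stand-in archive with the keys documented above and reads it back, so it runs without the real file. The file name `demo_bs_to_au.npz`, the placeholder column names (`bs_00`, `au_00`, ...), and the random weights are all hypothetical. It also assumes the string fields are stored as NumPy unicode arrays; if they were stored as object arrays, `np.load` would need `allow_pickle=True`.

```python
import json
import os
import tempfile

import numpy as np

# Build a stand-in archive with the documented keys (random weights, placeholder names).
bs_columns = np.array([f"bs_{i:02d}" for i in range(52)])
au_columns = np.array([f"au_{i:02d}" for i in range(20)])
rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), "demo_bs_to_au.npz")
np.savez(
    path,
    coef=rng.normal(size=(52, 20)).astype(np.float32),
    intercept=rng.normal(size=20).astype(np.float32),
    bs_columns=bs_columns,
    au_columns=au_columns,
    model_card=np.array("# demo card"),
    training_metadata=np.array(json.dumps({"n_frames": 347897})),
)

with np.load(path) as npz:
    coef, intercept = npz["coef"], npz["intercept"]
    in_names = [str(c) for c in npz["bs_columns"]]
    out_names = [str(c) for c in npz["au_columns"]]
    meta = json.loads(npz["training_metadata"].item())  # 0-d str array -> dict

# Blendshape scores often arrive as a name -> value mapping; order them as the model expects.
frame = {name: 0.1 for name in in_names}
x = np.array([[frame[name] for name in in_names]], dtype=np.float32)
au = np.clip(x @ coef + intercept, 0.0, 1.0)
print(dict(zip(out_names[:3], au[0, :3])), meta)
```

Reordering inputs by `bs_columns` rather than trusting positional order guards against a detector that emits blendshapes in a different sequence.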