---
license: mit
library_name: py-feat
tags:
- face-action-units
- facs
- facial-landmarks
- regression
- pls
- mediapipe
- py-feat
---
# BS → AU PLS
> Predicts 20 FACS Action Unit intensities from 52 MediaPipe blendshapes via Cheong-style PLS regression. Lets MPDetector output AU columns comparable to Detector's xgb output.

# BS -> AU PLS (v2)
Linear PLS regression mapping 52 MediaPipe blendshapes to 20 FACS Action Unit
intensities. Used to give MPDetector an AU output stream comparable to
Detector's xgb AU model output.
## Training data
- 350,568 frames from ~10,000 CelebV-HQ celebrity videos
- Paired blendshapes (MPDetector mp_blendshapes head) + AU intensities
  (Detector with img2pose face + xgb AU on the same frames)
- Pose-filtered to |yaw| <= 40°, |pitch| <= 30° -> 347,897 frames retained
- 9,994 unique videos after filtering
- See /Storage/Projects/mp_blendshapes for the underlying training pipeline
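The pose filter above amounts to a boolean mask over per-frame head-pose estimates. A minimal sketch, assuming yaw and pitch are available in degrees as NumPy arrays; the array names and sample values are hypothetical and not taken from the training pipeline:

```python
import numpy as np

# Hypothetical per-frame head-pose estimates in degrees (illustrative values only).
yaw = np.array([-55.0, -40.0, 0.0, 12.5, 41.0])
pitch = np.array([5.0, -30.0, 31.0, 10.0, 0.0])

# Thresholds stated in the model card.
MAX_ABS_YAW = 40.0
MAX_ABS_PITCH = 30.0

# Keep a frame only if both angles are within bounds (inclusive).
keep = (np.abs(yaw) <= MAX_ABS_YAW) & (np.abs(pitch) <= MAX_ABS_PITCH)
kept_idx = np.flatnonzero(keep)  # indices of retained frames
```

With the counts reported above, the filter retains 347,897 of 350,568 frames, about 99.2%.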
## Method
- PLSRegression(n_components=20, scale=True), Cheong / Py-Feat style
- 20 components = full rank (capped at min(n_features=52, n_targets=20))
- Linear features only: pairwise BS interactions were tested in nested CV
  (2026-05-05) and hurt out-of-sample R² (bs_only=0.236 vs bs_pairs=0.214,
  with 4-6x higher fold std)
- No pose covariates: kept pose-agnostic since MP blendshapes are
  designed to be pose-canonical
- No clipping at training (clip to [0,1] at inference if desired)
## Performance (3-fold GroupKFold by video_id)
- Overall R² = 0.236 +/- 0.008 (variance-weighted across 20 AUs)
- Overall MAE = 0.171
- Strong on AU06/12/43 (R² ~0.50)
- Moderate on AU01/02/09 (R² ~0.29)
- Weak on AU11/15/28 (R² < 0.10); these are rare or visually subtle AUs
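The headline metrics can be sketched as follows. This assumes "variance-weighted" means each AU's R² is weighted by that AU's variance, which reduces to one minus the ratio of pooled residual to pooled total sum of squares (the definition scikit-learn uses for `multioutput="variance_weighted"`); the card does not spell out the formula. The function names and toy arrays are hypothetical, and `group_folds` is a simple round-robin stand-in for GroupKFold, not its balancing algorithm.

```python
import numpy as np

def variance_weighted_r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """R² pooled across outputs, weighting each column by its variance."""
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)               # per-AU residual SS
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)  # per-AU total SS
    return float(1.0 - ss_res.sum() / ss_tot.sum())

def mean_absolute_error(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """MAE averaged over all frames and all AUs."""
    return float(np.abs(y_true - y_pred).mean())

def group_folds(groups: np.ndarray, n_folds: int = 3) -> np.ndarray:
    """Assign a fold id per row so that all rows of one group share a fold."""
    unique, inverse = np.unique(groups, return_inverse=True)
    return (np.arange(len(unique)) % n_folds)[inverse]

# Toy data: 3 frames, 2 outputs.
y_true = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
y_pred = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
r2 = variance_weighted_r2(y_true, y_pred)   # 1 - 2/10, about 0.8
mae = mean_absolute_error(y_true, y_pred)   # 2/6, about 0.333

# Frames from the same video never end up in different folds.
video_id = np.array(["a", "a", "b", "c", "c", "d"])
folds = group_folds(video_id)
```

Splitting by `video_id` matters because consecutive frames of one video are highly correlated; a frame-level split would leak near-duplicates into the test fold.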
## Citation context
- Cheong et al. 2023 (Py-Feat AU visualization model, tutorial 06 by E. Jolly):
  affine-aligned 68 dlib landmarks -> 20 AUs via PLS on EmotioNet/DISFA/BP4D
  (~13K class-balanced rows). Our model scales the recipe up to MP blendshapes
  on in-the-wild celebrity data more than an order of magnitude larger
  (~348K vs ~13K rows).
## Inference
The saved coef + intercept absorb PLSRegression's scale=True standardization,
so inference is a single matmul:

    au = blendshapes @ coef + intercept  # (n, 52) @ (52, 20) + (20,) = (n, 20)
    au = np.clip(au, 0.0, 1.0)           # optional
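To see why one matmul suffices, suppose the fitted model predicts by standardizing the inputs, applying a weight matrix, and adding an output offset. The standardization can then be folded into a single coefficient matrix and bias. The sketch below checks this numerically; the form of the standardized model and all variable names are assumptions for illustration, not a description of how the released file was exported.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_bs, n_au = 5, 52, 20

# Hypothetical fitted quantities of a standardized linear model.
x_mean = rng.uniform(0.0, 0.5, size=n_bs)
x_std = rng.uniform(0.1, 1.0, size=n_bs)
B = rng.normal(size=(n_bs, n_au))        # weights acting on standardized inputs
y_mean = rng.uniform(0.0, 1.0, size=n_au)

blendshapes = rng.uniform(0.0, 1.0, size=(n, n_bs))

# Two-step prediction: standardize, then apply weights and offset.
au_two_step = ((blendshapes - x_mean) / x_std) @ B + y_mean

# Folded form: divide each row of B by the matching input std,
# and move the mean shift into the bias.
coef = B / x_std[:, None]                # (52, 20)
intercept = y_mean - x_mean @ coef       # (20,)
au_folded = blendshapes @ coef + intercept
```

Both paths give the same predictions up to floating-point error, so only `coef` and `intercept` need to be shipped.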
## File format
NPZ with:
- coef (52, 20) float32: linear weights, rows match bs_columns
- intercept (20,) float32: bias, matches au_columns
- bs_columns (52,) str: input feature order
- au_columns (20,) str: output AU order
- model_card () str: this markdown
- training_metadata () str: JSON dict with training context

Loader: np.load("bs_to_au_pls_v2.npz"), no extra deps needed.
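A sketch of loading the file and running inference with explicit column alignment. To stay self-contained it writes a stand-in NPZ with random weights and placeholder column names to an in-memory buffer; with the real file you would pass the path to `np.load` instead. The helper name `predict_au` and the dict-of-arrays input are hypothetical.

```python
import io
import numpy as np

rng = np.random.default_rng(0)

# Stand-in file with the documented array keys (random weights, placeholder names).
buf = io.BytesIO()
np.savez(
    buf,
    coef=rng.normal(scale=0.1, size=(52, 20)).astype(np.float32),
    intercept=rng.normal(scale=0.1, size=20).astype(np.float32),
    bs_columns=np.array([f"bs_{i:02d}" for i in range(52)]),
    au_columns=np.array([f"AU_{i:02d}" for i in range(20)]),
)
buf.seek(0)

# Real usage: np.load("bs_to_au_pls_v2.npz")
with np.load(buf) as npz:
    model = {key: npz[key] for key in npz.files}

def predict_au(model, blendshapes_by_name, clip=True):
    """Map {blendshape name: (n,) array} to an (n, 20) AU array."""
    # Stack inputs in the order the weight rows expect.
    X = np.column_stack(
        [blendshapes_by_name[name] for name in model["bs_columns"].tolist()]
    )
    au = X @ model["coef"] + model["intercept"]
    return np.clip(au, 0.0, 1.0) if clip else au

# Four hypothetical frames, keyed by blendshape name.
frames = {name: rng.uniform(0.0, 1.0, size=4) for name in model["bs_columns"].tolist()}
au = predict_au(model, frames)

# Label the output columns using au_columns.
au_by_name = dict(zip(model["au_columns"].tolist(), au.T))
```

Indexing inputs by `bs_columns` guards against silently feeding blendshapes in a different order than the weights expect.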