# AU → 68-pt dlib landmarks (py-feat visualization model)
This repository hosts the AU → 68-point dlib facial-landmark visualization
model used by py-feat's `feat.plotting.predict()` / `load_viz_model()` to
render schematic facial expressions from action-unit (AU) intensity vectors.
Two versions are provided. The v2 PLS model is the default; v1 is preserved for reproducibility / rollback.
## v2 (default, recommended) — `au_to_landmarks_pls_v2.npz`
Cheong et al. / py-feat tutorial-06 style: linear PLS regression mapping 20 FACS AU intensities to 68 `mobilefacenet` dlib landmarks (136-d output, 68 × 2). Retrained on 27× more data than v1 with stricter preprocessing.
### Training data
- 350,568 frames from ~10,000 CelebV-HQ celebrity videos
- Landmark source: `Detector(landmark_model="mobilefacenet")` (the default Detector landmark model)
- AU source: `Detector(face_model="img2pose", au_model="xgb")` (the default AU model)
- Pose source: img2pose 6DoF (pitch, yaw, roll; radians)
- Pose-filtered to |yaw| ≤ 40°, |pitch| ≤ 30°
- 2D Procrustes + GPA-aligned 68-pt landmarks (8-anchor, population-mean reference); a minimal alignment sketch follows this list
- **No** per-subject neutral subtraction (the population mean shape *is* the AU=0 target)
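
The alignment step can be pictured with a minimal 2D Procrustes sketch. This is not the training code: the anchor indices (`ANCHOR_IDX`) are placeholders, the reflection edge case is ignored, and the iterative GPA loop (re-estimating the mean shape until convergence) is omitted.

```python
import numpy as np

def procrustes_align(landmarks, ref_anchors, anchor_idx):
    """Least-squares similarity alignment (scale + rotation + translation) of a
    (68, 2) landmark array onto reference anchor points, estimating the
    transform from the anchor subset only. Reflections are not handled here."""
    src = landmarks[anchor_idx]
    src_mean, ref_mean = src.mean(0), ref_anchors.mean(0)
    src_c, ref_c = src - src_mean, ref_anchors - ref_mean
    u, s, vt = np.linalg.svd(ref_c.T @ src_c)        # 2x2 cross-covariance
    rot = u @ vt                                     # optimal rotation
    scale = s.sum() / (src_c ** 2).sum()             # optimal isotropic scale
    return scale * (landmarks - src_mean) @ rot.T + ref_mean

ANCHOR_IDX = np.arange(8)  # placeholder indices, not the real 8 anchors
# e.g. (hedged): aligned = procrustes_align(raw_landmarks, m["reference_anchors"], ANCHOR_IDX)
```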
### Method
- `PLSRegression(n_components=20, scale=True)` — full input rank
- Training inputs: 20 AU + 3 pose + 60 pose×AU = 83 features
- Deployed inputs: 23 (AU + pose); the pose×AU terms vanish at pose = 0 by construction (see the feature sketch below)
- No pairwise AU interactions (tested; they degraded out-of-sample R²)
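
For concreteness, here is a hedged sketch of how the 83 training features and the 23 deployed features relate, assuming the pose×AU block is a plain outer product of the 3 pose angles with the 20 AUs; the exact feature ordering in the released model may differ.

```python
import numpy as np

au = np.random.rand(20)             # 20 AU intensities
pose = np.array([0.1, -0.2, 0.0])   # pitch, yaw, roll (radians)

# Training-time features: 20 AU + 3 pose + 60 pose×AU interactions = 83.
# The interaction ordering here is an assumption for illustration only.
interactions = np.outer(pose, au).ravel()            # (60,)
x_train = np.concatenate([au, pose, interactions])   # (83,)

# Deployment-time features: at pose = 0 every pose×AU term is 0, so only the
# first 23 columns (AU + pose) matter, which is what the NPZ model exposes.
x_deploy = np.concatenate([au, np.zeros(3)])          # (23,)
```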
### Out-of-sample performance (3-fold GroupKFold by `video_id`)
- R² = 0.794 ± 0.004 (variance-weighted across 136 dims)
- MAE = 3.59 px
For comparison, v1 (Cheong et al. 2023) was trained on ~13K class-balanced rows from EmotioNet/DISFA/BP4D and reports R² ≈ 0.4–0.5.
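
The evaluation protocol above can be reproduced in outline as follows. This is a sketch rather than the actual training script: `X`, `Y`, and `groups` are placeholders for the 83-feature inputs, the 136-d flattened landmarks, and the per-frame video IDs.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import GroupKFold
from sklearn.metrics import r2_score, mean_absolute_error

def evaluate(X, Y, groups, n_components=20):
    """3-fold GroupKFold by video, variance-weighted R² over the 136 outputs."""
    scores = []
    for train_idx, test_idx in GroupKFold(n_splits=3).split(X, Y, groups):
        model = PLSRegression(n_components=n_components, scale=True)
        model.fit(X[train_idx], Y[train_idx])
        pred = model.predict(X[test_idx])
        scores.append((
            r2_score(Y[test_idx], pred, multioutput="variance_weighted"),
            mean_absolute_error(Y[test_idx], pred),
        ))
    return np.mean(scores, axis=0)  # (mean R², mean MAE in px)
```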
### Inference
```python
import numpy as np

m = np.load("au_to_landmarks_pls_v2.npz")

# Build the 23-dim input: 20 AU intensities + 3 pose angles (pitch, yaw, roll)
au = np.zeros(20)
au[m["au_columns"].tolist().index("AU12")] = 1.0  # smile
pose = np.zeros(3)
x = np.concatenate([au, pose])                    # (23,)

flat = x @ m["coef"] + m["intercept"]             # (136,)

# IMPORTANT: layout is axis-major [all x | all y], NOT interleaved
landmarks = np.stack([flat[:68], flat[68:]], axis=1)  # (68, 2)
```
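
To eyeball the predicted shape, a plain matplotlib scatter is enough; inverting the y-axis assumes the landmarks are in image-style (y-down) coordinates.

```python
import matplotlib.pyplot as plt

plt.scatter(landmarks[:, 0], landmarks[:, 1], s=10)
plt.gca().invert_yaxis()          # assumes image-style (y-down) coordinates
plt.gca().set_aspect("equal")
plt.title("AU12 = 1.0 schematic face")
plt.show()
```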
The NPZ also includes:
- `mean_aligned_landmarks` (68, 2) — population-mean canonical landmarks
- `mean_low_au_landmarks` (68, 2) — mean of low-AU-sum frames (a cleaner neutral)
- `reference_anchors` (8, 2) — for input alignment at inference
- `model_card`, `training_metadata`
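
Continuing the inference snippet above, one way to use the bundled arrays is as a neutral baseline; the comparison below is illustrative only.

```python
# Sketch: treat the low-AU mean shape as a neutral face and look at the
# displacement induced by the AU12 prediction from the snippet above.
neutral = m["mean_low_au_landmarks"]          # (68, 2)
delta = landmarks - neutral                   # per-landmark displacement, px
print(np.linalg.norm(delta, axis=1).max())    # largest single-landmark shift
```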
## v1 (legacy) — `pyfeat_aus_to_landmarks.{joblib,h5}`
Original Cheong et al. 2023 / py-feat tutorial-06 model: trained on a class-balanced EmotioNet + DISFA + BP4D subsample (~13K rows), with affine registration to a single neutral template and a 20-component `PLSRegression`.
Preserved here for reproducibility and to allow rollback. Loadable via
`feat.plotting.load_viz_model("pyfeat_aus_to_landmarks")` if you want the
original predictions; a minimal loading sketch follows.
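
A hedged sketch of pulling v1 through py-feat, assuming the returned object exposes the underlying sklearn `predict()` on a 20-dim AU row (the v1 output layout is not documented here):

```python
import numpy as np
from feat.plotting import load_viz_model

viz = load_viz_model("pyfeat_aus_to_landmarks")
au = np.zeros((1, 20))
au[0, 11] = 1.0                    # index is illustrative, not a fixed AU12 slot
landmarks_flat = viz.predict(au)   # shape assumption: (1, 136)
```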
See py-feat tutorial 06 for the original training recipe.
## File summary
| File | Version | Format | Size | Status |
|---|---|---|---|---|
| `au_to_landmarks_pls_v2.npz` | v2 | NPZ | 20 KB | default; loaded automatically by py-feat |
| `pyfeat_aus_to_landmarks.joblib` | v1 | sklearn pickle | 24 MB | legacy fallback |
| `pyfeat_aus_to_landmarks.h5` | v1 | HDF5 metadata | 20 MB | legacy fallback (paired with the .joblib) |
## Figures
## Citation
If you use these models, please cite py-feat and Cheong et al. 2023 (for the original v1 method, which v2 extends):
Cheong, J. H., Jolly, E., et al. (2023). Py-Feat: Python Facial Expression Analysis Toolbox. Affective Science. https://doi.org/10.1007/s42761-023-00191-4



