AU → 68-pt dlib landmarks (py-feat visualization model)

This repository hosts the AU → 68-point dlib facial-landmark visualization model used by py-feat's feat.plotting.predict() / load_viz_model() to render schematic facial expressions from action-unit (AU) intensity vectors.

Two versions are provided. The v2 PLS model is the default; v1 is preserved for reproducibility / rollback.


v2 (default, recommended) — au_to_landmarks_pls_v2.npz

Cheong / Py-Feat tutorial-06 style: a linear PLS regression mapping 20 FACS AU intensities to the 68 dlib-style landmarks detected by mobilefacenet (136-d output, 68 × 2). Retrained on roughly 27× more data than v1, with stricter preprocessing.

Training data

  • 350,568 frames from ~10,000 CelebV-HQ celebrity videos
  • Landmark source: Detector(landmark_model="mobilefacenet") (the default Detector landmark model)
  • AU source: Detector(face_model="img2pose", au_model="xgb") (the default AU model)
  • Pose source: img2pose 6DoF head pose (pitch, yaw, roll, in radians)
  • Pose-filtered to |yaw| ≤ 40°, |pitch| ≤ 30°
  • 68-pt landmarks aligned by 2D Procrustes + generalized Procrustes analysis (GPA), using 8 anchor points and a population-mean reference (see the alignment sketch after this list)
  • No per-subject neutral subtraction: the population mean shape is the AU = 0 target
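
For concreteness, a minimal sketch of the 8-anchor similarity Procrustes step (rotation + isotropic scale + translation); the anchor_idx indices are a placeholder, since the exact anchor choice is not documented in this card:

import numpy as np

def procrustes_align(pts, ref_anchors, anchor_idx):
    """Similarity-align (68, 2) landmarks so their anchor subset matches ref_anchors."""
    src = pts[anchor_idx]                        # (8, 2) anchors of this frame
    src_c, ref_c = src.mean(0), ref_anchors.mean(0)
    A, B = src - src_c, ref_anchors - ref_c      # centered point sets
    U, S, Vt = np.linalg.svd(A.T @ B)            # orthogonal Procrustes via SVD
    d = np.sign(np.linalg.det(U @ Vt))           # guard against reflections
    R = U @ np.diag([1.0, d]) @ Vt               # optimal 2D rotation
    s = (S * [1.0, d]).sum() / (A ** 2).sum()    # optimal isotropic scale
    return s * (pts - src_c) @ R + ref_c         # apply to all 68 points

GPA then iterates this alignment against the running mean shape, recomputing the reference until it converges.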

Method

  • PLSRegression(n_components=20, scale=True) — as many components as AU inputs (full input rank)
  • Training inputs: 20 AU + 3 pose + 60 pose×AU interaction features = 83 features (see the sketch after this list)
  • Deployed inputs: 23 (AU + pose); the pose×AU terms vanish at pose = 0 by construction
  • No pairwise AU×AU interactions (tested; they degraded out-of-sample R²)
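
A minimal sketch of the design matrix and fit under these settings, assuming arrays aus (N, 20), pose (N, 3), and GPA-aligned landmarks Y (N, 136) (the variable names are illustrative):

import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Per-frame outer product gives the 3 × 20 = 60 pose×AU interaction columns
inter = (pose[:, :, None] * aus[:, None, :]).reshape(len(aus), -1)  # (N, 60)
X = np.hstack([aus, pose, inter])                                   # (N, 83)
pls = PLSRegression(n_components=20, scale=True).fit(X, Y)

Since the interaction features are identically zero at pose = 0, their constant contribution folds into the stored intercept, leaving the 23-input coef/intercept used at inference.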

Out-of-sample performance (3-fold GroupKFold by video_id)

  • R² = 0.794 ± 0.004 (variance-weighted across the 136 output dimensions; evaluation sketch below)
  • MAE = 3.59 px
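
A sketch of this evaluation, reusing X and Y from the training sketch above plus an illustrative per-frame video_id array; grouping by video keeps all frames of a clip in the same fold:

import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GroupKFold

r2s, maes = [], []
for tr, te in GroupKFold(n_splits=3).split(X, Y, groups=video_id):
    pls = PLSRegression(n_components=20, scale=True).fit(X[tr], Y[tr])
    pred = pls.predict(X[te])
    r2s.append(r2_score(Y[te], pred, multioutput="variance_weighted"))
    maes.append(mean_absolute_error(Y[te], pred))
print(f"R² = {np.mean(r2s):.3f} ± {np.std(r2s):.3f}, MAE = {np.mean(maes):.2f} px")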

For comparison, v1 (Cheong et al. 2023) was trained on ~13K class-balanced rows from EmotioNet/DISFA/BP4D and reports R² ≈ 0.4–0.5.

Inference

import numpy as np

m = np.load("au_to_landmarks_pls_v2.npz")
au = np.zeros(20)
au[m["au_columns"].tolist().index("AU12")] = 1.0      # smile (AU12)
pose = np.zeros(3)                                    # pitch, yaw, roll
x = np.concatenate([au, pose])                        # (23,)
flat = x @ m["coef"] + m["intercept"]                 # (136,)
# IMPORTANT: layout is axis-major [all x | all y], NOT interleaved
landmarks = np.stack([flat[:68], flat[68:]], axis=1)  # (68, 2)

The NPZ also includes:

  • mean_aligned_landmarks (68, 2) — population-mean canonical landmarks
  • mean_low_au_landmarks (68, 2) — mean of low-AU-sum frames (a cleaner neutral; see the plotting sketch after this list)
  • reference_anchors (8, 2) — anchor points for aligning input landmarks at inference
  • model_card, training_metadata
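
Continuing the inference snippet above, a quick sanity check (matplotlib assumed; not a py-feat API) overlays the predicted AU12 shape on the stored neutral:

import matplotlib.pyplot as plt

neutral = m["mean_low_au_landmarks"]   # (68, 2) low-AU neutral from the NPZ
plt.scatter(*neutral.T, c="gray", label="neutral")
plt.scatter(*landmarks.T, c="crimson", label="AU12 = 1")
plt.gca().invert_yaxis()               # landmark y grows downward (image coordinates)
plt.gca().set_aspect("equal")
plt.legend()
plt.show()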

v1 (legacy) — pyfeat_aus_to_landmarks.{joblib,h5}

Original Cheong et al. 2023 / Py-Feat tutorial-06 model. Trained on a class-balanced subsample of EmotioNet + DISFA + BP4D (~13K rows), with affine registration to a single neutral template and a 20-component PLSRegression.

Preserved here for reproducibility and rollback. Load it via feat.plotting.load_viz_model("pyfeat_aus_to_landmarks") if you want the original predictions, as sketched below.
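
A minimal usage sketch with the entry points named above; plot_face is py-feat's schematic face renderer, and the AU index driven here is arbitrary for illustration:

import numpy as np
from feat.plotting import load_viz_model, plot_face

v1 = load_viz_model("pyfeat_aus_to_landmarks")  # the legacy model
au = np.zeros(20)
au[11] = 2.0  # illustrative slot/intensity; index-to-AU mapping follows py-feat's AU order
plot_face(model=v1, au=au)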

See py-feat tutorial 06 for the original training recipe.


File summary

File                            Version  Format          Size   Status
au_to_landmarks_pls_v2.npz      v2       NPZ             20 KB  default; loaded automatically by py-feat
pyfeat_aus_to_landmarks.joblib  v1       sklearn pickle  24 MB  legacy fallback
pyfeat_aus_to_landmarks.h5      v1       HDF5 metadata   20 MB  legacy fallback (paired with .joblib)

Figures

  • au_landmarks_effect_maps.png
  • au_landmarks_neutral_vs_activated.png
  • au_landmarks_sweep.png
  • au_compare_68_vs_478.png

Citation

If you use these models, please cite py-feat and Cheong et al. 2023 (for the original v1 method, which v2 extends):

Cheong, J. H., Jolly, E., et al. (2023). Py-Feat: Python Facial Expression Analysis Toolbox. Affective Science. https://doi.org/10.1007/s42761-023-00191-4
