py-feat/face_multitask_v2


ljchang committed 19 days ago

Commit 52a0cf7 · verified · 1 parent: cdb00be

Update README.md

Files changed (1)

README.md +4 -8


@@ -21,10 +21,10 @@ A single multi-task convolutional model for facial behavior analysis, used by
 [py-feat](https://github.com/cosanlab/py-feat)'s `Detectorv2`. From one face crop
 it jointly predicts **action units, categorical emotion, valence/arousal,
 eye gaze, a 478-point face mesh, 6-DoF head pose, and 52 MediaPipe/ARKit
-blendshapes** (the v2.5 model; replaces v2.4).
+blendshapes**.
 - **Backbone:** ConvNeXt-V2 Tiny (FCMAE + IN-22k/IN-1k pretrained)
-- **Heads:** ME-GraphAU AU graph (AFG/FGG/SC) + unified-feature emotion/V-A and
+- **Heads:** AU graph (AFG/FGG/SC) + unified-feature emotion/V-A and
   gaze heads + landmark, pose, and **blendshape** regression heads
 - **Params:** ~30M · **Input:** 224×224 RGB (from a 256×256 face crop)
 - **File:** `face_multitask_v2.safetensors` (safetensors; `ModelV2Config` JSON in the file metadata)
@@ -54,12 +54,8 @@ blendshapes** (the v2.5 model; replaces v2.4).
 | Gaze | MPIIGaze (leave-subject-out) | mean angular err | 7.05° |
 | Gaze | Gaze360 (held-out split) | mean angular err | 12.89° |
-Notes: **v2.5 = v2.4 architecture + a blendshape head**, and it beats v2.4 on every
-accuracy benchmark — most dramatically AffectNet emotion (acc 0.35→0.62) and
-Aff-Wild2 V/A (0.82/0.78 → 0.85/0.80). **Gaze numbers are now leave-subject-out
-held-out** (honest generalization); the lower v2.4 figures (3.92°/6.81°) came from a
-leaky evaluation that included training subjects, so they are not comparable — the
-v2.5 numbers are the real ones. Numbers are from the deployed checkpoint
+Notes: **Gaze numbers are now leave-subject-out
+held-out** (honest generalization); Numbers are from the deployed checkpoint
  (`v25c_release_ep14`), weight-verified against the published `.safetensors`.
 ## Usage
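
The **File** bullet in this README says the `ModelV2Config` JSON is stored in the safetensors file metadata. A minimal sketch of reading that metadata with only the Python standard library follows. It assumes the standard safetensors layout: an 8-byte little-endian header length, then a JSON header whose optional `__metadata__` entry holds string-to-string pairs. The README does not name the metadata key that carries the config, so the hypothetical helper `read_safetensors_metadata` returns the whole dictionary rather than guessing a key.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the string-to-string metadata dict from a safetensors header.

    Hypothetical helper for illustration; it is not part of py-feat.
    """
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is optional in the format, so fall back to an empty dict.
    return header.get("__metadata__", {})


# Example (assumes the file has been downloaded to the working directory):
# meta = read_safetensors_metadata("face_multitask_v2.safetensors")
# print(sorted(meta))  # inspect which key holds the config JSON
```

Only the header is read, so the tensor data is never loaded. This makes it a cheap way to inspect the embedded config before constructing the model.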