Spaces: evanlyhf/RememberMe

Evan Li committed 22 days ago

Commit ee3a08a (1 parent: dfb09f4)

Relabel attributes, drop CLIP, and replace attributes with new models or MediaPipe where possible

Files changed (14)

Dockerfile +4 -5
README.md +30 -19
analyzers/__init__.py +8 -0
analyzers/attribute_analyzer.py +0 -194
analyzers/color_analyzer.py +128 -75
analyzers/demographic_analyzer.py +105 -30
analyzers/emotion_analyzer.py +61 -30
analyzers/hair_type_analyzer.py +87 -0
analyzers/landmark_analyzer.py +113 -28
analyzers/obstruction_analyzer.py +108 -0
analyzers/parsing_analyzer.py +78 -35
app.py +124 -53
architecture.md +99 -1707
requirements.txt +0 -3

Dockerfile CHANGED

@@ -13,15 +13,14 @@ WORKDIR /app
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
-# Pre-download MediaPipe model at build time so first request is fast
 RUN mkdir -p models && \
     wget -q -O models/face_landmarker.task \
     "https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/latest/face_landmarker.task"
-# Pre-download FaRL (face-tuned CLIP ViT-B/16) weights for attribute classifier
-RUN wget -q -O models/FaRL-Base-Patch16-LAIONFace20M-ep64.pth \
-    "https://github.com/FacePerceiver/FaRL/releases/download/pretrained_weights/FaRL-Base-Patch16-LAIONFace20M-ep64.pth"
 COPY . .
 EXPOSE 7860

 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
+# Pre-download MediaPipe model at build time so first request is fast.
+# All other models (FairFace, SegFormer, HSEmotion, ObstructionViT,
+# HairTypeViT) are pulled from Hugging Face on first request and cached
+# in /root/.cache/huggingface for the rest of the process lifetime.
 RUN mkdir -p models && \
     wget -q -O models/face_landmarker.task \
     "https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/latest/face_landmarker.task"
 COPY . .
 EXPOSE 7860
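
The new Dockerfile comment says every model except MediaPipe is fetched on first request and then kept for the process lifetime. A minimal sketch of that lazy, build-once pattern follows. The names `register` and `get_analyzer` and the factory registry are hypothetical and do not come from this commit; the real wiring lives in app.py, which is not shown here.

```python
from functools import lru_cache
from typing import Callable

# Hypothetical registry: analyzer name -> zero-argument factory.
# In a real service each factory would construct an analyzer object,
# which is the point where a model download would be triggered.
_FACTORIES: dict[str, Callable[[], object]] = {}


def register(name: str, factory: Callable[[], object]) -> None:
    """Record how to build an analyzer without building it yet."""
    _FACTORIES[name] = factory


@lru_cache(maxsize=None)
def get_analyzer(name: str) -> object:
    """Build the analyzer on first use, then return the cached instance."""
    return _FACTORIES[name]()
```

With this shape the first request pays the construction cost and later requests reuse the same object. Under concurrent first requests the factory may run more than once, so a lock around construction is worth considering.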

README.md CHANGED

@@ -10,25 +10,33 @@ pinned: false
 # HCP Face Analysis Microservice
-A FastAPI-based facial analysis service that combines 6 specialized ML models
-to extract 100+ facial attributes from a single photograph.
-## Models Used
-| Model | Task | Size |
-|-------|------|------|
-| MediaPipe Face Landmarker | 478 3D landmarks + blendshapes | ~4 MB |
-| FairFace ResNet-34 | Age, gender, ethnicity | ~90 MB |
-| CelebA ResNet-18 | 40 binary attributes | ~44 MB |
-| BiSeNet | Face region segmentation | ~50 MB |
-| HSEmotion EfficientNet-B0 | 8-class emotion | ~20 MB |
-| Custom color analysis | Skin/eye/hair color | 0 MB |
-## API Endpoints
-- `GET /health` — Health check
-- `POST /analyze` — Multipart file upload
-- `POST /analyze-base64` — JSON body with base64 image
 ## Usage
@@ -37,3 +45,6 @@ curl -X POST https://YOUR-SPACE.hf.space/analyze-base64 \
   -H "Content-Type: application/json" \
   -d '{"image": "<base64-encoded-image>"}'
 ```

 # HCP Face Analysis Microservice
+FastAPI service that runs seven specialized analyzers over a single photo
+and returns a merged dictionary of ~100 facial attributes.
+## Models
+| # | Component | Model | Task | Size |
+|---|-----------|-------|------|------|
+| 1 | MediaPipe Face Landmarker | `face_landmarker.task` (Google) | 478 3D landmarks + 52 ARKit blendshapes — geometric features, smiling, mouth-open | ~4 MB |
+| 2 | FairFace age | `dima806/fairface_age_image_detection` (ViT-B/16) | 9-bucket age → softmax-weighted continuous estimate | ~340 MB |
+| 2 | FairFace gender | `dima806/fairface_gender_image_detection` (ViT-B/16) | Binary gender (~93.4% acc) | ~340 MB |
+| 2 | Ethnicity | `cledoux42/Ethnicity_Test_v003` (ViT) | 5-class ethnicity (~79.6% acc) | ~340 MB |
+| 3 | Human parsing | `matei-dorian/segformer-b5-finetuned-human-parsing` | 18-class pixel segmentation → masks + hair length + hat | ~340 MB |
+| 4 | Emotion | HSEmotion `enet_b0_8_best_afew` (EfficientNet-B0) | 8-class emotion + valence/arousal | ~20 MB |
+| 5 | Color analysis | (no model — OpenCV LAB/HSV) | Skin tone, hair color, eye color, lip color | 0 MB |
+| 6 | Obstruction | `dima806/face_obstruction_image_detection` (ViT-B/16) | glasses / sunglasses / mask (~99% precision) | ~340 MB |
+| 7 | Hair type | `dima806/hair_type_image_detection` (ViT-B/16) | curly/dreadlocks/kinky/straight/wavy (~93% acc) | ~340 MB |
+All analyzers are lazy-loaded on first request. The MediaPipe weight
+file is pre-downloaded at Docker build time; all Hugging Face models
+are cached on first inference.
+## API endpoints
+- `GET /` — service info
+- `GET /health` — liveness check
+- `POST /analyze` — multipart file upload
+- `POST /analyze-base64` — JSON `{ "image": "<base64>" }`
 ## Usage
   -H "Content-Type: application/json" \
   -d '{"image": "<base64-encoded-image>"}'
 ```
+See [architecture.md](./architecture.md) for the pipeline diagram and the
+full per-attribute model attribution table.
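
The README documents `POST /analyze-base64` as taking a JSON body of the form `{ "image": "<base64>" }`. The sketch below builds that body from raw image bytes. The helper name `build_analyze_payload` is hypothetical, and it assumes the service expects plain base64 with no `data:` URI prefix, which the README example suggests but does not state.

```python
import base64
import json


def build_analyze_payload(image_bytes: bytes) -> str:
    """Return a JSON string shaped like the documented /analyze-base64 body."""
    # b64encode returns bytes; decode to ASCII so it can sit in a JSON string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})
```

The resulting string can be sent as the request body with `Content-Type: application/json`, matching the curl example in the Usage section.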

analyzers/__init__.py CHANGED

@@ -1 +1,9 @@
 # face-service analyzers package

 # face-service analyzers package
+#
+# Each analyzer in this package exposes a class with:
+#   __init__(self)                    — load model, register device
+#   analyze(self, img_rgb) -> dict    — run inference, return attribute dict
+#
+# Analyzers are independent: they don't import from each other. Cross-
+# analyzer plumbing (passing SegFormer masks into ColorAnalyzer, etc.)
+# is orchestrated entirely in app.py.
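
The package comment describes a shared contract: each analyzer exposes `analyze(self, img_rgb) -> dict`, and the orchestration happens in app.py. Here is a rough sketch of that contract with two toy analyzers and a merge loop. The classes `SizeAnalyzer` and `BrightnessAnalyzer` and the function `run_all` are hypothetical, and nested lists stand in for the NumPy image so the example has no dependencies.

```python
from typing import Any, Protocol


class Analyzer(Protocol):
    """Structural type for the contract described in the package comment."""

    def analyze(self, img_rgb: Any) -> dict[str, Any]: ...


class SizeAnalyzer:
    """Toy analyzer: reports image dimensions."""

    def analyze(self, img_rgb: Any) -> dict[str, Any]:
        return {"height": len(img_rgb), "width": len(img_rgb[0])}


class BrightnessAnalyzer:
    """Toy analyzer: reports the mean of all channel values."""

    def analyze(self, img_rgb: Any) -> dict[str, Any]:
        values = [v for row in img_rgb for pixel in row for v in pixel]
        return {"mean_brightness": sum(values) / len(values)}


def run_all(analyzers: list[Analyzer], img_rgb: Any) -> dict[str, Any]:
    """Run each analyzer independently and merge the attribute dicts."""
    merged: dict[str, Any] = {}
    for analyzer in analyzers:
        # Later analyzers overwrite earlier ones on key collisions.
        merged.update(analyzer.analyze(img_rgb))
    return merged
```

Because the analyzers never import each other, any dependency between them (for example, passing masks from one into another) would have to be expressed in the orchestrating function, as the comment says app.py does.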

analyzers/attribute_analyzer.py DELETED

@@ -1,194 +0,0 @@
-"""
-FaRL-based facial attribute classification.
-Same CLIP ViT-B/16 architecture as before, but loaded with FaRL weights
-(CVPR 2022) which were pretrained on LAION-Face — the 50M face-text-pair
-subset of LAION-400M — instead of OpenAI's generic web crawl. The encoder
-discriminates facial attributes much better while keeping the prompt-pair
-zero-shot interface intact.
-Falls back to vanilla OpenAI CLIP ViT-B/16 if the FaRL .pth is missing.
-"""
-import os
-from pathlib import Path
-from typing import Any
-import clip
-import torch
-from PIL import Image
-CLIP_ARCH = "ViT-B/16"
-FARL_WEIGHTS_PATH = os.environ.get(
-    "FARL_WEIGHTS_PATH", "models/FaRL-Base-Patch16-LAIONFace20M-ep64.pth"
-)
-PAIRS = {
-    "wearing_glasses": ("wearing eyeglasses", "not wearing eyeglasses"),
-    "wearing_hat": ("wearing a hat", "not wearing a hat"),
-    "has_beard": ("has a beard", "does not have a beard"),
-    "mustache": ("has a mustache", "does not have a mustache"),
-    "goatee": ("has a goatee", "does not have a goatee"),
-    "sideburns": ("has sideburns", "does not have sideburns"),
-    "has_bangs": ("has bangs", "does not have bangs"),
-    "is_bald": ("is bald", "has hair"),
-    "receding_hairline": ("has a receding hairline", "has a full hairline"),
-    "wearing_earrings": ("wearing earrings", "not wearing earrings"),
-    "wearing_necklace": ("wearing a necklace", "not wearing a necklace"),
-    "wearing_necktie": ("wearing a necktie", "not wearing a necktie"),
-    "heavy_makeup": ("wearing heavy makeup", "not wearing makeup"),
-    "wearing_lipstick": ("wearing lipstick", "not wearing lipstick"),
-    "big_nose": ("has a big nose", "has a small nose"),
-    "pointy_nose": ("has a pointy nose", "has a rounded nose"),
-    "big_lips": ("has big lips", "has thin lips"),
-    "high_cheekbones": ("has high cheekbones", "has low cheekbones"),
-    "oval_face_celeba": ("has an oval face", "has a non-oval face"),
-    "double_chin": ("has a double chin", "does not have a double chin"),
-    "chubby": ("has a chubby face", "has a slim face"),
-    "rosy_cheeks": ("has rosy cheeks", "does not have rosy cheeks"),
-    "bags_under_eyes": ("has bags under the eyes", "does not have bags under the eyes"),
-    "narrow_eyes": ("has narrow eyes", "has wide eyes"),
-    "arched_eyebrows": ("has arched eyebrows", "has straight eyebrows"),
-    "bushy_eyebrows": ("has bushy eyebrows", "has thin eyebrows"),
-    "pale_skin": ("has pale skin", "has medium skin"),
-    "attractive": ("an attractive face", "an ordinary face"),
-    "young": ("a young person", "an older person"),
-    "smiling_celeba": ("smiling", "not smiling"),
-    "mouth_open": ("mouth open", "mouth closed"),
-}
-HAIR_COLOR_LABELS = ["black hair", "blond hair", "brown hair", "gray hair"]
-HAIR_TEXTURE_LABELS = ["straight hair", "wavy hair", "curly hair"]
-ACCESSORY_THRESHOLD = 0.65
-ACCESSORY_KEYS = {
-    "wearing_earrings", "wearing_necklace", "wearing_necktie", "wearing_hat",
-    "heavy_makeup", "wearing_lipstick",
-}
-def _prompt(text: str) -> str:
-    return f"a photo of {text}"
-class AttributeAnalyzer:
-    def __init__(self):
-        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-        self.model = None
-        self.preprocess = None
-        try:
-            model, preprocess = clip.load(CLIP_ARCH, device="cpu")
-            weights_path = Path(FARL_WEIGHTS_PATH)
-            if weights_path.exists():
-                farl_state = torch.load(weights_path, map_location="cpu")
-                state = farl_state.get("state_dict", farl_state)
-                missing, unexpected = model.load_state_dict(state, strict=False)
-                print(
-                    f"[AttributeAnalyzer] Loaded FaRL weights from {weights_path} "
-                    f"(missing={len(missing)}, unexpected={len(unexpected)})"
-                )
-            else:
-                print(
-                    f"[AttributeAnalyzer] FaRL weights not found at {weights_path}; "
-                    "falling back to vanilla OpenAI CLIP ViT-B/16"
-                )
-            # Force float32 so per-pair softmax math is stable on both CPU and CUDA.
-            self.model = model.float().to(self.device).eval()
-            self.preprocess = preprocess
-        except Exception as exc:
-            print(f"[AttributeAnalyzer] Failed to load model: {exc}")
-    @torch.no_grad()
-    def analyze(self, img_rgb) -> dict[str, Any]:
-        if self.model is None or self.preprocess is None:
-            return self._empty_result()
-        pil = Image.fromarray(img_rgb)
-        image_tensor = self.preprocess(pil).unsqueeze(0).to(self.device)
-        image_features = self.model.encode_image(image_tensor)
-        image_features = image_features / image_features.norm(dim=-1, keepdim=True)
-        pair_scores: dict[str, float] = {}
-        for key, (positive, negative) in PAIRS.items():
-            pair_scores[key] = self._softmax_positive(
-                image_features, [_prompt(positive), _prompt(negative)]
-            )
-        color_scores = self._group_softmax(
-            image_features, [_prompt(x) for x in HAIR_COLOR_LABELS]
-        )
-        texture_scores = self._group_softmax(
-            image_features, [_prompt(x) for x in HAIR_TEXTURE_LABELS]
-        )
-        hair_color_name = HAIR_COLOR_LABELS[int(torch.argmax(torch.tensor(color_scores)))].split()[0]
-        hair_texture_name = HAIR_TEXTURE_LABELS[int(torch.argmax(torch.tensor(texture_scores)))].split()[0]
-        def flag(key: str) -> bool:
-            score = pair_scores.get(key, 0.0)
-            threshold = ACCESSORY_THRESHOLD if key in ACCESSORY_KEYS else 0.5
-            return score >= threshold
-        result: dict[str, Any] = {
-            "_celeba_raw": {k: round(v, 3) for k, v in pair_scores.items()},
-            "hair_color_celeba": hair_color_name,
-            "hair_color_scores": {
-                label.split()[0]: round(float(score), 3)
-                for label, score in zip(HAIR_COLOR_LABELS, color_scores)
-            },
-            "hair_texture_celeba": hair_texture_name,
-        }
-        for key in PAIRS:
-            result[key] = flag(key)
-        beard_score = pair_scores.get("has_beard", 0.0)
-        result["facial_hair"] = {
-            "5_o_clock_shadow": 0.45 < beard_score < 0.7,
-            "goatee": flag("goatee"),
-            "mustache": flag("mustache"),
-            "sideburns": flag("sideburns"),
-            "full_beard": beard_score > 0.7,
-        }
-        return result
-    @torch.no_grad()
-    def _softmax_positive(self, image_features: torch.Tensor, prompts: list[str]) -> float:
-        text_tokens = clip.tokenize(prompts).to(self.device)
-        text_features = self.model.encode_text(text_tokens)
-        text_features = text_features / text_features.norm(dim=-1, keepdim=True)
-        logits = (image_features @ text_features.T) * self.model.logit_scale.exp()
-        probs = torch.softmax(logits, dim=-1)[0]
-        return float(probs[0])
-    @torch.no_grad()
-    def _group_softmax(self, image_features: torch.Tensor, prompts: list[str]) -> list[float]:
-        text_tokens = clip.tokenize(prompts).to(self.device)
-        text_features = self.model.encode_text(text_tokens)
-        text_features = text_features / text_features.norm(dim=-1, keepdim=True)
-        logits = (image_features @ text_features.T) * self.model.logit_scale.exp()
-        probs = torch.softmax(logits, dim=-1)[0]
-        return [float(p) for p in probs]
-    @staticmethod
-    def _empty_result() -> dict[str, Any]:
-        base: dict[str, Any] = {
-            "_celeba_raw": {},
-            "hair_color_celeba": "unknown",
-            "hair_color_scores": {"black": 0.0, "blond": 0.0, "brown": 0.0, "gray": 0.0},
-            "hair_texture_celeba": "unknown",
-            "facial_hair": {
-                "5_o_clock_shadow": False,
-                "goatee": False,
-                "mustache": False,
-                "sideburns": False,
-                "full_beard": False,
-            },
-        }
-        for key in PAIRS:
-            base[key] = False
-        return base
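
The deleted analyzer scored each attribute by comparing one image embedding against a positive and a negative prompt, scaling the cosine similarities, and taking a two-way softmax. The arithmetic can be sketched without PyTorch as below. The function names are hypothetical, and `logit_scale=100.0` is only an illustrative default; the deleted code read the scale from the model's `logit_scale` parameter.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax (subtract the max before exponentiating)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def positive_probability(
    image_vec: list[float],
    positive_vec: list[float],
    negative_vec: list[float],
    logit_scale: float = 100.0,  # illustrative assumption, see lead-in
) -> float:
    """Probability mass the two-way softmax assigns to the positive prompt."""
    logits = [
        logit_scale * cosine(image_vec, positive_vec),
        logit_scale * cosine(image_vec, negative_vec),
    ]
    return softmax(logits)[0]
```

In the deleted file the resulting score was then thresholded: 0.65 for the keys in `ACCESSORY_KEYS` and 0.5 for everything else.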

analyzers/color_analyzer.py CHANGED

@@ -1,13 +1,35 @@
 """
-Color Analyzer — Pixel-level color extraction using masks from
-BiSeNet and landmarks from MediaPipe.
-Determines:
-- Skin tone (Fitzpatrick scale, LAB lightness, hex color)
-- Eye color (hue classification from iris region)
-- Hair color (LAB-trimmed median over hair mask)
-- Hair texture from local intensity variation (Laplacian std over eroded mask)
-- Lip color
 """
 from typing import Any
@@ -15,8 +37,9 @@ from typing import Any
 import cv2
 import numpy as np
-# Fitzpatrick scale boundaries based on LAB L* channel (true 0–100 range).
-# OpenCV's uint8 LAB stores L scaled to 0–255, so we rescale before lookup.
 FITZPATRICK_SCALE = [
     (85, 100, "Type I - Very Fair"),
     (70, 85, "Type II - Fair"),
@@ -26,24 +49,15 @@ FITZPATRICK_SCALE = [
     (0, 25, "Type VI - Dark Brown/Black"),
 ]
-EYE_COLOR_RANGES = {
-    "brown": {"h_range": (8, 28), "s_min": 50},
-    "hazel": {"h_range": (20, 35), "s_min": 40},
-    "green": {"h_range": (35, 80), "s_min": 30},
-    "blue": {"h_range": (90, 130), "s_min": 30},
-    "gray": {"h_range": (0, 180), "s_max": 30},
-    "amber": {"h_range": (15, 25), "s_min": 80},
-}
-# Hair-texture thresholds on std(Laplacian) computed over the *eroded* hair
-# mask (so the mask boundary itself doesn't contribute high-frequency energy).
-# These are reasonable starting points — tune on your own dataset.
 HAIR_TEXTURE_CURLY_THRESHOLD = 25.0
 HAIR_TEXTURE_WAVY_THRESHOLD = 15.0
-# MediaPipe FaceMesh lip contours. Outer ring traces the lip border;
-# inner ring traces the mouth opening — subtract one from the other
-# to get just the lip flesh and avoid sampling teeth or tongue.
 MEDIAPIPE_LIP_OUTER = [
     61, 146, 91, 181, 84, 17, 314, 405, 321, 375,
     291, 409, 270, 269, 267, 0, 37, 39, 40, 185,
@@ -56,7 +70,8 @@ MEDIAPIPE_LIP_INNER = [
 class ColorAnalyzer:
     def __init__(self):
-        pass  # No model to load — pure pixel analysis
     def analyze(
         self,
@@ -78,36 +93,38 @@ class ColorAnalyzer:
         if lip_mask is not None:
             lip_mask = lip_mask.astype(bool)
-        # SegFormer human-parsing has no dedicated lip class, so the
-        # parser hands us an empty mask. Fall back to MediaPipe lip
-        # landmarks whenever the parser-derived mask is missing or tiny.
         if (lip_mask is None or lip_mask.sum() < 50) and landmarks:
             derived = self._lip_mask_from_landmarks(landmarks, h, w)
             if derived is not None:
                 lip_mask = derived
         # ── Skin Tone ────────────────────────────────────────────────
         if skin_mask is not None and skin_mask.sum() > 100:
             skin_lab = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2LAB)
             skin_pixels = skin_lab[skin_mask]
-            # OpenCV uint8 LAB stores L in 0–255 and a/b offset by +128.
-            # Rescale to the conventional ranges (L* in 0–100, a*/b* in
-            # roughly -128..127) so the Fitzpatrick bins and undertone
-            # thresholds operate in standard units.
             mean_l_raw = float(np.mean(skin_pixels[:, 0]))
             mean_l = mean_l_raw * 100.0 / 255.0
             mean_a = float(np.mean(skin_pixels[:, 1])) - 128.0
             mean_b = float(np.mean(skin_pixels[:, 2])) - 128.0
-            # Fitzpatrick type
             fitz = "Unknown"
             for low, high, label in FITZPATRICK_SCALE:
                 if low <= mean_l < high:
                     fitz = label
                     break
-            # Get hex color of average skin tone
             avg_rgb = np.mean(img_rgb[skin_mask], axis=0).astype(int)
             hex_color = "#{:02x}{:02x}{:02x}".format(*avg_rgb)
@@ -120,9 +137,10 @@ class ColorAnalyzer:
                 "rgb": avg_rgb.tolist(),
             }
-            # Undertone (warm/cool/neutral). Now that b* is centered on 0,
-            # positive b* leans yellow (warm) and negative b* leans blue
-            # (cool). Thresholds adjusted from the old 0–255 scale.
             if mean_b > 12:
                 result["skin_undertone"] = "warm"
             elif mean_b < -8:
@@ -134,32 +152,34 @@ class ColorAnalyzer:
             result["skin_undertone"] = "unknown"
         # ── Eye Color ────────────────────────────────────────────────
         if landmarks and len(landmarks) > 473:
-            eye_color = self._detect_eye_color(img_rgb, landmarks, h, w)
-            result["eye_color"] = eye_color
         elif landmarks and len(landmarks) > 362:
-            # Fallback: sample from rough iris area
-            eye_color = self._detect_eye_color_fallback(img_rgb, landmarks, h, w)
-            result["eye_color"] = eye_color
         else:
             result["eye_color"] = "unknown"
-        # ── Hair Color ───────────────────────────────────────────────
         if hair_mask is not None and hair_mask.sum() > 200:
-            hair_color_info = self._estimate_hair_color(img_rgb, hair_mask)
-            result["hair_color"] = hair_color_info
             result["hair_texture"] = self._estimate_hair_texture(img_rgb, hair_mask)
         else:
             result["hair_color"] = {"name": "unknown"}
             result["hair_texture"] = "unknown"
         # ── Lip Color ────────────────────────────────────────────────
         if lip_mask is not None and lip_mask.sum() > 50:
             lip_pixels = img_rgb[lip_mask]
             avg_lip = np.mean(lip_pixels, axis=0).astype(int)
             hex_lip = "#{:02x}{:02x}{:02x}".format(*avg_lip)
             lip_hsv = cv2.cvtColor(
                 avg_lip.reshape(1, 1, 3).astype(np.uint8),
                 cv2.COLOR_RGB2HSV
@@ -194,38 +214,43 @@ class ColorAnalyzer:
     def _estimate_hair_color(
         img_rgb: np.ndarray, hair_mask: np.ndarray
     ) -> dict[str, Any]:
-        """Estimate dominant hair color via LAB-lightness-trimmed median.
         Why median + L*-trim instead of k=2 k-means:
-        - K-means with k=2 splits highlight vs shadow within a single hair
-          color, so the "bigger cluster" can flip between photos of the same
-          person depending on lighting. Median is robust and deterministic.
-        - Trimming the top/bottom 10% of L* drops specular highlights and
-          deep shadows, which are the main outlier sources.
         """
         hair_pixels = img_rgb[hair_mask]  # (N, 3) uint8 RGB
-        # Trim by LAB L* to drop highlights and shadows.
         hair_lab = cv2.cvtColor(
             hair_pixels.reshape(-1, 1, 3), cv2.COLOR_RGB2LAB
         ).reshape(-1, 3)
         l_lo, l_hi = np.percentile(hair_lab[:, 0], [10, 90])
         keep = (hair_lab[:, 0] >= l_lo) & (hair_lab[:, 0] <= l_hi)
         core_pixels = hair_pixels[keep] if keep.sum() > 50 else hair_pixels
         dominant_rgb = np.median(core_pixels, axis=0)
         dominant_rgb = np.clip(dominant_rgb, 0, 255).astype(np.uint8)
         hex_hair = "#{:02x}{:02x}{:02x}".format(*dominant_rgb)
         hair_hsv = cv2.cvtColor(
             dominant_rgb.reshape(1, 1, 3), cv2.COLOR_RGB2HSV
         )[0, 0]
         h_val, s_val, v_val = int(hair_hsv[0]), int(hair_hsv[1]), int(hair_hsv[2])
-        # Classification cascade — order matters. Falls through to "unknown"
-        # rather than a default of "brown" so mask leakage / unusual tints
-        # are detectable downstream.
         if v_val < 45 and s_val < 60:
             hair_color_name = "black"
         elif s_val < 25:
@@ -234,8 +259,8 @@ class ColorAnalyzer:
         elif (h_val < 12 or h_val > 168) and s_val > 60:
             hair_color_name = "red/auburn"
         elif 18 <= h_val <= 35 and v_val > 160 and s_val < 140:
-            # Blond: yellow hue, high V, and not too saturated (real blond
-            # is desaturated yellow, not orange).
             hair_color_name = "blond"
         elif 5 <= h_val <= 30:
             hair_color_name = "brown" if v_val > 80 else "dark brown"
@@ -253,21 +278,28 @@ class ColorAnalyzer:
     def _estimate_hair_texture(
         img_rgb: np.ndarray, hair_mask: np.ndarray
     ) -> str:
-        """Estimate hair texture from local intensity variation.
-        Computes std(Laplacian) over an *eroded* hair mask. Erosion stays
-        strictly inside the hair region so the mask boundary itself doesn't
-        contribute the high-frequency step edge that the previous FFT-on-
-        zeroed-region implementation was inadvertently measuring.
         """
         kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
         inner_mask = cv2.erode(
             hair_mask.astype(np.uint8), kernel, iterations=2
         ).astype(bool)
         if inner_mask.sum() < 200:
             return "unknown"
         hair_gray = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2GRAY)
         lap = cv2.Laplacian(hair_gray, cv2.CV_64F, ksize=3)
         texture_score = float(np.std(lap[inner_mask]))
@@ -286,11 +318,15 @@ class ColorAnalyzer:
     def _lip_mask_from_landmarks(
         landmarks: list[dict], h: int, w: int
     ) -> np.ndarray | None:
-        """Build a lip-flesh mask by filling outer lip contour minus inner."""
         max_idx = max(MEDIAPIPE_LIP_OUTER + MEDIAPIPE_LIP_INNER)
         if len(landmarks) <= max_idx:
             return None
         def _poly(indices: list[int]) -> np.ndarray:
             return np.array(
                 [
@@ -300,6 +336,8 @@ class ColorAnalyzer:
                 dtype=np.int32,
             )
         mask = np.zeros((h, w), dtype=np.uint8)
         cv2.fillPoly(mask, [_poly(MEDIAPIPE_LIP_OUTER)], 255)
         cv2.fillPoly(mask, [_poly(MEDIAPIPE_LIP_INNER)], 0)
@@ -312,15 +350,17 @@ class ColorAnalyzer:
     def _detect_eye_color(
         self, img_rgb: np.ndarray, lm: list[dict], h: int, w: int
     ) -> str:
-        """Use iris landmarks (468-477) to sample eye color."""
-        iris_indices = list(range(468, 474))  # Left iris
         iris_points = [(int(lm[i]["x"] * w), int(lm[i]["y"] * h)) for i in iris_indices]
-        # Create a small mask around iris center
         cx = int(np.mean([p[0] for p in iris_points]))
         cy = int(np.mean([p[1] for p in iris_points]))
         radius = max(3, int(np.std([p[0] for p in iris_points]) * 1.5))
         mask = np.zeros((h, w), dtype=np.uint8)
         cv2.circle(mask, (cx, cy), radius, 255, -1)
@@ -333,11 +373,17 @@ class ColorAnalyzer:
     def _detect_eye_color_fallback(
         self, img_rgb: np.ndarray, lm: list[dict], h: int, w: int
     ) -> str:
-        """Fallback: sample from center of eye region."""
-        # Center of left eye
         eye_pts = [159, 145, 133, 33]
         cx = int(np.mean([lm[i]["x"] for i in eye_pts]) * w)
         cy = int(np.mean([lm[i]["y"] for i in eye_pts]) * h)
         radius = max(3, int(abs(lm[159]["y"] - lm[145]["y"]) * h * 0.3))
         mask = np.zeros((h, w), dtype=np.uint8)
@@ -351,7 +397,13 @@ class ColorAnalyzer:
     @staticmethod
     def _classify_eye_color(pixels: np.ndarray) -> str:
-        """Classify eye color from pixel samples using HSV."""
         hsv = cv2.cvtColor(
             pixels.reshape(-1, 1, 3).astype(np.uint8),
             cv2.COLOR_RGB2HSV
@@ -361,11 +413,11 @@ class ColorAnalyzer:
         mean_s = float(np.mean(hsv[:, 1]))
         mean_v = float(np.mean(hsv[:, 2]))
-        # Gray eyes: low saturation
         if mean_s < 30:
             return "gray"
-        # Classify by hue
         if 90 <= mean_h <= 130 and mean_s > 30:
             return "blue"
         if 35 <= mean_h <= 80 and mean_s > 30:
@@ -376,7 +428,8 @@ class ColorAnalyzer:
             return "amber"
         if 8 <= mean_h <= 28 and mean_s > 50:
             return "brown"
         if mean_v < 60:
             return "dark brown"
-        return "brown"
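
The skin-tone path in this file rescales OpenCV's uint8 LAB means to standard units and then classifies undertone from b*. A small dependency-free sketch of just that arithmetic follows. The function names `rescale_lab_uint8` and `undertone` are hypothetical; the scale factors and the +12 / -8 thresholds are the ones shown in the diff.

```python
def rescale_lab_uint8(
    l_raw: float, a_raw: float, b_raw: float
) -> tuple[float, float, float]:
    """Map OpenCV uint8 LAB means to L* in 0..100 and a*/b* centered on 0."""
    l_star = l_raw * 100.0 / 255.0  # L is stored scaled to 0..255
    a_star = a_raw - 128.0          # a and b are stored offset by +128
    b_star = b_raw - 128.0
    return l_star, a_star, b_star


def undertone(b_star: float) -> str:
    """Classify undertone from b* using the thresholds shown in the diff."""
    if b_star > 12:
        return "warm"     # yellow-leaning
    if b_star < -8:
        return "cool"     # blue-leaning
    return "neutral"
```

Keeping the rescale separate from the thresholding makes it easier to check that the thresholds are applied in standard LAB units and not in OpenCV's storage units.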

 """
+ColorAnalyzer — pixel-level color extraction.
+Model
+-----
+None. All operations are deterministic OpenCV LAB/HSV statistics over
+masks/landmarks supplied by upstream analyzers.
+Inputs
+------
+img_rgb    : np.ndarray (H, W, 3) uint8
+landmarks  : list[dict] of normalised MediaPipe landmarks (optional)
+skin_mask  : bool ndarray (H, W) from SegFormer "face" class (optional)
+hair_mask  : bool ndarray (H, W) from SegFormer "hair" class (optional)
+lip_mask   : bool ndarray (H, W) — usually None; falls back to MediaPipe
+             lip polygon when missing or too small
+Outputs (dict)
+--------------
+skin_tone        — {fitzpatrick, lab_lightness, lab_a, lab_b, hex_color, rgb}
+skin_undertone   — warm | cool | neutral
+eye_color        — brown | hazel | amber | green | blue | gray | dark brown
+hair_color       — {name, hex, rgb, hsv}
+hair_texture     — straight | wavy | curly/coily   (coarse Laplacian signal;
+                   the HairTypeViT analyzer is the authoritative source)
+lip_color        — {shade, hex, rgb}
+Notes
+-----
+LAB is preferred over RGB for skin tone classification because LAB's
+L* channel is a perceptual lightness — Fitzpatrick bins line up with
+fixed L* ranges regardless of camera white balance.
 """
 from typing import Any
 import cv2
 import numpy as np
+# Fitzpatrick scale boundaries on the LAB L* channel (true 0–100 range).
+# OpenCV's uint8 LAB stores L scaled to 0–255, so we rescale before
+# looking up bins.
 FITZPATRICK_SCALE = [
     (85, 100, "Type I - Very Fair"),
     (70, 85, "Type II - Fair"),
     (0, 25, "Type VI - Dark Brown/Black"),
 ]
+# Hair-texture thresholds on std(Laplacian) computed over the *eroded*
+# hair mask. Erosion prevents the mask boundary from contributing
+# high-frequency step-edge energy.
 HAIR_TEXTURE_CURLY_THRESHOLD = 25.0
 HAIR_TEXTURE_WAVY_THRESHOLD = 15.0
+# MediaPipe FaceMesh lip contours. The outer ring traces the lip
+# border; the inner ring traces the mouth opening. Filling outer
+# and then erasing inner gives only lip flesh, never teeth/tongue.
 MEDIAPIPE_LIP_OUTER = [
     61, 146, 91, 181, 84, 17, 314, 405, 321, 375,
     291, 409, 270, 269, 267, 0, 37, 39, 40, 185,
 class ColorAnalyzer:
     def __init__(self):
+        # No model to load — pure pixel arithmetic.
+        pass
     def analyze(
         self,
         if lip_mask is not None:
             lip_mask = lip_mask.astype(bool)
+        # SegFormer human-parsing has no lip class, so callers usually
+        # pass None for lip_mask. Build one from MediaPipe lip landmarks
+        # whenever it's missing or too small to sample reliably.
         if (lip_mask is None or lip_mask.sum() < 50) and landmarks:
             derived = self._lip_mask_from_landmarks(landmarks, h, w)
             if derived is not None:
                 lip_mask = derived
         # ── Skin Tone ────────────────────────────────────────────────
+        # Need at least ~100 face pixels for stable statistics.
         if skin_mask is not None and skin_mask.sum() > 100:
+            # Convert the whole image to LAB once and pull pixels under
+            # the mask. cv2 returns uint8 LAB with L in 0–255 and a/b
+            # offset by +128 (so any neutral gray has a=128, b=128).
             skin_lab = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2LAB)
             skin_pixels = skin_lab[skin_mask]
+            # Rescale to standard LAB ranges before applying the
+            # Fitzpatrick / undertone thresholds defined on those ranges.
             mean_l_raw = float(np.mean(skin_pixels[:, 0]))
             mean_l = mean_l_raw * 100.0 / 255.0
             mean_a = float(np.mean(skin_pixels[:, 1])) - 128.0
             mean_b = float(np.mean(skin_pixels[:, 2])) - 128.0
+            # Bin into Fitzpatrick types — linear search over six bands.
             fitz = "Unknown"
             for low, high, label in FITZPATRICK_SCALE:
                 if low <= mean_l < high:
                     fitz = label
                     break
+            # Average RGB → hex for display.
             avg_rgb = np.mean(img_rgb[skin_mask], axis=0).astype(int)
             hex_color = "#{:02x}{:02x}{:02x}".format(*avg_rgb)
                 "rgb": avg_rgb.tolist(),
             }
+            # Undertone from b* (yellow ↔ blue axis):
+            # b* > +12  → yellow-leaning, warm
+            # b* < -8   → blue-leaning,   cool
+            # in between → neutral
             if mean_b > 12:
                 result["skin_undertone"] = "warm"
             elif mean_b < -8:
             result["skin_undertone"] = "unknown"
         # ── Eye Color ────────────────────────────────────────────────
+        # Prefer the dedicated iris landmarks (468-477) when available.
+        # Fall back to a rough eye-centre crop otherwise.
         if landmarks and len(landmarks) > 473:
+            result["eye_color"] = self._detect_eye_color(img_rgb, landmarks, h, w)
         elif landmarks and len(landmarks) > 362:
+            result["eye_color"] = self._detect_eye_color_fallback(img_rgb, landmarks, h, w)
         else:
             result["eye_color"] = "unknown"
+        # ── Hair Color & Texture ────────────────────────────────────
+        # Need at least 200 hair pixels for a stable median.
         if hair_mask is not None and hair_mask.sum() > 200:
+            result["hair_color"] = self._estimate_hair_color(img_rgb, hair_mask)
             result["hair_texture"] = self._estimate_hair_texture(img_rgb, hair_mask)
         else:
             result["hair_color"] = {"name": "unknown"}
             result["hair_texture"] = "unknown"
         # ── Lip Color ────────────────────────────────────────────────
+        # Average the masked lip pixels and bucket by HSV saturation/value.
         if lip_mask is not None and lip_mask.sum() > 50:
             lip_pixels = img_rgb[lip_mask]
             avg_lip = np.mean(lip_pixels, axis=0).astype(int)
             hex_lip = "#{:02x}{:02x}{:02x}".format(*avg_lip)
+            # Convert the single average RGB triple to HSV for shade
+            # classification. High saturation → rosy/red; high value but
+            # low saturation → pink; low value → dark; otherwise natural.
             lip_hsv = cv2.cvtColor(
                 avg_lip.reshape(1, 1, 3).astype(np.uint8),
                 cv2.COLOR_RGB2HSV
     def _estimate_hair_color(
         img_rgb: np.ndarray, hair_mask: np.ndarray
     ) -> dict[str, Any]:
+        """Dominant hair color via LAB-lightness-trimmed median.
         Why median + L*-trim instead of k=2 k-means:
+        - K-means with k=2 splits highlight vs shadow within a single
+          hair color, so the "bigger cluster" can flip between photos
+          of the same person depending on lighting. Median is robust
+          and deterministic.
+        - Trimming the top/bottom 10% of L* drops specular highlights
+          and deep shadows, the main outlier sources.
         """
         hair_pixels = img_rgb[hair_mask]  # (N, 3) uint8 RGB
+        # LAB conversion so we can trim by perceptual lightness.
         hair_lab = cv2.cvtColor(
             hair_pixels.reshape(-1, 1, 3), cv2.COLOR_RGB2LAB
         ).reshape(-1, 3)
         l_lo, l_hi = np.percentile(hair_lab[:, 0], [10, 90])
         keep = (hair_lab[:, 0] >= l_lo) & (hair_lab[:, 0] <= l_hi)
+        # If trimming would leave us too few pixels, fall back to all.
         core_pixels = hair_pixels[keep] if keep.sum() > 50 else hair_pixels
+        # Median is robust to mask leakage (a few stray non-hair pixels
+        # don't shift the median).
         dominant_rgb = np.median(core_pixels, axis=0)
         dominant_rgb = np.clip(dominant_rgb, 0, 255).astype(np.uint8)
         hex_hair = "#{:02x}{:02x}{:02x}".format(*dominant_rgb)
+        # Bucket the dominant color into a name via HSV thresholds.
         hair_hsv = cv2.cvtColor(
             dominant_rgb.reshape(1, 1, 3), cv2.COLOR_RGB2HSV
         )[0, 0]
         h_val, s_val, v_val = int(hair_hsv[0]), int(hair_hsv[1]), int(hair_hsv[2])
+        # Classification cascade — order matters. Falls through to
+        # "unknown" instead of defaulting to a colour, so mask leakage
+        # and unusual tints stay detectable downstream.
         if v_val < 45 and s_val < 60:
             hair_color_name = "black"
         elif s_val < 25:
         elif (h_val < 12 or h_val > 168) and s_val > 60:
             hair_color_name = "red/auburn"
         elif 18 <= h_val <= 35 and v_val > 160 and s_val < 140:
+            # Blond is desaturated yellow with high V — bright but not
+            # too saturated (or it'd shade orange).
             hair_color_name = "blond"
         elif 5 <= h_val <= 30:
             hair_color_name = "brown" if v_val > 80 else "dark brown"
     def _estimate_hair_texture(
         img_rgb: np.ndarray, hair_mask: np.ndarray
     ) -> str:
+        """Coarse hair texture from local intensity variation.
+        Computes std(Laplacian) over an *eroded* hair mask so the mask
+        boundary itself doesn't contribute the high-frequency step
+        edge that an un-eroded mask would.
+        This is intentionally a fallback signal; the authoritative
+        hair-texture output is HairTypeViT (curly/dreadlocks/kinky/
+        straight/wavy), a trained classifier with ~93% reported accuracy.
         """
+        # Erode by ~4 px (5x5 kernel, 2 iterations) so we sample only
+        # interior hair pixels, away from the mask boundary.
         kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
         inner_mask = cv2.erode(
             hair_mask.astype(np.uint8), kernel, iterations=2
         ).astype(bool)
+        # Not enough interior pixels to compute a reliable std.
         if inner_mask.sum() < 200:
             return "unknown"
+        # Laplacian responds to local intensity curvature; its std over
+        # the masked region is a proxy for "how much fine detail".
         hair_gray = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2GRAY)
         lap = cv2.Laplacian(hair_gray, cv2.CV_64F, ksize=3)
         texture_score = float(np.std(lap[inner_mask]))
     def _lip_mask_from_landmarks(
         landmarks: list[dict], h: int, w: int
     ) -> np.ndarray | None:
+        """Build a lip-flesh mask by filling outer minus inner contour."""
+        # Bail if the landmark list doesn't have indices the contours
+        # reference (e.g. iris-less subset).
         max_idx = max(MEDIAPIPE_LIP_OUTER + MEDIAPIPE_LIP_INNER)
         if len(landmarks) <= max_idx:
             return None
+        # Helper to convert a list of landmark indices into a pixel-
+        # space polygon in (x, y) order.
         def _poly(indices: list[int]) -> np.ndarray:
             return np.array(
                 [
                 dtype=np.int32,
             )
+        # Fill the outer ring, then erase the inner ring → lip flesh
+        # only, no teeth or tongue pixels.
         mask = np.zeros((h, w), dtype=np.uint8)
         cv2.fillPoly(mask, [_poly(MEDIAPIPE_LIP_OUTER)], 255)
         cv2.fillPoly(mask, [_poly(MEDIAPIPE_LIP_INNER)], 0)
     def _detect_eye_color(
         self, img_rgb: np.ndarray, lm: list[dict], h: int, w: int
     ) -> str:
+        """Sample iris pixels from one eye using MediaPipe iris landmarks."""
+        # 468 is the centre of one iris and 469-472 its ring; 473-477
+        # belong to the other eye, so 473 must stay out of the average.
+        # We average the five points to a centre and pick a radius from
+        # the std-dev of the x-coordinates.
+        iris_indices = list(range(468, 473))
         iris_points = [(int(lm[i]["x"] * w), int(lm[i]["y"] * h)) for i in iris_indices]
         cx = int(np.mean([p[0] for p in iris_points]))
         cy = int(np.mean([p[1] for p in iris_points]))
         radius = max(3, int(np.std([p[0] for p in iris_points]) * 1.5))
+        # Filled disc mask centred on the iris → classify those pixels.
         mask = np.zeros((h, w), dtype=np.uint8)
         cv2.circle(mask, (cx, cy), radius, 255, -1)
     def _detect_eye_color_fallback(
         self, img_rgb: np.ndarray, lm: list[dict], h: int, w: int
     ) -> str:
+        """Fallback when iris landmarks aren't available.
+        Averages four points that bound the eye opening and treats the
+        centre as a coarse "look here" target. Less accurate than the
+        iris-landmark path because we sample some sclera too, but it's
+        a graceful degradation.
+        """
         eye_pts = [159, 145, 133, 33]
         cx = int(np.mean([lm[i]["x"] for i in eye_pts]) * w)
         cy = int(np.mean([lm[i]["y"] for i in eye_pts]) * h)
+        # Radius scaled to ~30% of eye opening height.
         radius = max(3, int(abs(lm[159]["y"] - lm[145]["y"]) * h * 0.3))
         mask = np.zeros((h, w), dtype=np.uint8)
     @staticmethod
     def _classify_eye_color(pixels: np.ndarray) -> str:
+        """Bucket sampled iris pixels by HSV mean.
+        Hue ranges follow OpenCV's 8-bit scale (H = degrees / 2, i.e.
+        0-179 rather than 0-359). The cascade order matters: gray is
+        checked first because any sufficiently desaturated eye is gray
+        regardless of its nominal hue.
+        """
         hsv = cv2.cvtColor(
             pixels.reshape(-1, 1, 3).astype(np.uint8),
             cv2.COLOR_RGB2HSV
         mean_s = float(np.mean(hsv[:, 1]))
         mean_v = float(np.mean(hsv[:, 2]))
+        # Gray eyes: any hue, but low saturation.
         if mean_s < 30:
             return "gray"
+        # Hue-based buckets. Specific (amber) before general (brown).
         if 90 <= mean_h <= 130 and mean_s > 30:
             return "blue"
         if 35 <= mean_h <= 80 and mean_s > 30:
             return "amber"
         if 8 <= mean_h <= 28 and mean_s > 50:
             return "brown"
+        # Anything left with low V is just dark brown.
         if mean_v < 60:
             return "dark brown"
+        return "brown"
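
The trimmed-median idea behind `_estimate_hair_color` can be sketched without OpenCV. This is a minimal illustration rather than the analyzer's code: it assumes Rec. 601 luma as a cheap stand-in for LAB L*, uses a nearest-rank percentile instead of `np.percentile`, and the names `luma`, `trimmed_median_rgb` and the sample pixels are hypothetical.

```python
from statistics import median


def luma(rgb):
    """Rec. 601 luma, used here only as a rough stand-in for LAB L*."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def trimmed_median_rgb(pixels, lo_pct=10, hi_pct=90, min_keep=50):
    """Per-channel median of the pixels whose lightness lies in [lo, hi]."""
    ranked = sorted(luma(p) for p in pixels)
    last = len(ranked) - 1
    # Nearest-rank percentiles (np.percentile would interpolate instead).
    lo = ranked[int(last * lo_pct / 100)]
    hi = ranked[int(last * hi_pct / 100)]
    core = [p for p in pixels if lo <= luma(p) <= hi]
    # Too few survivors: fall back to every pixel, as the source does.
    if len(core) <= min_keep:
        core = list(pixels)
    return tuple(int(median(channel)) for channel in zip(*core))


# 190 brown hair pixels plus 10 blown-out specular highlights.
pixels = [(90, 60, 40)] * 190 + [(250, 250, 250)] * 10
dominant = trimmed_median_rgb(pixels)
hex_hair = "#{:02x}{:02x}{:02x}".format(*dominant)
```

With this toy input the highlights fall above the 90th-percentile cut, so the result is the brown triple itself. A plain mean over the same pixels would be pulled toward white.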

analyzers/demographic_analyzer.py CHANGED

@@ -1,13 +1,36 @@
 """
-Public pretrained demographic classifiers.
-Models used (all public, with published accuracy):
-- Age:       dima806/fairface_age_image_detection   (~59% top-1 on FairFace age buckets)
-- Gender:    dima806/fairface_gender_image_detection (~93.4% on FairFace)
-- Ethnicity: cledoux42/Ethnicity_Test_v003          (ViT, 79.6% accuracy, macro-F1 0.797)
-The ethnicity model replaces the former NikhilJaddu/fairface-race-vit checkpoint,
-which had no published performance metrics on the HF model card.
 """
 from typing import Any
@@ -22,20 +45,28 @@ RACE_MODEL_ID = "cledoux42/Ethnicity_Test_v003"
 AGE_LABELS = ["0-2", "3-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"]
 GENDER_LABELS = ["Male", "Female"]
-# cledoux42/Ethnicity_Test_v003 outputs 5 classes: african, asian, caucasian, hispanic, indian.
-# We keep the legacy 7-bucket schema internally so the rest of the app still works;
-# unseen buckets simply stay at 0.0 in the distribution.
 RACE_LABELS = ["White", "Black", "Latino_Hispanic", "East Asian", "Southeast Asian", "Indian", "Middle Eastern"]
 class DemographicAnalyzer:
     def __init__(self):
         self.age_classifier = self._load_classifier(AGE_MODEL_ID)
         self.gender_classifier = self._load_classifier(GENDER_MODEL_ID)
         self.race_classifier = self._load_classifier(RACE_MODEL_ID)
     @staticmethod
     def _load_classifier(model_id: str):
         try:
             return pipeline("image-classification", model=model_id)
         except Exception as exc:
@@ -43,12 +74,18 @@ class DemographicAnalyzer:
             return None
     def analyze(self, img_rgb) -> dict[str, Any]:
         pil = Image.fromarray(img_rgb)
-        age_predictions = self._safe_predict(self.age_classifier, pil, top_k=3)
         gender_predictions = self._safe_predict(self.gender_classifier, pil, top_k=2)
         race_predictions = self._safe_predict(self.race_classifier, pil, top_k=7)
         if not age_predictions and not gender_predictions and not race_predictions:
             return {
                 "age_range": "unknown",
@@ -62,17 +99,22 @@ class DemographicAnalyzer:
                 "ethnicity_distribution": {label: 0.0 for label in RACE_LABELS},
             }
         age_prediction = age_predictions[0] if age_predictions else {"label": "unknown", "score": 0.0}
         gender_prediction = gender_predictions[0] if gender_predictions else {"label": "unknown", "score": 0.0}
         race_prediction = race_predictions[0] if race_predictions else {"label": "unknown", "score": 0.0}
         age_label = self._normalize_age_label(age_prediction["label"])
         gender_label = self._normalize_gender_label(gender_prediction["label"])
         race_label = self._normalize_race_label(race_prediction["label"])
         return {
             "age_range": age_label,
-            "age_estimate": self._age_estimate_from_label(age_label),
             "age_confidence": round(float(age_prediction["score"]), 3),
             "gender": gender_label.lower(),
             "gender_confidence": round(float(gender_prediction["score"]), 3),
@@ -84,6 +126,7 @@ class DemographicAnalyzer:
     @staticmethod
     def _normalize_age_label(label: str) -> str:
         normalized = label.strip().lower()
         if normalized == "more than 70":
             return "70+"
@@ -98,9 +141,10 @@ class DemographicAnalyzer:
     @staticmethod
     def _normalize_race_label(label: str) -> str:
         normalized = label.strip().lower().replace("-", "_")
         race_aliases = {
-            # Original FairFace 7-class labels
             "white": "White",
             "black": "Black",
             "latino_hispanic": "Latino_Hispanic",
@@ -109,7 +153,7 @@ class DemographicAnalyzer:
             "southeast asian": "Southeast Asian",
             "indian": "Indian",
             "middle eastern": "Middle Eastern",
-            # cledoux42/Ethnicity_Test_v003 5-class labels → map into our schema
             "african": "Black",
             "asian": "East Asian",
             "caucasian": "White",
@@ -117,23 +161,52 @@ class DemographicAnalyzer:
         }
         return race_aliases.get(normalized, label)
-    @staticmethod
-    def _age_estimate_from_label(label: str) -> float:
-        mapping = {
-            "0-2": 1.0,
-            "3-9": 6.0,
-            "10-19": 14.5,
-            "20-29": 24.5,
-            "30-39": 34.5,
-            "40-49": 44.5,
-            "50-59": 54.5,
-            "60-69": 64.5,
-            "70+": 75.0,
-        }
-        return mapping.get(label, 0.0)
     @classmethod
     def _distribution_map(cls, predictions, normalizer, all_labels):
         distribution = {label: 0.0 for label in all_labels}
         for prediction in predictions:
             normalized_label = normalizer(prediction["label"])
@@ -143,6 +216,8 @@ class DemographicAnalyzer:
     @staticmethod
     def _safe_predict(classifier, image, top_k: int):
         if classifier is None:
             return []
         try:

 """
+DemographicAnalyzer — age, gender, ethnicity via three ViT classifiers.
+Models
+------
+- Age       : dima806/fairface_age_image_detection
+              ViT-B/16, ~59% top-1 on FairFace 9 age buckets.
+- Gender    : dima806/fairface_gender_image_detection
+              ViT-B/16, ~93.4% on FairFace.
+- Ethnicity : cledoux42/Ethnicity_Test_v003
+              ViT, 79.6% accuracy, macro-F1 0.797. 5-class output that
+              we widen into the legacy 7-bucket FairFace schema so the
+              rest of the app's distribution shape doesn't change.
+All three are Apache 2.0 licensed and are loaded as Hugging Face
+image-classification pipelines.
+Inputs
+------
+img_rgb : np.ndarray (H, W, 3) uint8
+Outputs (dict)
+--------------
+age_range, age_estimate (softmax-weighted continuous), age_confidence,
+age_distribution, gender, gender_confidence, ethnicity,
+ethnicity_confidence, ethnicity_distribution.
+Notes
+-----
+The FairFace age model is a 9-bucket classifier (0-2, 3-9, …, 70+),
+which means the argmax bucket midpoint is always one of nine fixed
+numbers (24.5 for 20-29, etc.). To recover a smooth continuous estimate
+we compute the expected value across the full softmax — see
+``_weighted_age_estimate``.
 """
 from typing import Any
 AGE_LABELS = ["0-2", "3-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"]
 GENDER_LABELS = ["Male", "Female"]
+# cledoux42 ships 5 classes (african, asian, caucasian, hispanic, indian),
+# but we keep the legacy 7-bucket FairFace label space internally so the
+# downstream distribution dict shape stays stable. Unseen buckets stay 0.
 RACE_LABELS = ["White", "Black", "Latino_Hispanic", "East Asian", "Southeast Asian", "Indian", "Middle Eastern"]
 class DemographicAnalyzer:
     def __init__(self):
+        # Each classifier is a HF image-classification pipeline. Weights
+        # are downloaded from HF the first time a pipeline is built and
+        # cached under /root/.cache/huggingface inside the container.
         self.age_classifier = self._load_classifier(AGE_MODEL_ID)
         self.gender_classifier = self._load_classifier(GENDER_MODEL_ID)
         self.race_classifier = self._load_classifier(RACE_MODEL_ID)
     @staticmethod
     def _load_classifier(model_id: str):
+        """Build one HF image-classification pipeline, logging on failure.
+        A failed load returns None so the rest of the service continues
+        to function and `analyze()` falls back to "unknown" demographics.
+        """
         try:
             return pipeline("image-classification", model=model_id)
         except Exception as exc:
             return None
     def analyze(self, img_rgb) -> dict[str, Any]:
+        # Convert the numpy frame to a PIL Image once and reuse it for
+        # all three classifier calls.
         pil = Image.fromarray(img_rgb)
+        # top_k=len(labels) so we get the full softmax for each model.
+        # We need the full age distribution to compute the weighted
+        # expected-value age estimate.
+        age_predictions = self._safe_predict(self.age_classifier, pil, top_k=len(AGE_LABELS))
         gender_predictions = self._safe_predict(self.gender_classifier, pil, top_k=2)
         race_predictions = self._safe_predict(self.race_classifier, pil, top_k=7)
+        # If every classifier failed we degrade gracefully with a stub.
         if not age_predictions and not gender_predictions and not race_predictions:
             return {
                 "age_range": "unknown",
                 "ethnicity_distribution": {label: 0.0 for label in RACE_LABELS},
             }
+        # HF pipelines return predictions pre-sorted by score descending,
+        # so element [0] of each list is always the argmax class.
         age_prediction = age_predictions[0] if age_predictions else {"label": "unknown", "score": 0.0}
         gender_prediction = gender_predictions[0] if gender_predictions else {"label": "unknown", "score": 0.0}
         race_prediction = race_predictions[0] if race_predictions else {"label": "unknown", "score": 0.0}
+        # Models occasionally return label aliases ("more than 70" instead
+        # of "70+", "African" instead of "Black"). The normalisers map
+        # everything back to our canonical schema.
         age_label = self._normalize_age_label(age_prediction["label"])
         gender_label = self._normalize_gender_label(gender_prediction["label"])
         race_label = self._normalize_race_label(race_prediction["label"])
         return {
             "age_range": age_label,
+            "age_estimate": self._weighted_age_estimate(age_predictions),
             "age_confidence": round(float(age_prediction["score"]), 3),
             "gender": gender_label.lower(),
             "gender_confidence": round(float(gender_prediction["score"]), 3),
     @staticmethod
     def _normalize_age_label(label: str) -> str:
+        """Map model output to canonical AGE_LABELS entry."""
         normalized = label.strip().lower()
         if normalized == "more than 70":
             return "70+"
     @staticmethod
     def _normalize_race_label(label: str) -> str:
+        """Coalesce cledoux42's 5 classes into our 7-bucket schema."""
         normalized = label.strip().lower().replace("-", "_")
         race_aliases = {
+            # Legacy FairFace 7-class labels
             "white": "White",
             "black": "Black",
             "latino_hispanic": "Latino_Hispanic",
             "southeast asian": "Southeast Asian",
             "indian": "Indian",
             "middle eastern": "Middle Eastern",
+            # cledoux42/Ethnicity_Test_v003 5-class labels
             "african": "Black",
             "asian": "East Asian",
             "caucasian": "White",
         }
         return race_aliases.get(normalized, label)
+    # Midpoint of each FairFace age bucket — used as the per-bucket
+    # "value" when we marginalise over the predicted distribution.
+    _AGE_MIDPOINTS = {
+        "0-2": 1.0,
+        "3-9": 6.0,
+        "10-19": 14.5,
+        "20-29": 24.5,
+        "30-39": 34.5,
+        "40-49": 44.5,
+        "50-59": 54.5,
+        "60-69": 64.5,
+        "70+": 75.0,
+    }
+    @classmethod
+    def _weighted_age_estimate(cls, predictions: list[dict]) -> float:
+        """Softmax-weighted expected age across all FairFace buckets.
+        The FairFace age model is a 9-bucket classifier; the argmax
+        always snaps to one of nine fixed midpoints (24.5 for 20-29,
+        etc.). Treating its softmax as a probability distribution and
+        taking the expected value gives a continuous number that moves
+        with confidence (about 24.5 for someone very confidently 20-29,
+        28.4 if roughly 40% of the mass leaks into 30-39). Still bounded
+        by the lowest and highest midpoints (1.0 and 75.0); true
+        per-year accuracy would need a regression model.
+        """
+        total_weight = 0.0
+        weighted_sum = 0.0
+        for pred in predictions:
+            label = cls._normalize_age_label(pred["label"])
+            midpoint = cls._AGE_MIDPOINTS.get(label)
+            if midpoint is None:
+                continue
+            score = float(pred["score"])
+            weighted_sum += midpoint * score
+            total_weight += score
+        if total_weight == 0:
+            return 0.0
+        return round(weighted_sum / total_weight, 1)
     @classmethod
     def _distribution_map(cls, predictions, normalizer, all_labels):
+        """Flatten HF predictions into {canonical_label: score} dict.
+        Unseen labels stay at 0.0 so the shape is always all_labels-sized.
+        """
         distribution = {label: 0.0 for label in all_labels}
         for prediction in predictions:
             normalized_label = normalizer(prediction["label"])
     @staticmethod
     def _safe_predict(classifier, image, top_k: int):
+        """Wrap classifier(...) so a single model failure can't bring
+        down the whole demographic block."""
         if classifier is None:
             return []
         try:
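
As a quick check of the expected-value age estimate described in `_weighted_age_estimate`, the sketch below reimplements the same arithmetic as a free function and feeds it hand-written predictions. The midpoints are copied from the diff; the function name `weighted_age_estimate` and the sample score lists are made up for illustration, and label normalisation is left out.

```python
AGE_MIDPOINTS = {
    "0-2": 1.0, "3-9": 6.0, "10-19": 14.5, "20-29": 24.5, "30-39": 34.5,
    "40-49": 44.5, "50-59": 54.5, "60-69": 64.5, "70+": 75.0,
}


def weighted_age_estimate(predictions):
    """Expected age: sum(midpoint * score) / sum(score), rounded to 0.1."""
    total = 0.0
    weighted = 0.0
    for pred in predictions:
        midpoint = AGE_MIDPOINTS.get(pred["label"])
        if midpoint is None:  # unknown label: ignore it
            continue
        score = float(pred["score"])
        weighted += midpoint * score
        total += score
    return round(weighted / total, 1) if total else 0.0


# All mass on one bucket: the estimate is that bucket's midpoint.
confident = [{"label": "20-29", "score": 1.0}]
# Some mass leaking into the next bucket pulls the estimate upward.
split = [
    {"label": "20-29", "score": 0.61},
    {"label": "30-39", "score": 0.39},
]

confident_age = weighted_age_estimate(confident)
split_age = weighted_age_estimate(split)
```

The estimate can never leave the range spanned by the midpoints (1.0 to 75.0), which is why the docstring calls for a regression model if per-year accuracy matters.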

analyzers/emotion_analyzer.py CHANGED

@@ -1,17 +1,38 @@
 """
-HSEmotion — EfficientNet-B0 fine-tuned for 8-class emotion recognition.
-Uses the published HSEmotion checkpoint (Savchenko et al., enet_b0_8_best_afew),
-which has actual fine-tuned weights for the 8 emotion classes. The previous
-version asked timm for a 1000-class ImageNet checkpoint and reset the head to
-8 randomly-initialized neurons, so the outputs were softmax-over-noise.
-Classes: Anger, Contempt, Disgust, Fear, Happiness, Neutral, Sadness, Surprise.
-Also provides valence (positive/negative) and arousal (calm/excited) scores
-derived from the emotion distribution.
-Install: pip install hsemotion
 """
 from contextlib import contextmanager
@@ -33,7 +54,8 @@ EMOTION_LABELS = [
     "happiness", "neutral", "sadness", "surprise",
 ]
-# Valence weights for each emotion (-1 to +1)
 VALENCE_MAP = {
     "anger": -0.6,
     "contempt": -0.3,
@@ -45,7 +67,7 @@ VALENCE_MAP = {
     "surprise": 0.3,
 }
-# Arousal weights for each emotion (0 to 1)
 AROUSAL_MAP = {
     "anger": 0.8,
     "contempt": 0.3,
@@ -62,15 +84,11 @@ HSEMOTION_MODEL_NAME = "enet_b0_8_best_afew"
 @contextmanager
 def _legacy_torch_load():
-    """Temporarily make torch.load default to weights_only=False.
-    PyTorch 2.6 changed the default to weights_only=True. The HSEmotion
-    checkpoint is pickled as a full timm.models.efficientnet.EfficientNet
-    object (not a clean state dict), so the safe unpickler refuses to
-    deserialize it. We trust this checkpoint (it comes from the published
-    HSEmotion repo and was already vetted by the pip install), so we opt
-    back into legacy loading — scoped to just the HSEmotion init so the
-    rest of the process keeps the safer default.
     """
     original_load = torch.load
@@ -91,6 +109,9 @@ class EmotionAnalyzer:
         self.recognizer = self._load_model()
     def _load_model(self):
         if not HAS_HSEMOTION:
             print(
                 "[EmotionAnalyzer] hsemotion not installed — emotion outputs "
@@ -114,12 +135,16 @@ class EmotionAnalyzer:
         try:
             # logits=False → returns post-softmax probabilities.
-            # HSEmotionRecognizer handles its own resize/normalize/preproc.
             _, scores = self.recognizer.predict_emotions(img_rgb, logits=False)
         except Exception as exc:
             print(f"[EmotionAnalyzer] Inference failed: {exc}")
             return self._empty_result()
         probs = np.asarray(scores, dtype=float).flatten()
         if probs.size != len(EMOTION_LABELS):
             print(
@@ -129,26 +154,31 @@ class EmotionAnalyzer:
             )
             return self._empty_result()
-        # Defensive renormalization. With logits=False this is a no-op, but
-        # guards against future API drift in the hsemotion package.
         total = probs.sum()
         if total > 0:
             probs = probs / total
         emotion_scores = {
             label: round(float(probs[i]), 3)
             for i, label in enumerate(EMOTION_LABELS)
         }
         primary_idx = int(np.argmax(probs))
         primary_emotion = EMOTION_LABELS[primary_idx]
         primary_confidence = float(probs[primary_idx])
-        # Secondary emotion (second highest)
         sorted_idx = np.argsort(probs)[::-1]
         secondary_emotion = EMOTION_LABELS[int(sorted_idx[1])]
-        # Calculate valence and arousal
         valence = sum(
             probs[i] * VALENCE_MAP[label]
             for i, label in enumerate(EMOTION_LABELS)
@@ -174,6 +204,7 @@ class EmotionAnalyzer:
     @staticmethod
     def _empty_result() -> dict[str, Any]:
         return {
             "primary_emotion": "unknown",
             "emotion_confidence": 0.0,
@@ -182,4 +213,4 @@ class EmotionAnalyzer:
             "valence": 0.0,
             "arousal": 0.0,
             "mood": "unknown",
-        }

 """
+EmotionAnalyzer — HSEmotion 8-class facial emotion recognition.
+Model
+-----
+- Architecture : EfficientNet-B0
+- Checkpoint   : enet_b0_8_best_afew (Savchenko et al.)
+                 published by the hsemotion PyPI package
+- Classes (8)  : anger, contempt, disgust, fear, happiness,
+                 neutral, sadness, surprise
+- License      : Apache 2.0 (hsemotion package)
+- Source       : https://github.com/HSE-asavchenko/face-emotion-recognition
+Inputs
+------
+img_rgb : np.ndarray (H, W, 3) uint8. HSEmotionRecognizer handles its
+own resize/normalise internally.
+Outputs (dict)
+--------------
+primary_emotion, emotion_confidence, secondary_emotion,
+emotion_scores (full distribution), valence (-1..+1), arousal (0..1),
+mood (positive | negative | neutral).
+Notes
+-----
+Valence and arousal are derived from the emotion distribution using
+hand-set per-emotion weights (VALENCE_MAP / AROUSAL_MAP) — they are
+weighted sums, not separate model outputs.
+PyTorch 2.6 changed torch.load to weights_only=True by default. The
+HSEmotion checkpoint is pickled as a full timm EfficientNet object
+(not a clean state dict), so the safe unpickler refuses to load it.
+We scope a legacy weights_only=False just around the HSEmotion init
+to keep the rest of the process on the safer default.
 """
 from contextlib import contextmanager
     "happiness", "neutral", "sadness", "surprise",
 ]
+# Per-emotion valence weights. Used to project the 8-class distribution
+# down to a single scalar in [-1, 1] (negative = sad/angry, positive = happy).
 VALENCE_MAP = {
     "anger": -0.6,
     "contempt": -0.3,
     "surprise": 0.3,
 }
+# Per-emotion arousal weights, scalar in [0, 1] (0 = calm, 1 = intense).
 AROUSAL_MAP = {
     "anger": 0.8,
     "contempt": 0.3,
 @contextmanager
 def _legacy_torch_load():
+    """Temporarily switch torch.load back to weights_only=False.
+    Scoped via a context manager so only the HSEmotion init runs with
+    the legacy default; everything else keeps PyTorch 2.6's safer
+    weights_only=True behaviour.
     """
     original_load = torch.load
         self.recognizer = self._load_model()
     def _load_model(self):
+        # Without the hsemotion package installed there's no model to
+        # load. We log once and the rest of the service still works —
+        # the emotion fields just stay "unknown".
         if not HAS_HSEMOTION:
             print(
                 "[EmotionAnalyzer] hsemotion not installed — emotion outputs "
         try:
             # logits=False → returns post-softmax probabilities.
+            # The recognizer handles its own resize/normalize/preproc,
+            # so we hand it the raw RGB ndarray.
             _, scores = self.recognizer.predict_emotions(img_rgb, logits=False)
         except Exception as exc:
             print(f"[EmotionAnalyzer] Inference failed: {exc}")
             return self._empty_result()
+        # Flatten to a 1D numpy array and sanity-check its length matches
+        # the class list. Mismatch likely means the upstream package
+        # changed its class count.
         probs = np.asarray(scores, dtype=float).flatten()
         if probs.size != len(EMOTION_LABELS):
             print(
             )
             return self._empty_result()
+        # Defensive renormalisation. With logits=False this is a no-op,
+        # but it guards against future API drift in the hsemotion package.
         total = probs.sum()
         if total > 0:
             probs = probs / total
+        # Build the {emotion: probability} dict for downstream display.
         emotion_scores = {
             label: round(float(probs[i]), 3)
             for i, label in enumerate(EMOTION_LABELS)
         }
+        # Primary = argmax of the distribution; secondary = second-highest.
+        # These are the two most-likely emotions, useful when the model
+        # is genuinely uncertain between two similar classes.
         primary_idx = int(np.argmax(probs))
         primary_emotion = EMOTION_LABELS[primary_idx]
         primary_confidence = float(probs[primary_idx])
         sorted_idx = np.argsort(probs)[::-1]
         secondary_emotion = EMOTION_LABELS[int(sorted_idx[1])]
+        # Valence and arousal: weighted sums over the distribution. A
+        # confidently-happy face gives valence ~0.9; a fearful one drops
+        # into negative territory with high arousal.
         valence = sum(
             probs[i] * VALENCE_MAP[label]
             for i, label in enumerate(EMOTION_LABELS)
     @staticmethod
     def _empty_result() -> dict[str, Any]:
+        """Stub used when HSEmotion isn't available or inference fails."""
         return {
             "primary_emotion": "unknown",
             "emotion_confidence": 0.0,
             "valence": 0.0,
             "arousal": 0.0,
             "mood": "unknown",
+        }

analyzers/hair_type_analyzer.py ADDED

	@@ -0,0 +1,87 @@

+"""
+HairTypeAnalyzer — hair texture classifier.
+Model
+-----
+- Architecture : Vision Transformer (ViT-B/16)
+- HF repo      : dima806/hair_type_image_detection
+- License      : Apache 2.0
+- Classes (5)  : curly, dreadlocks, kinky, straight, wavy
+- Reported acc : 93% overall.
+                 Per-class F1: dreadlocks 0.978, kinky 0.949,
+                 straight 0.927, curly 0.902, wavy 0.884.
+Inputs
+------
+img_rgb : np.ndarray (H, W, 3) uint8
+Outputs (dict)
+--------------
+hair_type            — argmax label
+hair_type_confidence — argmax softmax score
+hair_type_scores     — full {class: score} dict
+Notes
+-----
+This is the authoritative hair-texture output. The Laplacian-std-
+based `hair_texture` field from ColorAnalyzer is a coarse fallback
+that runs even when this model is unavailable.
+"""
+from typing import Any
+from PIL import Image
+from transformers import pipeline
+MODEL_ID = "dima806/hair_type_image_detection"
+# Canonical class names in lowercase. Pipeline output is normalised
+# to these on the way out. A tuple (not a set) keeps dict ordering and
+# tie-breaking deterministic across runs.
+_KNOWN = ("curly", "dreadlocks", "kinky", "straight", "wavy")
+class HairTypeAnalyzer:
+    def __init__(self):
+        self.classifier = None
+        try:
+            self.classifier = pipeline("image-classification", model=MODEL_ID)
+        except Exception as exc:
+            print(f"[HairTypeAnalyzer] Failed to load {MODEL_ID}: {exc}")
+    def analyze(self, img_rgb) -> dict[str, Any]:
+        if self.classifier is None:
+            return self._empty_result()
+        try:
+            pil = Image.fromarray(img_rgb)
+            # Pull all five class probabilities so downstream code can
+            # inspect the full distribution (e.g. wavy-vs-curly margin).
+            preds = self.classifier(pil, top_k=len(_KNOWN))
+        except Exception as exc:
+            print(f"[HairTypeAnalyzer] Prediction failed: {exc}")
+            return self._empty_result()
+        # Normalise label casing and build the score map.
+        scores = {label: 0.0 for label in _KNOWN}
+        for pred in preds:
+            label = str(pred["label"]).strip().lower()
+            if label in scores:
+                scores[label] = round(float(pred["score"]), 3)
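+        top_label = max(scores, key=scores.get)
+        top_score = scores[top_label]
+        # No recognised label came back: report "unknown" rather than an
+        # arbitrary class with zero confidence.
+        if top_score == 0.0:
+            return self._empty_result()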
+        return {
+            "hair_type": top_label,
+            "hair_type_confidence": top_score,
+            "hair_type_scores": scores,
+        }
+    @staticmethod
+    def _empty_result() -> dict[str, Any]:
+        return {
+            "hair_type": "unknown",
+            "hair_type_confidence": 0.0,
+            "hair_type_scores": {label: 0.0 for label in _KNOWN},
+        }
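
To see what the score map looks like, and how a caller might read the wavy-vs-curly margin mentioned in `analyze`, here is a small sketch that runs the same label normalisation over hand-written pipeline-style output. The `summarise` helper and the `fake_preds` values are hypothetical; no model is loaded.

```python
KNOWN = ("curly", "dreadlocks", "kinky", "straight", "wavy")


def summarise(preds):
    """Normalise label casing, build the score map, and pick the top class."""
    scores = {label: 0.0 for label in KNOWN}
    for pred in preds:
        label = str(pred["label"]).strip().lower()
        if label in scores:  # labels outside KNOWN are dropped
            scores[label] = round(float(pred["score"]), 3)
    top_label = max(scores, key=scores.get)
    return {
        "hair_type": top_label,
        "hair_type_confidence": scores[top_label],
        "hair_type_scores": scores,
    }


# Shaped like the list of dicts a HF image-classification pipeline returns.
fake_preds = [
    {"label": "Wavy", "score": 0.5512},
    {"label": " curly ", "score": 0.4021},
    {"label": "afro", "score": 0.0467},  # not a known class
]

result = summarise(fake_preds)
margin = round(
    result["hair_type_scores"]["wavy"] - result["hair_type_scores"]["curly"], 3
)
```

A small margin like this one suggests the top label should be treated as a soft call, which is the reason the analyzer returns the full distribution and not just the argmax.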

analyzers/landmark_analyzer.py CHANGED

@@ -1,20 +1,43 @@
 """
-MediaPipe Face Landmarker — 478 3D landmarks + 52 blendshapes.
-Derives geometric facial features from landmark positions using pure math.
-This is the backbone of the system. From 478 3D points placed on the face,
-we calculate distances, ratios, and angles to determine:
-- Face shape (oval, round, square, heart, diamond, oblong, triangle)
-- Jawline type (sharp, soft, strong)
-- Chin type (pointed, wide, normal)
-- Cheekbone prominence
-- Forehead width
-- Eye shape, spacing, size, depth
-- Eyebrow shape, arch, thickness
-- Nose shape, bridge height, nostril width, tip shape
-- Lip fullness, mouth width, cupid's bow
-- Smile detection, asymmetry, dimples
-- Overall facial asymmetry score
 """
 import math
@@ -27,6 +50,7 @@ import numpy as np
 from mediapipe.tasks import python as mp_python
 from mediapipe.tasks.python import vision
 MODEL_URL = (
     "https://storage.googleapis.com/mediapipe-models/"
     "face_landmarker/face_landmarker/float16/latest/face_landmarker.task"
@@ -36,6 +60,9 @@ MODEL_PATH = "models/face_landmarker.task"
 class LandmarkAnalyzer:
     def __init__(self):
         base_options = mp_python.BaseOptions(
             model_asset_path=self._ensure_model()
         )
@@ -49,7 +76,7 @@ class LandmarkAnalyzer:
     @staticmethod
     def _ensure_model() -> str:
-        """Download the MediaPipe model if not already cached."""
         if not os.path.exists(MODEL_PATH):
             os.makedirs("models", exist_ok=True)
             urllib.request.urlretrieve(MODEL_URL, MODEL_PATH)
@@ -60,34 +87,48 @@ class LandmarkAnalyzer:
     # ------------------------------------------------------------------
     def analyze(self, img_rgb: np.ndarray) -> dict[str, Any]:
         mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=img_rgb)
         result = self.detector.detect(mp_image)
         if not result.face_landmarks:
             return {"error": "No face detected by MediaPipe"}
         landmarks = result.face_landmarks[0]
         lm = [{"x": l.x, "y": l.y, "z": l.z} for l in landmarks]
         blendshapes: dict[str, float] = {}
         if result.face_blendshapes:
             for bs in result.face_blendshapes[0]:
                 blendshapes[bs.category_name] = round(bs.score, 4)
         attrs: dict[str, Any] = {"_raw_landmarks": lm}
         # ── Face Shape ────────────────────────────────────────────────
-        face_height = self._dist(lm[10], lm[152])
-        face_width = self._dist(lm[234], lm[454])
-        jaw_width = self._dist(lm[172], lm[397])
-        cheekbone_width = self._dist(lm[93], lm[323])
-        forehead_width = self._dist(lm[54], lm[284])
         wh_ratio = face_width / face_height if face_height else 1
         jaw_to_face = jaw_width / face_width if face_width else 1
         forehead_to_jaw = forehead_width / jaw_width if jaw_width else 1
         cheek_to_jaw = cheekbone_width / jaw_width if jaw_width else 1
         if wh_ratio > 0.85 and jaw_to_face > 0.75:
             attrs["face_shape"] = "round"
         elif wh_ratio > 0.8 and jaw_to_face > 0.8 and forehead_to_jaw < 1.1:
@@ -110,13 +151,16 @@ class LandmarkAnalyzer:
             "cheekbone_to_jaw_ratio": round(cheek_to_jaw, 3),
         }
-        # ── Forehead ─────────────────────────────────────────────────
         fh_ratio = forehead_width / face_width if face_width else 0.6
         attrs["forehead_width"] = (
             "broad" if fh_ratio > 0.7 else "narrow" if fh_ratio < 0.55 else "average"
         )
         # ── Jawline ──────────────────────────────────────────────────
         jaw_angle = self._jaw_angle(lm)
         attrs["jawline_angle"] = round(jaw_angle, 1)
         if jaw_angle < 110:
@@ -129,6 +173,7 @@ class LandmarkAnalyzer:
             attrs["jawline_type"] = "soft"
         # ── Chin ─────────────────────────────────────────────────────
         chin_width = self._dist(lm[175], lm[396])
         chin_ratio = chin_width / jaw_width if jaw_width else 0.4
         attrs["chin_type"] = (
@@ -138,12 +183,16 @@ class LandmarkAnalyzer:
         )
         # ── Cheekbones ───────────────────────────────────────────────
         cheek_z = (lm[93]["z"] + lm[323]["z"]) / 2
         attrs["cheekbone_prominence"] = (
             "high" if cheek_z < -0.04
             else "flat" if cheek_z > 0.0
             else "moderate"
         )
         cheek_puff = blendshapes.get("cheekPuff", 0)
         if cheek_puff > 0.3:
             attrs["cheek_fullness"] = "full"
@@ -153,12 +202,16 @@ class LandmarkAnalyzer:
             attrs["cheek_fullness"] = "normal"
         # ── Eyes ─────────────────────────────────────────────────────
         l_top, l_bot = lm[159], lm[145]
         l_inner, l_outer = lm[133], lm[33]
         eye_open = self._dist(l_top, l_bot)
         eye_w = self._dist(l_inner, l_outer)
         eye_ratio = eye_open / eye_w if eye_w else 0.3
         outer_angle = l_outer["y"] - l_inner["y"]
         if outer_angle < -0.012:
             attrs["eye_shape"] = "upturned"
@@ -171,7 +224,7 @@ class LandmarkAnalyzer:
         else:
             attrs["eye_shape"] = "almond"
-        # Deep-set vs protruding
         eye_z = (lm[159]["z"] + lm[145]["z"]) / 2
         nose_bridge_z = lm[6]["z"]
         if eye_z > nose_bridge_z + 0.02:
@@ -181,7 +234,8 @@ class LandmarkAnalyzer:
         else:
             attrs["eye_depth"] = "normal"
-        # Eye spacing
         if len(lm) > 473:
             inter_pupillary = self._dist(lm[468], lm[473])
         else:
@@ -193,7 +247,8 @@ class LandmarkAnalyzer:
             else "average"
         )
-        # Eye size
         r_top, r_bot = lm[386], lm[374]
         r_inner, r_outer = lm[362], lm[263]
         r_area = self._dist(r_top, r_bot) * self._dist(r_inner, r_outer)
@@ -207,6 +262,8 @@ class LandmarkAnalyzer:
             else "average"
         )
         blink_l = blendshapes.get("eyeBlinkLeft", 0)
         blink_r = blendshapes.get("eyeBlinkRight", 0)
         attrs["eyes_open"] = (blink_l + blink_r) / 2 < 0.5
@@ -215,6 +272,8 @@ class LandmarkAnalyzer:
         brow_mid = lm[105]
         brow_outer = lm[46]
         brow_inner = lm[70]
         brow_to_eye = self._dist(brow_mid, lm[159])
         brow_arch_ratio = brow_to_eye / eye_open if eye_open else 1.5
@@ -224,6 +283,8 @@ class LandmarkAnalyzer:
             else "average"
         )
         mid_y = brow_mid["y"]
         avg_end_y = (brow_inner["y"] + brow_outer["y"]) / 2
         curvature = mid_y - avg_end_y
@@ -234,6 +295,7 @@ class LandmarkAnalyzer:
         else:
             attrs["eyebrow_shape"] = "flat"
         brow_top = lm[66]
         brow_bottom = lm[105]
         brow_thickness = self._dist(brow_top, brow_bottom)
@@ -243,6 +305,7 @@ class LandmarkAnalyzer:
             else "medium"
         )
         inner_brow_dist = self._dist(lm[70], lm[300])
         attrs["possible_unibrow"] = inner_brow_dist < 0.04
@@ -261,6 +324,8 @@ class LandmarkAnalyzer:
             else "average"
         )
         tip_angle = nose_tip["y"] - nose_bottom["y"]
         if tip_angle < -0.005:
             attrs["nose_shape"] = "upturned"
@@ -273,16 +338,21 @@ class LandmarkAnalyzer:
         else:
             attrs["nose_shape"] = "straight"
         attrs["nose_bridge"] = (
             "high" if nose_bridge_top["z"] < -0.05
             else "flat" if nose_bridge_top["z"] > 0.0
             else "average"
         )
         attrs["nose_tip_shape"] = (
             "pointed" if nose_tip["z"] < nose_bottom["z"] - 0.01 else "rounded"
         )
         # ── Lips & Mouth ─────────────────────────────────────────────
         ul_top, ul_bot = lm[0], lm[13]
         ll_top, ll_bot = lm[14], lm[17]
         m_left, m_right = lm[61], lm[291]
@@ -298,6 +368,7 @@ class LandmarkAnalyzer:
             else "thin" if lip_ratio < 0.22
             else "average"
         )
         attrs["lip_balance"] = (
             "top-heavy" if ul_h > ll_h * 1.2
             else "bottom-heavy" if ll_h > ul_h * 1.2
@@ -311,7 +382,8 @@ class LandmarkAnalyzer:
             else "average"
         )
-        # Cupid's bow
         c_left, c_center, c_right = lm[37], lm[0], lm[267]
         bow = c_center["y"] - (c_left["y"] + c_right["y"]) / 2
         attrs["cupids_bow"] = (
@@ -320,7 +392,9 @@ class LandmarkAnalyzer:
             else "flat"
         )
-        # Smile
         smile_l = blendshapes.get("mouthSmileLeft", 0)
         smile_r = blendshapes.get("mouthSmileRight", 0)
         attrs["smiling"] = (smile_l + smile_r) / 2 > 0.4
@@ -330,6 +404,9 @@ class LandmarkAnalyzer:
         )
         # ── Facial Asymmetry ─────────────────────────────────────────
         pairs = [
             (33, 263), (133, 362), (70, 300), (93, 323), (172, 397),
             (61, 291), (159, 386), (145, 374), (46, 276),
@@ -341,6 +418,8 @@ class LandmarkAnalyzer:
             min(asym / len(pairs) / 0.05, 1.0), 3
         )
         attrs["blendshapes"] = blendshapes
         return attrs
@@ -350,6 +429,7 @@ class LandmarkAnalyzer:
     @staticmethod
     def _dist(a: dict, b: dict) -> float:
         return math.sqrt(
             (a["x"] - b["x"]) ** 2
             + (a["y"] - b["y"]) ** 2
@@ -358,6 +438,11 @@ class LandmarkAnalyzer:
     @staticmethod
     def _jaw_angle(lm: list[dict]) -> float:
         chin = lm[152]
         left_jaw, right_jaw = lm[172], lm[397]
         v1 = (left_jaw["x"] - chin["x"], left_jaw["y"] - chin["y"])

 """
+LandmarkAnalyzer — MediaPipe Face Landmarker geometric feature extractor.
+Model
+-----
+- Architecture : MediaPipe Face Landmarker (TF Lite, Google)
+- Weights      : face_landmarker.task (float16, auto-downloaded, ~4 MB)
+- Outputs      : 478 normalised 3D landmarks + 52 ARKit-compatible blendshapes
+- License      : Apache 2.0
+Inputs
+------
+img_rgb : np.ndarray (H, W, 3) uint8, RGB order.
+Outputs (dict)
+--------------
+Most fields are categorical strings derived from landmark distances,
+ratios and angles. A few come straight from blendshape activations.
+Face shape / structure :
+    face_shape, face_shape_metrics, forehead_width,
+    jawline_angle, jawline_type, chin_type,
+    cheekbone_prominence, cheek_fullness, facial_asymmetry_score
+Eyes :
+    eye_shape, eye_depth, eye_spacing, eye_size, eyes_open
+Eyebrows :
+    eyebrow_arch_height, eyebrow_shape, eyebrow_thickness, possible_unibrow
+Nose :
+    nose_shape, nose_bridge, nose_tip_shape, nostril_width
+Lips & mouth :
+    lip_fullness, lip_balance, mouth_width, cupids_bow,
+    smiling, smile_asymmetry, possible_dimples
+Raw payloads (used downstream, stripped before JSON) :
+    _raw_landmarks, blendshapes
+Notes
+-----
+All thresholds were hand-tuned against representative photos.
+They are conservative: when a ratio sits near a boundary the analyzer
+prefers "average" / "normal" over committing to an extreme bucket.
 """
 import math
 from mediapipe.tasks import python as mp_python
 from mediapipe.tasks.python import vision
+# Float16 MediaPipe weight file. ~4 MB, auto-fetched once and cached.
 MODEL_URL = (
     "https://storage.googleapis.com/mediapipe-models/"
     "face_landmarker/face_landmarker/float16/latest/face_landmarker.task"
 class LandmarkAnalyzer:
     def __init__(self):
+        # Configure the detector to emit both blendshapes and the 4x4
+        # facial transformation matrix; the latter is unused for now but
+        # cheap to compute and useful if we ever need head pose.
         base_options = mp_python.BaseOptions(
             model_asset_path=self._ensure_model()
         )
     @staticmethod
     def _ensure_model() -> str:
+        """Cache the MediaPipe weight file on disk on first run."""
         if not os.path.exists(MODEL_PATH):
             os.makedirs("models", exist_ok=True)
             urllib.request.urlretrieve(MODEL_URL, MODEL_PATH)
     # ------------------------------------------------------------------
     def analyze(self, img_rgb: np.ndarray) -> dict[str, Any]:
+        # Wrap the numpy array as a MediaPipe Image and run detection.
+        # If no face is found, downstream analyzers will see no landmarks
+        # and gracefully degrade to "unknown" fields.
         mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=img_rgb)
         result = self.detector.detect(mp_image)
         if not result.face_landmarks:
             return {"error": "No face detected by MediaPipe"}
+        # MediaPipe returns landmarks as NormalizedLandmark objects; convert
+        # to plain dicts so downstream code (and JSON serialisation) is simpler.
         landmarks = result.face_landmarks[0]
         lm = [{"x": l.x, "y": l.y, "z": l.z} for l in landmarks]
+        # Build the {blendshape_name: score} dict. ARKit-compatible names
+        # like mouthSmileLeft, eyeBlinkRight, jawOpen.
         blendshapes: dict[str, float] = {}
         if result.face_blendshapes:
             for bs in result.face_blendshapes[0]:
                 blendshapes[bs.category_name] = round(bs.score, 4)
+        # _raw_landmarks is consumed by ColorAnalyzer (iris + lip masks).
+        # The leading underscore tells app.py to strip it before JSON.
         attrs: dict[str, Any] = {"_raw_landmarks": lm}
         # ── Face Shape ────────────────────────────────────────────────
+        # Four ratios that, taken together, distinguish the seven canonical
+        # face shapes. All distances are in normalised image coordinates
+        # so the ratios are resolution-independent.
+        face_height = self._dist(lm[10], lm[152])      # forehead top → chin bottom
+        face_width = self._dist(lm[234], lm[454])      # left zygion → right zygion
+        jaw_width = self._dist(lm[172], lm[397])       # left gonion → right gonion
+        cheekbone_width = self._dist(lm[93], lm[323])  # left zygomatic → right
+        forehead_width = self._dist(lm[54], lm[284])   # left frontal → right frontal
         wh_ratio = face_width / face_height if face_height else 1
         jaw_to_face = jaw_width / face_width if face_width else 1
         forehead_to_jaw = forehead_width / jaw_width if jaw_width else 1
         cheek_to_jaw = cheekbone_width / jaw_width if jaw_width else 1
+        # Cascade ordered by specificity: a face that matches multiple
+        # categories is bucketed by the first rule that matches.
         if wh_ratio > 0.85 and jaw_to_face > 0.75:
             attrs["face_shape"] = "round"
         elif wh_ratio > 0.8 and jaw_to_face > 0.8 and forehead_to_jaw < 1.1:
             "cheekbone_to_jaw_ratio": round(cheek_to_jaw, 3),
         }
+        # ── Forehead width (broad / average / narrow) ────────────────
+        # Forehead width relative to overall face width.
         fh_ratio = forehead_width / face_width if face_width else 0.6
         attrs["forehead_width"] = (
             "broad" if fh_ratio > 0.7 else "narrow" if fh_ratio < 0.55 else "average"
         )
         # ── Jawline ──────────────────────────────────────────────────
+        # Angle subtended at the chin point by the two gonion landmarks.
+        # Smaller angle = sharper jawline; larger = softer.
         jaw_angle = self._jaw_angle(lm)
         attrs["jawline_angle"] = round(jaw_angle, 1)
         if jaw_angle < 110:
             attrs["jawline_type"] = "soft"
         # ── Chin ─────────────────────────────────────────────────────
+        # Chin width vs jaw width: narrower chin → pointier appearance.
         chin_width = self._dist(lm[175], lm[396])
         chin_ratio = chin_width / jaw_width if jaw_width else 0.4
         attrs["chin_type"] = (
         )
         # ── Cheekbones ───────────────────────────────────────────────
+        # Z (depth) is signed: negative values are closer to the camera.
+        # Prominent cheekbones project forward → more negative cheek_z.
         cheek_z = (lm[93]["z"] + lm[323]["z"]) / 2
         attrs["cheekbone_prominence"] = (
             "high" if cheek_z < -0.04
             else "flat" if cheek_z > 0.0
             else "moderate"
         )
+        # cheekPuff blendshape catches actively puffed-out cheeks; a flat
+        # cheek_z signals a hollow look in the absence of puff.
         cheek_puff = blendshapes.get("cheekPuff", 0)
         if cheek_puff > 0.3:
             attrs["cheek_fullness"] = "full"
             attrs["cheek_fullness"] = "normal"
         # ── Eyes ─────────────────────────────────────────────────────
+        # Left-eye landmarks. eye_open is vertical lid distance,
+        # eye_w is the inner→outer corner distance.
         l_top, l_bot = lm[159], lm[145]
         l_inner, l_outer = lm[133], lm[33]
         eye_open = self._dist(l_top, l_bot)
         eye_w = self._dist(l_inner, l_outer)
         eye_ratio = eye_open / eye_w if eye_w else 0.3
+        # Outer-corner Y relative to inner corner classifies tilt.
+        # Hooded vs round vs almond come from the openness ratio.
         outer_angle = l_outer["y"] - l_inner["y"]
         if outer_angle < -0.012:
             attrs["eye_shape"] = "upturned"
         else:
             attrs["eye_shape"] = "almond"
+        # Deep-set vs protruding: compare eye-region z vs nose-bridge z.
         eye_z = (lm[159]["z"] + lm[145]["z"]) / 2
         nose_bridge_z = lm[6]["z"]
         if eye_z > nose_bridge_z + 0.02:
         else:
             attrs["eye_depth"] = "normal"
+        # Eye spacing: prefer pupil-to-pupil if iris landmarks (468/473)
+        # are present, otherwise fall back to inner-corner distance.
         if len(lm) > 473:
             inter_pupillary = self._dist(lm[468], lm[473])
         else:
             else "average"
         )
+        # Eye size: avg of left & right eye-region bounding-box area,
+        # relative to overall face area.
         r_top, r_bot = lm[386], lm[374]
         r_inner, r_outer = lm[362], lm[263]
         r_area = self._dist(r_top, r_bot) * self._dist(r_inner, r_outer)
             else "average"
         )
+        # eyeBlink blendshapes flip to ~1.0 when the eye is closed.
+        # eyes_open = True iff average blink activation is < 0.5.
         blink_l = blendshapes.get("eyeBlinkLeft", 0)
         blink_r = blendshapes.get("eyeBlinkRight", 0)
         attrs["eyes_open"] = (blink_l + blink_r) / 2 < 0.5
         brow_mid = lm[105]
         brow_outer = lm[46]
         brow_inner = lm[70]
+        # Brow-mid to upper-eyelid distance (3D, via _dist) is roughly
+        # proportional to perceived "arch height" relative to eye size.
         brow_to_eye = self._dist(brow_mid, lm[159])
         brow_arch_ratio = brow_to_eye / eye_open if eye_open else 1.5
             else "average"
         )
+        # Curvature = mid Y vs avg of inner+outer Ys. Negative curvature
+        # (mid sits higher than the ends) → arched; near-zero → straight.
         mid_y = brow_mid["y"]
         avg_end_y = (brow_inner["y"] + brow_outer["y"]) / 2
         curvature = mid_y - avg_end_y
         else:
             attrs["eyebrow_shape"] = "flat"
+        # Brow thickness from top-to-bottom landmark span.
         brow_top = lm[66]
         brow_bottom = lm[105]
         brow_thickness = self._dist(brow_top, brow_bottom)
             else "medium"
         )
+        # Inner-brow distance below 0.04 (normalised image units, not scaled
+        # by face width) suggests a unibrow.
         inner_brow_dist = self._dist(lm[70], lm[300])
         attrs["possible_unibrow"] = inner_brow_dist < 0.04
             else "average"
         )
+        # Tip vertical offset relative to nose base distinguishes
+        # upturned (tip sits higher) from aquiline (tip droops down).
         tip_angle = nose_tip["y"] - nose_bottom["y"]
         if tip_angle < -0.005:
             attrs["nose_shape"] = "upturned"
         else:
             attrs["nose_shape"] = "straight"
+        # Bridge: high bridges project toward camera (more negative z).
         attrs["nose_bridge"] = (
             "high" if nose_bridge_top["z"] < -0.05
             else "flat" if nose_bridge_top["z"] > 0.0
             else "average"
         )
+        # Pointed tip: tip projects forward of nostril base.
         attrs["nose_tip_shape"] = (
             "pointed" if nose_tip["z"] < nose_bottom["z"] - 0.01 else "rounded"
         )
         # ── Lips & Mouth ─────────────────────────────────────────────
+        # Top and bottom of upper lip, top and bottom of lower lip, plus
+        # the mouth corners. lip_ratio compares stacked lip height to
+        # mouth width — full vs thin lips.
         ul_top, ul_bot = lm[0], lm[13]
         ll_top, ll_bot = lm[14], lm[17]
         m_left, m_right = lm[61], lm[291]
             else "thin" if lip_ratio < 0.22
             else "average"
         )
+        # Balance compares upper-lip thickness to lower-lip thickness.
         attrs["lip_balance"] = (
             "top-heavy" if ul_h > ll_h * 1.2
             else "bottom-heavy" if ll_h > ul_h * 1.2
             else "average"
         )
+        # Cupid's bow: depression at the centre of the upper lip relative
+        # to the two peak landmarks on either side.
         c_left, c_center, c_right = lm[37], lm[0], lm[267]
         bow = c_center["y"] - (c_left["y"] + c_right["y"]) / 2
         attrs["cupids_bow"] = (
             else "flat"
         )
+        # Smiling and dimples come directly from blendshape activations.
+        # smile_asymmetry is the absolute difference between left/right
+        # mouthSmile scores — non-zero on lopsided smiles.
         smile_l = blendshapes.get("mouthSmileLeft", 0)
         smile_r = blendshapes.get("mouthSmileRight", 0)
         attrs["smiling"] = (smile_l + smile_r) / 2 > 0.4
         )
         # ── Facial Asymmetry ─────────────────────────────────────────
+        # Sum mirror-pair x-coordinate offsets from the midline (x=0.5)
+        # over 9 paired landmarks. Normalise so a perfectly symmetric
+        # face scores ~0 and visibly asymmetric ones approach 1.
         pairs = [
             (33, 263), (133, 362), (70, 300), (93, 323), (172, 397),
             (61, 291), (159, 386), (145, 374), (46, 276),
             min(asym / len(pairs) / 0.05, 1.0), 3
         )
+        # Exposed for downstream consumers (e.g. the screen reads
+        # blendshapes.jawOpen to compute mouth_open).
         attrs["blendshapes"] = blendshapes
         return attrs
     @staticmethod
     def _dist(a: dict, b: dict) -> float:
+        """Euclidean distance between two landmarks in 3D space."""
         return math.sqrt(
             (a["x"] - b["x"]) ** 2
             + (a["y"] - b["y"]) ** 2
     @staticmethod
     def _jaw_angle(lm: list[dict]) -> float:
+        """Angle (degrees) subtended at the chin by the two gonion points.
+        Operates in 2D image space — z is intentionally ignored so the
+        angle reflects what the camera sees, not the underlying anatomy.
+        """
         chin = lm[152]
         left_jaw, right_jaw = lm[172], lm[397]
         v1 = (left_jaw["x"] - chin["x"], left_jaw["y"] - chin["y"])
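
Reviewer sketch (not part of the commit): the hunk above is cut off after v1, so the rest of _jaw_angle is not visible here. A minimal 2D version consistent with the docstring (angle at the chin between the two gonion points, z ignored) could look like the following. The dot-product formulation, the function name angle_at_vertex, the 0.0 fallback for degenerate input, and the sample coordinates are assumptions, not the author's code.

```python
import math


def angle_at_vertex(vertex: dict, p1: dict, p2: dict) -> float:
    """Angle in degrees at `vertex` between the rays to p1 and p2.

    Uses only x and y, mirroring the docstring's "2D image space" note.
    """
    v1 = (p1["x"] - vertex["x"], p1["y"] - vertex["y"])
    v2 = (p2["x"] - vertex["x"], p2["y"] - vertex["y"])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate input; this fallback value is an assumption
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard acos against tiny floating-point overshoot.
    cos_t = max(-1.0, min(1.0, cos_t))
    return math.degrees(math.acos(cos_t))


# Synthetic landmarks in normalised coordinates (y grows downward).
chin = {"x": 0.5, "y": 0.75, "z": 0.0}
left_jaw = {"x": 0.25, "y": 0.5, "z": -0.02}
right_jaw = {"x": 0.75, "y": 0.5, "z": -0.02}

jaw_angle = angle_at_vertex(chin, left_jaw, right_jaw)
print(round(jaw_angle, 1))
```

With these symmetric sample points the two rays are perpendicular, so the result is close to 90 degrees, which falls below the jaw_angle < 110 branch shown in analyze(). Because z is dropped, two faces with different depth profiles but the same projected outline give the same angle.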

analyzers/obstruction_analyzer.py ADDED

	@@ -0,0 +1,108 @@

+"""
+ObstructionAnalyzer — face obstruction classifier.
+Model
+-----
+- Architecture : Vision Transformer (ViT-B/16)
+- HF repo      : dima806/face_obstruction_image_detection
+- License      : Apache 2.0
+- Classes (6)  : sunglasses, glasses, mask, hand, other, none
+- Reported acc : ~91% overall.
+                 99.7% / 99.85% precision/recall on sunglasses
+                 99.0% / 99.7%  precision/recall on glasses
+                 99.7% / 99.85% precision/recall on mask
+                 Hand and "other" are much weaker (~71-75%); we don't
+                 surface those as booleans.
+Inputs
+------
+img_rgb : np.ndarray (H, W, 3) uint8
+Outputs (dict)
+--------------
+obstruction_top        — argmax label
+obstruction_confidence — argmax softmax score
+obstruction_scores     — full {class: score} dict
+wearing_glasses        — bool (true when glasses OR sunglasses > 0.5)
+wearing_sunglasses     — bool
+wearing_mask           — bool
+Notes
+-----
+Same author as the FairFace age/gender models already in
+DemographicAnalyzer. Built specifically for the glasses/sunglasses/mask
+case, which is why precision/recall on those three classes is so high.
+"""
+from typing import Any
+from PIL import Image
+from transformers import pipeline
+MODEL_ID = "dima806/face_obstruction_image_detection"
+# Canonical labels in lowercase. The pipeline may return any casing —
+# we normalise on the way out so downstream code keys consistently.
+_KNOWN = {"sunglasses", "glasses", "mask", "hand", "other", "none"}
+class ObstructionAnalyzer:
+    def __init__(self):
+        self.classifier = None
+        try:
+            # HF image-classification pipeline. Weights are downloaded from
+            # the Hub when the pipeline is first constructed and cached locally.
+            self.classifier = pipeline("image-classification", model=MODEL_ID)
+        except Exception as exc:
+            print(f"[ObstructionAnalyzer] Failed to load {MODEL_ID}: {exc}")
+    def analyze(self, img_rgb) -> dict[str, Any]:
+        # Empty stub when the model failed to load — keeps the result
+        # dict shape stable so the merge in app.py never sees missing keys.
+        if self.classifier is None:
+            return self._empty_result()
+        try:
+            pil = Image.fromarray(img_rgb)
+            # top_k=len(_KNOWN) → full softmax across all six classes.
+            preds = self.classifier(pil, top_k=len(_KNOWN))
+        except Exception as exc:
+            print(f"[ObstructionAnalyzer] Prediction failed: {exc}")
+            return self._empty_result()
+        # Flatten predictions into a {label: score} dict, normalising
+        # label casing as we go. Unseen labels stay at 0.
+        scores = {label: 0.0 for label in _KNOWN}
+        for pred in preds:
+            label = str(pred["label"]).strip().lower()
+            if label in scores:
+                scores[label] = round(float(pred["score"]), 3)
+        # Top class wins.
+        top_label = max(scores, key=scores.get)
+        top_score = scores[top_label]
+        return {
+            "obstruction_top": top_label,
+            "obstruction_confidence": top_score,
+            "obstruction_scores": scores,
+            # Specific boolean flags the UI consumes directly.
+            # `wearing_glasses` is True for any kind of eyewear — the
+            # caller can branch on `wearing_sunglasses` if it cares
+            # about tinted vs clear lenses.
+            "wearing_glasses": scores["glasses"] > 0.5 or scores["sunglasses"] > 0.5,
+            "wearing_sunglasses": scores["sunglasses"] > 0.5,
+            "wearing_mask": scores["mask"] > 0.5,
+        }
+    @staticmethod
+    def _empty_result() -> dict[str, Any]:
+        return {
+            "obstruction_top": "unknown",
+            "obstruction_confidence": 0.0,
+            "obstruction_scores": {label: 0.0 for label in _KNOWN},
+            "wearing_glasses": False,
+            "wearing_sunglasses": False,
+            "wearing_mask": False,
+        }
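
Reviewer sketch (not part of the commit): the flag logic in analyze() can be exercised without downloading the model by feeding it a hand-written prediction list in the shape the Hugging Face image-classification pipeline returns (a list of dicts with "label" and "score"). The function name flags_from_preds and the fake scores are hypothetical; the label set and the 0.5 threshold are copied from the code above.

```python
from typing import Any

KNOWN = ("sunglasses", "glasses", "mask", "hand", "other", "none")


def flags_from_preds(preds: list[dict[str, Any]], threshold: float = 0.5) -> dict[str, Any]:
    """Flatten pipeline-style predictions into scores plus boolean flags."""
    scores = {label: 0.0 for label in KNOWN}
    for pred in preds:
        # Normalise casing and whitespace so keys match KNOWN.
        label = str(pred["label"]).strip().lower()
        if label in scores:
            scores[label] = round(float(pred["score"]), 3)
    return {
        "obstruction_scores": scores,
        # Any eyewear counts as glasses; sunglasses is the narrower flag.
        "wearing_glasses": scores["glasses"] > threshold or scores["sunglasses"] > threshold,
        "wearing_sunglasses": scores["sunglasses"] > threshold,
        "wearing_mask": scores["mask"] > threshold,
    }


# Made-up predictions; a real call would come from the pipeline.
fake_preds = [
    {"label": "Sunglasses", "score": 0.93},
    {"label": "glasses", "score": 0.04},
    {"label": "none", "score": 0.02},
    {"label": "mask", "score": 0.01},
]

flags = flags_from_preds(fake_preds)
print(flags["wearing_glasses"], flags["wearing_sunglasses"], flags["wearing_mask"])
```

One consequence worth noting: if the scores are a softmax over the six classes, as the comment in analyze() states, at most one class can exceed 0.5. A photo with both a mask and glasses would then set at most one of the flags.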

analyzers/parsing_analyzer.py CHANGED

@@ -1,27 +1,42 @@
 """
-SegFormer-B5 human parsing — replaces the old jonathandinu/face-parsing loader.
-Model: matei-dorian/segformer-b5-finetuned-human-parsing
-  - Architecture: SegFormer-B5 (nvidia/mit-b5 backbone)
-  - Published metrics on its eval set:
-      • Mean IoU:       0.6258
-      • Mean accuracy:  0.7547
-      • Overall acc.:   0.8256
-      • Face:  acc 0.9094 / IoU 0.8294
-      • Hair:  acc 0.8974 / IoU 0.8171
-  - Outputs 18 classes (background, hat, hair, sunglasses, upper-clothes, skirt,
-    pants, dress, belt, left-shoe, right-shoe, face, left-leg, right-leg,
-    left-arm, right-arm, bag, scarf).
-We keep the same downstream contract as before: skin/hair/lip masks plus
-hair-length, accessory flags, wrinkle estimation, freckle/mole detection.
-The lip mask is approximated from the face region (no lip-specific class)
-and is mainly used as a fallback — MediaPipe lip landmarks are still the
-primary source for lip geometry/color in color_analyzer.
 """
 from typing import Any
-import warnings
 import cv2
 import numpy as np
@@ -34,7 +49,8 @@ from transformers import (
 MODEL_ID = "matei-dorian/segformer-b5-finetuned-human-parsing"
-# Official label map from the model card.
 PARSING_LABELS = {
     0: "background",
     1: "hat",
@@ -59,10 +75,14 @@ PARSING_LABELS = {
 class ParsingAnalyzer:
     def __init__(self):
         self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
         self.processor = None
         self.model = None
         try:
             self.processor = SegformerImageProcessor.from_pretrained(MODEL_ID)
             self.model = SegformerForSemanticSegmentation.from_pretrained(MODEL_ID)
             self.model.to(self.device).eval()
@@ -72,24 +92,36 @@ class ParsingAnalyzer:
     def analyze(self, img_rgb: np.ndarray) -> dict[str, Any]:
         h, w = img_rgb.shape[:2]
         if self.model is None or self.processor is None:
             return self._empty_result(h, w)
         pil = Image.fromarray(img_rgb)
         inputs = self.processor(images=pil, return_tensors="pt").to(self.device)
         with torch.no_grad():
             logits = self.model(**inputs).logits  # (1, C, H/4, W/4)
         upsampled = torch.nn.functional.interpolate(
             logits, size=(h, w), mode="bilinear", align_corners=False
         )
         parsing = upsampled.argmax(dim=1)[0].cpu().numpy().astype(np.uint8)
         masks: dict[str, np.ndarray] = {
             name: (parsing == label_id) for label_id, name in PARSING_LABELS.items()
         }
         total_pixels = h * w
         region_coverage = {
             name: round(float(mask.sum()) / total_pixels, 4)
@@ -99,16 +131,16 @@ class ParsingAnalyzer:
         result: dict[str, Any] = {"region_coverage": region_coverage}
         skin_mask = masks.get("face", np.zeros((h, w), dtype=bool))
         hair_mask = masks.get("hair", np.zeros((h, w), dtype=bool))
-        # No dedicated lip class; color_analyzer falls back to landmarks for lips.
-        lip_mask = np.zeros((h, w), dtype=bool)
         result["_skin_mask"] = skin_mask
         result["_hair_mask"] = hair_mask
-        result["_lip_mask"] = lip_mask
         # ── Hair length estimation ───────────────────────────────────
         hair_pixels = int(hair_mask.sum())
         face_pixels = int(skin_mask.sum()) + hair_pixels
         hair_ratio = hair_pixels / face_pixels if face_pixels else 0
@@ -124,18 +156,22 @@ class ParsingAnalyzer:
         result["hair_present"] = hair_ratio > 0.03
-        # ── Accessories from segmentation ────────────────────────────
-        result["glasses_detected"] = region_coverage.get("sunglasses", 0) > 0.005
         result["hat_detected"] = region_coverage.get("hat", 0) > 0.01
-        result["earring_detected"] = False  # no earring class in this model
-        result["necklace_detected"] = False  # no necklace class in this model
-        # ── Skin analysis on face mask ───────────────────────────────
         if skin_mask.sum() > 100:
             skin_gray = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2GRAY)
             laplacian = cv2.Laplacian(skin_gray, cv2.CV_64F)
             skin_edges = np.abs(laplacian)
-            skin_edges[~skin_mask] = 0
             edge_density = skin_edges.sum() / skin_mask.sum() if skin_mask.sum() else 0
             if edge_density > 15:
@@ -149,6 +185,10 @@ class ParsingAnalyzer:
             result["skin_texture_score"] = round(float(edge_density), 2)
             skin_lab = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2LAB)
             l_channel = skin_lab[:, :, 0].astype(float)
             l_channel[~skin_mask] = np.nan
@@ -162,6 +202,8 @@ class ParsingAnalyzer:
                 else "none"
             )
             skin_l_values = l_channel[skin_mask]
             result["skin_uniformity"] = round(float(np.nanstd(skin_l_values)), 2)
         else:
@@ -174,18 +216,19 @@ class ParsingAnalyzer:
     @staticmethod
     def _empty_result(h: int, w: int) -> dict[str, Any]:
         empty = np.zeros((h, w), dtype=bool)
         return {
             "region_coverage": {},
             "_skin_mask": empty,
             "_hair_mask": empty,
-            "_lip_mask": empty,
             "hair_length": "unknown",
             "hair_present": False,
-            "glasses_detected": False,
             "hat_detected": False,
-            "earring_detected": False,
-            "necklace_detected": False,
             "wrinkle_level": "unknown",
             "skin_texture_score": 0,
             "freckles_or_moles": "unknown",

 """
+ParsingAnalyzer — SegFormer-B5 human parsing for masks and skin stats.
+Model
+-----
+- Architecture : SegFormer-B5 (nvidia/mit-b5 backbone)
+- HF repo      : matei-dorian/segformer-b5-finetuned-human-parsing
+- License      : Apache 2.0
+- Eval metrics : mean IoU 0.626, overall acc 0.826
+                 face acc 0.909 / IoU 0.829
+                 hair acc 0.897 / IoU 0.817
+- Classes (18) : background, hat, hair, sunglasses, upper_clothes, skirt,
+                 pants, dress, belt, left_shoe, right_shoe, face,
+                 left_leg, right_leg, left_arm, right_arm, bag, scarf
+Inputs
+------
+img_rgb : np.ndarray (H, W, 3) uint8
+Outputs (dict)
+--------------
+Internal masks (stripped from JSON):
+    _skin_mask, _hair_mask
+Public fields:
+    region_coverage  — per-class fraction of pixels
+    hair_length      — bald/very short | short | medium | long
+    hair_present     — bool
+    hat_detected     — bool, true when >1% of pixels are class "hat"
+    wrinkle_level    — smooth | slight | moderate | prominent
+    skin_texture_score, skin_uniformity, freckles_or_moles
+Notes
+-----
+The wrinkle / texture / freckle fields are OpenCV statistics computed
+over the SegFormer face mask, not direct model outputs. SegFormer
+contributes the mask; OpenCV does the per-pixel math.
 """
 from typing import Any
 import cv2
 import numpy as np
 MODEL_ID = "matei-dorian/segformer-b5-finetuned-human-parsing"
+# Class id → name as published by the model card. We index masks by
+# these names downstream rather than raw integer ids.
 PARSING_LABELS = {
     0: "background",
     1: "hat",
 class ParsingAnalyzer:
     def __init__(self):
+        # CUDA when available, CPU otherwise. The HF Spaces free tier is
+        # CPU-only, so SegFormer-B5 inference takes ~1-2 s per request.
         self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
         self.processor = None
         self.model = None
         try:
+            # Both processor and model weights come from the same repo;
+            # processor handles resize/normalize/tensorize.
             self.processor = SegformerImageProcessor.from_pretrained(MODEL_ID)
             self.model = SegformerForSemanticSegmentation.from_pretrained(MODEL_ID)
             self.model.to(self.device).eval()
     def analyze(self, img_rgb: np.ndarray) -> dict[str, Any]:
         h, w = img_rgb.shape[:2]
+        # If the model failed to load we return empty masks so the rest
+        # of the pipeline (especially ColorAnalyzer) sees a consistent
+        # shape and degrades cleanly to "unknown" fields.
         if self.model is None or self.processor is None:
             return self._empty_result(h, w)
+        # SegFormer expects PIL; processor will resize internally.
         pil = Image.fromarray(img_rgb)
         inputs = self.processor(images=pil, return_tensors="pt").to(self.device)
+        # Forward pass → logits at 1/4 of the processor-resized input
+        # resolution (h' × w'), not 1/4 of the original H × W.
         with torch.no_grad():
             logits = self.model(**inputs).logits  # (1, C, h'/4, w'/4)
+        # Upsample to original resolution, then argmax to get the
+        # class id per pixel.
         upsampled = torch.nn.functional.interpolate(
             logits, size=(h, w), mode="bilinear", align_corners=False
         )
         parsing = upsampled.argmax(dim=1)[0].cpu().numpy().astype(np.uint8)
+        # Build a boolean mask per class. Cheap because we already have
+        # the argmax map; each is one numpy equality check.
         masks: dict[str, np.ndarray] = {
             name: (parsing == label_id) for label_id, name in PARSING_LABELS.items()
         }
+        # region_coverage = fraction of image occupied by each class.
+        # Useful as a coarse "is this class even present" signal — e.g.
+        # hat detection just checks if hat coverage exceeds a threshold.
         total_pixels = h * w
         region_coverage = {
             name: round(float(mask.sum()) / total_pixels, 4)
         result: dict[str, Any] = {"region_coverage": region_coverage}
+        # Skin & hair masks are passed downstream to ColorAnalyzer.
+        # Leading underscore → stripped from the final JSON payload.
         skin_mask = masks.get("face", np.zeros((h, w), dtype=bool))
         hair_mask = masks.get("hair", np.zeros((h, w), dtype=bool))
         result["_skin_mask"] = skin_mask
         result["_hair_mask"] = hair_mask
         # ── Hair length estimation ───────────────────────────────────
+        # Ratio of hair pixels to (face + hair) pixels — bigger ratio
+        # means longer hair extending past the face.
         hair_pixels = int(hair_mask.sum())
         face_pixels = int(skin_mask.sum()) + hair_pixels
         hair_ratio = hair_pixels / face_pixels if face_pixels else 0
         result["hair_present"] = hair_ratio > 0.03
+        # ── Hat detection ────────────────────────────────────────────
+        # A real hat consistently covers >1% of pixels; below that we're
+        # in noise / mis-segmentation territory.
         result["hat_detected"] = region_coverage.get("hat", 0) > 0.01
+        # ── Skin texture / wrinkles / freckles ───────────────────────
+        # Only worth computing if the face mask actually has substance.
+        # Under ~100 pixels we don't have enough signal.
         if skin_mask.sum() > 100:
+            # Wrinkles → high-frequency edge energy on the face mask.
+            # Laplacian responds to local intensity curvature; the mean
+            # absolute response over the masked region gives a "how much
+            # fine detail" score.
             skin_gray = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2GRAY)
             laplacian = cv2.Laplacian(skin_gray, cv2.CV_64F)
             skin_edges = np.abs(laplacian)
+            skin_edges[~skin_mask] = 0  # zero out non-face pixels
             edge_density = skin_edges.sum() / skin_mask.sum() if skin_mask.sum() else 0
             if edge_density > 15:
             result["skin_texture_score"] = round(float(edge_density), 2)
+            # Freckles/moles → count pixels well below mean L* lightness.
+            # Working in LAB rather than RGB makes the threshold tone-
+            # independent (a freckle is "darker than surrounding skin"
+            # regardless of base skin tone).
             skin_lab = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2LAB)
             l_channel = skin_lab[:, :, 0].astype(float)
             l_channel[~skin_mask] = np.nan
                 else "none"
             )
+            # Uniformity = std-dev of L* over the face. Higher = more
+            # variation (uneven skin tone, shadows, scarring).
             skin_l_values = l_channel[skin_mask]
             result["skin_uniformity"] = round(float(np.nanstd(skin_l_values)), 2)
         else:
     @staticmethod
     def _empty_result(h: int, w: int) -> dict[str, Any]:
+        """Stub returned when the SegFormer model fails to load.
+        Shape must match the success path so downstream code can rely
+        on key presence without conditional checks.
+        """
         empty = np.zeros((h, w), dtype=bool)
         return {
             "region_coverage": {},
             "_skin_mask": empty,
             "_hair_mask": empty,
             "hair_length": "unknown",
             "hair_present": False,
             "hat_detected": False,
             "wrinkle_level": "unknown",
             "skin_texture_score": 0,
             "freckles_or_moles": "unknown",
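
The coverage, hair-presence, and hat thresholds described in the comments above can be illustrated without the model. The following is a minimal, dependency-free sketch that works on a plain nested list of class ids rather than a NumPy argmax map. `LABELS` and `summarize` are hypothetical names, and the class ids for `hair` and `face` are assumed from the order of the class list in the docstring, not confirmed by the model card.

```python
from collections import Counter

# Illustrative subset of the class-id -> name table. Ids 0 and 1 appear in
# the source; ids 2 and 11 are assumed from the docstring's class order.
LABELS = {0: "background", 1: "hat", 2: "hair", 11: "face"}


def summarize(label_map, labels=LABELS):
    """Derive coverage and threshold flags from a 2D grid of class ids."""
    total = sum(len(row) for row in label_map)
    counts = Counter(pixel for row in label_map for pixel in row)
    by_name = {name: counts.get(class_id, 0) for class_id, name in labels.items()}

    # Fraction of the whole image occupied by each class.
    coverage = (
        {name: round(n / total, 4) for name, n in by_name.items()} if total else {}
    )

    # Hair ratio is hair pixels over (face + hair) pixels, as in the source.
    hair, face = by_name["hair"], by_name["face"]
    hair_ratio = hair / (hair + face) if (hair + face) else 0.0

    return {
        "region_coverage": coverage,
        "hair_present": hair_ratio > 0.03,
        "hat_detected": coverage.get("hat", 0) > 0.01,
    }


# Toy 10x10 label map: 1 row of hat, 3 of hair, 4 of face, 2 of background.
toy = [[1] * 10] + [[2] * 10] * 3 + [[11] * 10] * 4 + [[0] * 10] * 2
summary = summarize(toy)
```

Because the flags depend only on pixel counts, the thresholds (0.03 and 0.01) can be tuned against synthetic label maps like `toy` before running the full SegFormer model.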

app.py CHANGED Viewed

@@ -1,25 +1,63 @@
 """
-Face Analysis Microservice
-Combines multiple pretrained models for comprehensive facial attribute detection.
-Models used:
-1. MediaPipe Face Landmarker — 478 3D landmarks + 52 blendshapes → geometric features
-2. FairFace — age, gender, race classification
-3. CelebA Attribute Classifier — 40 binary facial attributes
-4. BiSeNet Face Parsing — 19-class pixel-level segmentation
-5. HSEmotion — 8-class emotion recognition
-6. Color Analyzer — pixel-level skin tone, eye color, hair color (no AI)
 """
 import os
 os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
-os.environ["HF_HUB_DOWNLOAD_TIMEOUT"] = "60"  # default is 10s, bump it
 import io
 import logging
 from typing import Optional
-import cv2
 import numpy as np
 from fastapi import FastAPI, File, HTTPException, UploadFile
 from fastapi.middleware.cors import CORSMiddleware
@@ -27,10 +65,11 @@ from PIL import Image
 from analyzers.landmark_analyzer import LandmarkAnalyzer
 from analyzers.demographic_analyzer import DemographicAnalyzer
-from analyzers.attribute_analyzer import AttributeAnalyzer
 from analyzers.parsing_analyzer import ParsingAnalyzer
 from analyzers.emotion_analyzer import EmotionAnalyzer
 from analyzers.color_analyzer import ColorAnalyzer
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
@@ -48,15 +87,22 @@ app.add_middleware(
 # Analyzers are initialized lazily on first request to reduce cold-start time
 landmark_analyzer: Optional[LandmarkAnalyzer] = None
 demographic_analyzer: Optional[DemographicAnalyzer] = None
-attribute_analyzer: Optional[AttributeAnalyzer] = None
 parsing_analyzer: Optional[ParsingAnalyzer] = None
 emotion_analyzer: Optional[EmotionAnalyzer] = None
 color_analyzer: Optional[ColorAnalyzer] = None
 def _to_json_safe(value):
-    """Convert numpy scalars/arrays and nested structures into JSON-safe types."""
-    # Handle numpy types first (before dict/list checks)
     if isinstance(value, (np.ndarray,)):
         return value.tolist()
     if isinstance(value, (np.integer, np.floating)):
@@ -65,7 +111,7 @@ def _to_json_safe(value):
         return bool(value)
     if isinstance(value, np.generic):
         return value.item()
-    # Handle nested structures
     if isinstance(value, dict):
         return {str(k): _to_json_safe(v) for k, v in value.items()}
     if isinstance(value, (list, tuple, set)):
@@ -74,9 +120,15 @@ def _to_json_safe(value):
 def get_analyzers():
-    """Lazy-load all analyzer models on first use."""
-    global landmark_analyzer, demographic_analyzer, attribute_analyzer
     global parsing_analyzer, emotion_analyzer, color_analyzer
     if landmark_analyzer is None:
         logger.info("Loading MediaPipe Face Landmarker...")
@@ -86,12 +138,8 @@ def get_analyzers():
         logger.info("Loading FairFace demographics model...")
         demographic_analyzer = DemographicAnalyzer()
-    if attribute_analyzer is None:
-        logger.info("Loading CelebA attribute classifier...")
-        attribute_analyzer = AttributeAnalyzer()
     if parsing_analyzer is None:
-        logger.info("Loading BiSeNet face parser...")
         parsing_analyzer = ParsingAnalyzer()
     if emotion_analyzer is None:
@@ -101,88 +149,96 @@ def get_analyzers():
     if color_analyzer is None:
         color_analyzer = ColorAnalyzer()
     return (
         landmark_analyzer,
         demographic_analyzer,
-        attribute_analyzer,
         parsing_analyzer,
         emotion_analyzer,
         color_analyzer,
     )
 @app.get("/")
 async def root():
-    """Root endpoint — returns API information."""
     return {
         "name": "HCP Face Analysis Service",
         "version": "2.0.0",
         "status": "running",
         "endpoints": {
             "health": "/health",
-            "analyze": "/analyze"
         }
     }
 @app.get("/health")
 async def health():
-    """Health check endpoint — use to keep the service warm."""
     return {"status": "ok"}
 @app.post("/analyze")
 async def analyze_face(file: UploadFile = File(...)):
-    """
-    Comprehensive face analysis endpoint.
-    Accepts an image file upload and returns ~100+ facial attributes
-    by running 6 models/analyzers in sequence.
     """
     try:
-        # Read and decode the uploaded image
         contents = await file.read()
         image = Image.open(io.BytesIO(contents)).convert("RGB")
         img_array = np.array(image)
-        img_bgr = cv2.cvtColor(img_array, cv2.COLOR_RGB2BGR)
         (
             landmarks,
             demographics,
-            attributes,
             parsing,
             emotions,
             colors,
         ) = get_analyzers()
         results = {}
-        # Step 1: MediaPipe Landmarks → geometric features (~40 attributes)
         logger.info("Running landmark analysis...")
         landmark_results = landmarks.analyze(img_array)
         results.update(landmark_results)
-        # Step 2: FairFace → age, gender, race
         logger.info("Running demographic analysis...")
         demo_results = demographics.analyze(img_array)
         results.update(demo_results)
-        # Step 3: CelebA → 40 binary facial attributes
-        logger.info("Running attribute analysis...")
-        attr_results = attributes.analyze(img_array)
-        results.update(attr_results)
-        # Step 4: BiSeNet → pixel segmentation → hair length, wrinkles, spots
         logger.info("Running face parsing...")
         parse_results = parsing.analyze(img_array)
         results.update(parse_results)
-        # Step 5: HSEmotion → emotion classification
         logger.info("Running emotion analysis...")
         emo_results = emotions.analyze(img_array)
         results.update(emo_results)
-        # Step 6: Color analysis using masks from Step 4 + landmarks from Step 1
         logger.info("Running color analysis...")
         color_results = colors.analyze(
             img_array,
@@ -192,6 +248,14 @@ async def analyze_face(file: UploadFile = File(...)):
         )
         results.update(color_results)
         # Remove internal fields (prefixed with underscore)
         results = {k: v for k, v in results.items() if not k.startswith("_")}
@@ -204,9 +268,11 @@ async def analyze_face(file: UploadFile = File(...)):
 @app.post("/analyze-base64")
 async def analyze_face_base64(body: dict):
-    """
-    Alternative endpoint that accepts base64-encoded image data.
-    This matches the format the Express server sends.
     """
     import base64
@@ -222,28 +288,28 @@ async def analyze_face_base64(body: dict):
         image_bytes = base64.b64decode(image_b64)
         image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
         img_array = np.array(image)
-        img_bgr = cv2.cvtColor(img_array, cv2.COLOR_RGB2BGR)
         (
             landmarks,
             demographics,
-            attributes,
             parsing,
             emotions,
             colors,
         ) = get_analyzers()
         results = {}
         landmark_results = landmarks.analyze(img_array)
         results.update(landmark_results)
         demo_results = demographics.analyze(img_array)
         results.update(demo_results)
-        attr_results = attributes.analyze(img_array)
-        results.update(attr_results)
         parse_results = parsing.analyze(img_array)
         results.update(parse_results)
@@ -258,6 +324,11 @@ async def analyze_face_base64(body: dict):
         )
         results.update(color_results)
         results = {k: v for k, v in results.items() if not k.startswith("_")}
         return {"success": True, "data": _to_json_safe(results)}

 """
+HCP Face Analysis Microservice
+==============================
+FastAPI service that runs seven specialized analyzers over a single photo
+and merges their outputs into one ~100-field facial-attribute dictionary.
+Pipeline (in execution order)
+-----------------------------
+1.  MediaPipe Face Landmarker   478 3D landmarks + 52 ARKit blendshapes.
+                                Produces all geometric face/eye/nose/lip/
+                                jaw features plus smiling and mouth-open.
+2.  DemographicAnalyzer         Three ViT classifiers (FairFace age,
+                                FairFace gender, Ethnicity_Test_v003).
+                                Age is reported as a softmax-weighted
+                                continuous estimate, not a bucket midpoint.
+3.  ParsingAnalyzer             SegFormer-B5 human parsing. Emits face
+                                and hair pixel masks plus hair length,
+                                hat detection, and skin texture/wrinkle/
+                                freckle/uniformity stats computed via
+                                OpenCV over the face mask.
+4.  EmotionAnalyzer             HSEmotion EfficientNet-B0 8-class output
+                                plus derived valence, arousal, mood.
+5.  ColorAnalyzer               Pixel-level LAB/HSV statistics. Reads
+                                masks from step 3 and lip/iris landmarks
+                                from step 1. No ML model.
+6.  ObstructionAnalyzer         dima806 ViT-B/16. Glasses, sunglasses,
+                                mask flags with ~99% precision/recall.
+7.  HairTypeAnalyzer            dima806 ViT-B/16. Curly/dreadlocks/kinky/
+                                straight/wavy at ~93% accuracy.
+Endpoints
+---------
+GET  /                  service banner
+GET  /health            liveness check
+POST /analyze           multipart file upload
+POST /analyze-base64    JSON {"image": "<base64>"}
+Both POST endpoints run the same pipeline. All analyzers are lazily
+instantiated on first request to keep cold-start latency manageable
+on the Hugging Face Spaces free tier.
 """
 import os
+# hf_transfer gives much faster model downloads from the HF Hub on first
+# inference. HF_HUB_DOWNLOAD_TIMEOUT defaults to 10s, which is too short
+# for the larger ViT checkpoints on a cold start.
 os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
+os.environ["HF_HUB_DOWNLOAD_TIMEOUT"] = "60"
 import io
 import logging
 from typing import Optional
 import numpy as np
 from fastapi import FastAPI, File, HTTPException, UploadFile
 from fastapi.middleware.cors import CORSMiddleware
 from analyzers.landmark_analyzer import LandmarkAnalyzer
 from analyzers.demographic_analyzer import DemographicAnalyzer
 from analyzers.parsing_analyzer import ParsingAnalyzer
 from analyzers.emotion_analyzer import EmotionAnalyzer
 from analyzers.color_analyzer import ColorAnalyzer
+from analyzers.obstruction_analyzer import ObstructionAnalyzer
+from analyzers.hair_type_analyzer import HairTypeAnalyzer
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 # Analyzers are initialized lazily on first request to reduce cold-start time
 landmark_analyzer: Optional[LandmarkAnalyzer] = None
 demographic_analyzer: Optional[DemographicAnalyzer] = None
 parsing_analyzer: Optional[ParsingAnalyzer] = None
 emotion_analyzer: Optional[EmotionAnalyzer] = None
 color_analyzer: Optional[ColorAnalyzer] = None
+obstruction_analyzer: Optional[ObstructionAnalyzer] = None
+hair_type_analyzer: Optional[HairTypeAnalyzer] = None
 def _to_json_safe(value):
+    """Recursively coerce numpy scalars/arrays into JSON-serialisable types.
+    Several analyzers return numpy floats/booleans (e.g. from `np.std`
+    or boolean mask logic). FastAPI's default JSON encoder doesn't
+    handle those, so we normalise everything here before returning.
+    """
+    # Numpy first: arrays and scalars are converted to native Python
+    # types before the generic dict/list recursion below.
     if isinstance(value, (np.ndarray,)):
         return value.tolist()
     if isinstance(value, (np.integer, np.floating)):
         return bool(value)
     if isinstance(value, np.generic):
         return value.item()
+    # Recurse into nested containers.
     if isinstance(value, dict):
         return {str(k): _to_json_safe(v) for k, v in value.items()}
     if isinstance(value, (list, tuple, set)):
 def get_analyzers():
+    """Lazy-load all analyzer models on first use.
+    Each analyzer is instantiated once per process and reused across
+    requests. First request pays the full model-load cost; subsequent
+    requests are warm.
+    """
+    global landmark_analyzer, demographic_analyzer
     global parsing_analyzer, emotion_analyzer, color_analyzer
+    global obstruction_analyzer, hair_type_analyzer
     if landmark_analyzer is None:
         logger.info("Loading MediaPipe Face Landmarker...")
         logger.info("Loading FairFace demographics model...")
         demographic_analyzer = DemographicAnalyzer()
     if parsing_analyzer is None:
+        logger.info("Loading SegFormer face parser...")
         parsing_analyzer = ParsingAnalyzer()
     if emotion_analyzer is None:
     if color_analyzer is None:
         color_analyzer = ColorAnalyzer()
+    if obstruction_analyzer is None:
+        logger.info("Loading face obstruction classifier...")
+        obstruction_analyzer = ObstructionAnalyzer()
+    if hair_type_analyzer is None:
+        logger.info("Loading hair type classifier...")
+        hair_type_analyzer = HairTypeAnalyzer()
     return (
         landmark_analyzer,
         demographic_analyzer,
         parsing_analyzer,
         emotion_analyzer,
         color_analyzer,
+        obstruction_analyzer,
+        hair_type_analyzer,
     )
 @app.get("/")
 async def root():
+    """Service banner: confirms the server is reachable and reports its version."""
     return {
         "name": "HCP Face Analysis Service",
         "version": "2.0.0",
         "status": "running",
         "endpoints": {
             "health": "/health",
+            "analyze": "/analyze",
+            "analyze-base64": "/analyze-base64",
         }
     }
 @app.get("/health")
 async def health():
+    """Liveness probe. Used by the Express server and HF Spaces uptime checks."""
     return {"status": "ok"}
 @app.post("/analyze")
 async def analyze_face(file: UploadFile = File(...)):
+    """Multipart endpoint for direct uploads.
+    Runs all seven analyzers and returns the merged attribute dict.
+    See `analyze_face_base64` for the JSON-body variant the Express
+    server calls.
     """
     try:
+        # Decode the upload into an RGB numpy array. All analyzers
+        # work in RGB, so no BGR conversion is needed here.
         contents = await file.read()
         image = Image.open(io.BytesIO(contents)).convert("RGB")
         img_array = np.array(image)
         (
             landmarks,
             demographics,
             parsing,
             emotions,
             colors,
+            obstructions,
+            hair_types,
         ) = get_analyzers()
         results = {}
+        # Step 1: MediaPipe Landmarks → all geometric features + blendshapes.
         logger.info("Running landmark analysis...")
         landmark_results = landmarks.analyze(img_array)
         results.update(landmark_results)
+        # Step 2: FairFace + Ethnicity ViT → demographics.
         logger.info("Running demographic analysis...")
         demo_results = demographics.analyze(img_array)
         results.update(demo_results)
+        # Step 3: SegFormer-B5 human parsing → masks + hair length + skin stats.
         logger.info("Running face parsing...")
         parse_results = parsing.analyze(img_array)
         results.update(parse_results)
+        # Step 4: HSEmotion → 8-class emotion + valence/arousal/mood.
         logger.info("Running emotion analysis...")
         emo_results = emotions.analyze(img_array)
         results.update(emo_results)
+        # Step 5: Pixel color analysis. Uses the face/hair masks from step 3
+        # and MediaPipe lip/iris landmarks from step 1.
         logger.info("Running color analysis...")
         color_results = colors.analyze(
             img_array,
         )
         results.update(color_results)
+        # Step 6: ObstructionViT → glasses / sunglasses / mask flags.
+        logger.info("Running obstruction analysis...")
+        results.update(obstructions.analyze(img_array))
+        # Step 7: HairTypeViT → curly/dreadlocks/kinky/straight/wavy.
+        logger.info("Running hair-type analysis...")
+        results.update(hair_types.analyze(img_array))
         # Remove internal fields (prefixed with underscore)
         results = {k: v for k, v in results.items() if not k.startswith("_")}
 @app.post("/analyze-base64")
 async def analyze_face_base64(body: dict):
+    """JSON-body endpoint accepting `{"image": "<base64>"}`.
+    This is what the Node/Express server forwards client requests to
+    so we don't have to push multipart payloads through the proxy.
+    The pipeline body is identical to `/analyze`.
     """
     import base64
         image_bytes = base64.b64decode(image_b64)
         image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
         img_array = np.array(image)
         (
             landmarks,
             demographics,
             parsing,
             emotions,
             colors,
+            obstructions,
+            hair_types,
         ) = get_analyzers()
         results = {}
+        # Same seven-step pipeline as /analyze. Kept inline (rather
+        # than factored out) so the per-step `logger.info` cadence and
+        # ordering stay obvious when reading either endpoint top-down.
         landmark_results = landmarks.analyze(img_array)
         results.update(landmark_results)
         demo_results = demographics.analyze(img_array)
         results.update(demo_results)
         parse_results = parsing.analyze(img_array)
         results.update(parse_results)
         )
         results.update(color_results)
+        results.update(obstructions.analyze(img_array))
+        results.update(hair_types.analyze(img_array))
+        # Drop internal/scratch fields (leading underscore) before
+        # returning. Keeps masks and raw landmark lists out of the JSON.
         results = {k: v for k, v in results.items() if not k.startswith("_")}
         return {"success": True, "data": _to_json_safe(results)}
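
For callers of `/analyze-base64`, the request body is the `{"image": "<base64>"}` shape described in the docstring. A minimal client sketch using only the standard library might look like the following. `SERVICE_URL`, `build_payload`, and `analyze_base64` are hypothetical names; the port is taken from the Dockerfile's `EXPOSE 7860`, and the generous timeout is an assumption to cover lazy model loading on the first request.

```python
import base64
import json
import urllib.request

# Hypothetical base URL; replace with the address of the running service.
SERVICE_URL = "http://localhost:7860"


def build_payload(image_bytes: bytes) -> bytes:
    """Encode raw image bytes as the JSON body {"image": "<base64>"}."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded}).encode("utf-8")


def analyze_base64(
    image_bytes: bytes, base_url: str = SERVICE_URL, timeout: float = 120.0
) -> dict:
    """POST the image to /analyze-base64 and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/analyze-base64",
        data=build_payload(image_bytes),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the payload needs no running server, so it can be checked offline.
sample_payload = json.loads(build_payload(b"\x89PNG"))
```

On success the service returns `{"success": True, "data": {...}}`, so a caller would typically read `analyze_base64(raw_bytes)["data"]`. Whether the endpoint also accepts a `data:image/...;base64,` prefix is not shown in this diff, so the sketch sends bare base64.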

architecture.md CHANGED Viewed

@@ -1,1707 +1,99 @@
-# HCP Face Analysis — Architecture Plan
-## Revised Architecture & Best Models for Maximum Feature Coverage
-Since the codebase is flexible and can use more languages and frameworks, we go beyond the Supabase Edge Function constraint to find the **absolute best models** for the full feature list.
----
-## Recommended Architecture: Python Microservice Sidecar
-```
-┌──────────────────────────────────────────────────────────┐
-│                     CURRENT STACK                        │
-│  Next.js Frontend ──► Supabase (Auth, DB, Storage)       │
-└──────────────┬───────────────────────────────────────────┘
-               │
-               ▼
-┌──────────────────────────────────────────────────────────┐
-│          NEW: Python Face Analysis Microservice          │
-│  (Railway.app / Render.com / Hugging Face Spaces)        │
-│  FREE TIER: 512MB RAM, shared CPU                        │
-│                                                          │
-│  FastAPI Server                                          │
-│  ├── MediaPipe Face Landmarker (478 landmarks, 4MB)      │
-│  ├── InsightFace Buffalo_SC (recognition + attrs, 30MB)  │
-│  ├── FairFace (age/gender/race, 90MB)                    │
-│  ├── HuggingFace ViT models (attributes, ~50MB each)     │
-│  ├── BiSeNet (face parsing/segmentation, 50MB)           │
-│  └── Custom geometric analysis (your feature list)       │
-│                                                          │
-│  Total: ~250MB models (loaded lazily)                    │
-└──────────────────────────────────────────────────────────┘
-```
-**Why this is better:** Python gives access to the **entire deep learning ecosystem** — every model on HuggingFace, every research paper's pretrained weights. Free-tier hosting on Railway/Render gives 512MB RAM and enough CPU for per-request inference.
----
-## Best Models Per Feature Category
-### Tier 1: Core Models (Must Have)
-#### 1. MediaPipe Face Landmarker — Geometric Features
-- **478 3D landmarks + 52 blendshapes**
-- **Size:** 4MB
-- **Covers:** Face shape, jawline, chin, cheekbones, forehead, eye shape, eye spacing, eye size, eyebrow shape, nose shape, lip shape, mouth width, dimples, facial asymmetry
-- **GitHub:** https://github.com/google-ai-edge/mediapipe
-- **Python:** `pip install mediapipe`
-- **Accuracy:** State-of-the-art landmark detection, handles 30° head rotation well
-#### 2. InsightFace Buffalo_SC — Lightweight Recognition + Age/Gender
-- **Size:** ~30MB (smallest Buffalo variant)
-- **LFW Accuracy:** 99.5%
-- **Covers:** Face detection, age, gender, face embedding (for recognition), 2D landmarks
-- **GitHub:** https://github.com/deepinsight/insightface
-- **Weights:** Auto-downloaded via `insightface.app.FaceAnalysis(name='buffalo_sc')`
-- **Why not Buffalo_L:** 320MB is overkill; Buffalo_SC is 90% as accurate at 1/10th the size
-#### 3. FairFace — Age, Gender, Race (Most Accurate)
-- **Size:** ~90MB (ResNet-34)
-- **Accuracy:** 93.4% race, 94.2% gender, MAE 3.4 years for age
-- **Covers:** Age (9 buckets), gender, race (7 categories: White, Black, Latino, East Asian, Southeast Asian, Indian, Middle Eastern)
-- **GitHub:** https://github.com/dchen236/FairFace
-- **Weights:** https://drive.google.com/file/d/1xSfJQWMhm3AVlJYcPcabGO_bj1kDB0xw (res34_fair_align_multi_7_20190809.pt)
-- **Why over InsightFace for this:** FairFace is specifically trained for fair demographic classification across races, not biased toward any group
-#### 4. HSEmotion (EfficientNet) — Emotion Recognition
-- **Size:** ~20MB
-- **Accuracy:** 66.5% on AffectNet-8 (state-of-the-art), 8 emotions
-- **Covers:** Angry, contempt, disgust, fear, happy, neutral, sad, surprise
-- **GitHub:** https://github.com/HSE-asavchenko/face-emotion-recognition
-- **Weights:** Available via `timm` or direct download from repo
-- **Why over face-api.js:** Significantly more accurate, trained on AffectNet (largest emotion dataset)
-### Tier 2: Specialized Models
-#### 5. BiSeNet Face Parsing — Facial Segmentation
-- **Size:** ~50MB
-- **Covers:** Skin region, left/right eyebrow, left/right eye, nose, upper/lower lip, inner mouth, hair, left/right ear, neck, cloth, hat, earrings, glasses, background
-- **GitHub:** https://github.com/zllrunning/face-parsing.PyTorch
-- **Weights:** https://drive.google.com/file/d/154JgKpzCPW82qINcVieuPH3fZ2e0P812
-- **Why this matters:** Precisely segments hair, skin, eyebrows for color analysis, facial hair detection, glasses detection, and wrinkle analysis
-#### 6. microsoft/swin-base-patch4-window7-224-in22k fine-tuned for facial attributes
-- **HuggingFace:** Various CelebA-trained attribute classifiers
-- Specifically: https://huggingface.co/nateraw/vit-age-classifier (age)
-- Specifically: https://huggingface.co/rizvandwiki/gender-classification-2 (gender)
-#### 7. CelebA Attribute Classifier (Custom Multi-Label)
-- **Dataset:** CelebA has 40 binary attributes already labeled
-- Train a lightweight EfficientNet-B0 (~20MB) on CelebA for:
-  - `Attractive`, `Bald`, `Bangs`, `Big_Lips`, `Big_Nose`, `Black_Hair`, `Blond_Hair`, `Brown_Hair`, `Bushy_Eyebrows`, `Chubby`, `Double_Chin`, `Eyeglasses`, `Goatee`, `Gray_Hair`, `Heavy_Makeup`, `High_Cheekbones`, `Male`, `Mouth_Slightly_Open`, `Mustache`, `Narrow_Eyes`, `No_Beard`, `Oval_Face`, `Pointy_Nose`, `Receding_Hairline`, `Sideburns`, `Smiling`, `Straight_Hair`, `Wavy_Hair`, `Wearing_Hat`, `Young`
-- **Pre-trained option:** https://github.com/dchen236/FairFace has CelebA-trained models
-- **Better pre-trained option:** https://huggingface.co/jnferreira/attribute-prediction-celebA
-#### 8. Hair Segmentation + Color Analysis
-- **Model:** MODNet for matting + BiSeNet for hair segmentation
-- **GitHub (MODNet):** https://github.com/ZHKKKe/MODNet (~25MB)
-- Post-segmentation: K-means clustering on hair pixels for color
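The K-means step on hair pixels can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: `kmeans_colors` and `dominant_color` are hypothetical names, and the input is assumed to be a list of RGB tuples already selected by the hair mask.

```python
import random


def kmeans_colors(pixels, k=2, iters=20, seed=0):
    """Cluster RGB tuples with plain k-means; returns (centers, cluster sizes)."""
    rng = random.Random(seed)          # fixed seed keeps runs repeatable
    centers = rng.sample(pixels, k)    # requires len(pixels) >= k
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach each pixel to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = [
            tuple(sum(ch) / len(members) for ch in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:     # converged
            break
        centers = new_centers
    return centers, [len(c) for c in clusters]


def dominant_color(pixels, k=2):
    """Return the rounded center of the most populated cluster."""
    centers, sizes = kmeans_colors(pixels, k)
    best = max(range(k), key=lambda i: sizes[i])
    return tuple(round(c) for c in centers[best])
```

With 30 dark-brown pixels and 5 highlight pixels, `dominant_color` returns the dark-brown cluster, so stray highlights do not skew the reported hair color the way a plain mean would.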
-#### 9. Skin Analysis (Wrinkles, Acne, etc.)
-- **Model:** https://huggingface.co/imfarzanansari/skin-disease-detection (for acne/skin conditions)
-- **For wrinkles:** Edge detection (Canny/Sobel) on forehead/eye regions from BiSeNet parsing — no model needed
-- **For freckles/moles:** Blob detection on skin regions from BiSeNet parsing
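The "no model needed" wrinkle heuristic can be sketched as a Sobel edge-density measure over a grayscale skin patch. This is an assumption-laden sketch: `sobel_edge_density` is a hypothetical helper, the patch is assumed to be a 2D list of 0-255 intensities already cropped from the parsed forehead or eye region, and the threshold of 100 is illustrative rather than tuned.

```python
import math


def sobel_edge_density(gray, threshold=100.0):
    """Fraction of interior pixels whose Sobel gradient magnitude exceeds threshold."""
    h, w = len(gray), len(gray[0])
    if h < 3 or w < 3:
        return 0.0
    strong = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]) - (
                gray[y - 1][x - 1] + 2 * gray[y][x - 1] + gray[y + 1][x - 1]
            )
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]) - (
                gray[y - 1][x - 1] + 2 * gray[y - 1][x] + gray[y - 1][x + 1]
            )
            if math.hypot(gx, gy) > threshold:
                strong += 1
    return strong / ((h - 2) * (w - 2))
```

A uniformly lit patch scores 0.0, while a patch crossed by a dark crease scores higher; a wrinkle label would then come from comparing the density against a cutoff chosen on real data.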
----
-## Complete Feature Coverage Map
-| Feature | Model/Method | Confidence |
-|---------|-------------|------------|
-| **Face shape** (oval, round, square, heart, diamond, oblong, triangle) | MediaPipe landmarks geometric ratios + CelebA (`Oval_Face`) | ⭐⭐⭐⭐ |
-| **Jawline** (sharp, soft, strong) | MediaPipe jaw landmark angles | ⭐⭐⭐⭐ |
-| **Chin** (receding, pointed, cleft, wide) | MediaPipe chin landmarks + depth (z) | ⭐⭐⭐ |
-| **Cheekbones** (high, flat, full, hollow) | MediaPipe landmark z-depth + CelebA (`High_Cheekbones`, `Chubby`) | ⭐⭐⭐⭐ |
-| **Forehead** (broad, narrow) | MediaPipe forehead span ratio | ⭐⭐⭐⭐ |
-| **Eye shape** (almond, round, hooded, monolid, upturned, downturned) | MediaPipe eyelid curvature + corner angles | ⭐⭐⭐⭐ |
-| **Eye spacing** (wide-set, close-set) | MediaPipe interpupillary distance ratio | ⭐⭐⭐⭐⭐ |
-| **Eye size** (large, small) | MediaPipe eye area / face area | ⭐⭐⭐⭐⭐ |
-| **Deep-set / protruding eyes** | MediaPipe landmark z-depth at eye region | ⭐⭐⭐ |
-| **Eye color** (brown, blue, green, hazel) | Iris crop → HSV color histogram + KNN | ⭐⭐⭐⭐ |
-| **Dark under-eyes / eye bags** | BiSeNet skin parsing → brightness analysis under eyes | ⭐⭐⭐ |
-| **Crow's feet** | Canny edge detection on BiSeNet-parsed outer eye skin | ⭐⭐⭐ |
-| **Eyebrow shape** (arched, straight, bushy, thick, thin) | MediaPipe brow landmarks + CelebA (`Bushy_Eyebrows`, `Arched_Eyebrows`) | ⭐⭐⭐⭐ |
-| **Unibrow** | MediaPipe inner brow distance + pixel analysis between brows | ⭐⭐⭐⭐ |
-| **Nose shape** (straight, aquiline, button, upturned, wide, narrow) | MediaPipe nose landmarks + CelebA (`Big_Nose`, `Pointy_Nose`) | ⭐⭐⭐⭐ |
-| **Nose bridge** (flat, high) | MediaPipe z-depth at nasal bridge | ⭐⭐⭐ |
-| **Nostrils** (wide, narrow) | MediaPipe nostril landmark width ratio | ⭐⭐⭐⭐ |
-| **Lips** (full, thin) | MediaPipe lip landmarks + CelebA (`Big_Lips`) | ⭐⭐⭐⭐ |
-| **Mouth width** | MediaPipe mouth corner distance ratio | ⭐⭐⭐⭐⭐ |
-| **Cupid's bow** | MediaPipe upper lip curvature analysis | ⭐⭐⭐ |
-| **Teeth** (gap, crooked, straight, overbite, underbite) | Mouth crop when smiling → custom classifier or rule-based | ⭐⭐ |
-| **Dimples** | MediaPipe blendshapes during smile + cheek region analysis | ⭐⭐⭐ |
-| **Smile lines** | Edge detection on nasolabial region | ⭐⭐⭐ |
-| **Asymmetrical smile** | MediaPipe left/right smile blendshape difference | ⭐⭐⭐⭐ |
-| **Hair type** (straight, wavy, curly, coily) | BiSeNet hair segmentation → texture frequency (FFT) + CelebA (`Straight_Hair`, `Wavy_Hair`) | ⭐⭐⭐ |
-| **Hair length** (short, long, bald) | BiSeNet hair mask area + CelebA (`Bald`, `Bangs`) | ⭐⭐⭐⭐ |
-| **Hair color** (black, brown, blonde, red, gray, dyed) | BiSeNet hair mask → K-means color clustering + CelebA (`Black_Hair`, `Brown_Hair`, `Blond_Hair`, `Gray_Hair`) | ⭐⭐⭐⭐ |
-| **Receding hairline / widow's peak** | BiSeNet hair boundary analysis + CelebA (`Receding_Hairline`) | ⭐⭐⭐ |
-| **Beard/facial hair** (full, stubble, goatee, mustache, sideburns, clean-shaven) | BiSeNet parsing lower face + CelebA (`5_o_Clock_Shadow`, `Goatee`, `Mustache`, `No_Beard`, `Sideburns`) | ⭐⭐⭐⭐ |
-| **Skin tone** (light, medium, dark) | BiSeNet skin parsing → mean LAB brightness | ⭐⭐⭐⭐⭐ |
-| **Freckles** | BiSeNet skin mask → small blob detection (contrast) | ⭐⭐⭐ |
-| **Moles / birthmark** | BiSeNet skin mask → dark blob detection | ⭐⭐⭐ |
-| **Scars** | BiSeNet skin mask → linear edge anomaly detection | ⭐⭐ |
-| **Acne** | BiSeNet skin mask → red blob detection or HuggingFace skin model | ⭐⭐⭐ |
-| **Wrinkles / forehead lines** | BiSeNet forehead mask → Gabor filter or Canny edges | ⭐⭐⭐ |
-| **Facial asymmetry** | MediaPipe left/right landmark mirror distance | ⭐⭐⭐⭐⭐ |
-| **Prominent Adam's apple** | Neck region detection (limited accuracy) | ⭐ |
-| **Glasses** | CelebA (`Eyeglasses`) + BiSeNet parsing | ⭐⭐⭐⭐⭐ |
-| **Age** | FairFace (MAE 3.4 years) | ⭐⭐⭐⭐⭐ |
-| **Gender** | FairFace (94.2%) | ⭐⭐⭐⭐⭐ |
-| **Race** | FairFace (93.4%, 7 categories) | ⭐⭐⭐⭐⭐ |
-| **Emotion** | HSEmotion (66.5% AffectNet-8, SOTA) | ⭐⭐⭐⭐ |
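As an illustration of the "mean LAB brightness" method in the skin tone row, the sketch below converts sRGB pixels to CIE L* and buckets the mean. The conversion follows the standard sRGB and CIELAB formulas; the function names and the 65/40 cutoffs are hypothetical and would need calibration on real data.

```python
def srgb_to_lightness(r, g, b):
    """CIE L* (0-100) of an 8-bit sRGB color, D65 white point."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    # Relative luminance Y from linear RGB.
    y = 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)
    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def skin_tone_bucket(pixels, light_cutoff=65.0, dark_cutoff=40.0):
    """Bucket skin pixels by mean L*; the cutoffs are illustrative assumptions."""
    mean_l = sum(srgb_to_lightness(*p) for p in pixels) / len(pixels)
    if mean_l > light_cutoff:
        return "light"
    if mean_l < dark_cutoff:
        return "dark"
    return "medium"
```

Working in L* rather than raw RGB keeps the bucket tied to perceived lightness, although it remains sensitive to scene lighting and exposure.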
----
-## Model Comparison Table
-| Model | Accuracy (benchmark) | Size | Runs in Deno/Browser? | Feature Depth | Notes |
-|-------|----------------|------|----------------------|---------------|-------|
-| **DeepFace** (Python) | 97.4% (VGG-Face) | 500MB+ | ❌ No (Python only) | Age, gender, race, emotion | Too large, wrong runtime |
-| **InsightFace Buffalo_L** | 99.8% (LFW) | ~320MB | ❌ No (Python/C++) | Landmarks, age, gender | Too large for edge |
-| **InsightFace MobileFaceNet** | 99.5% (LFW) | ~4MB | ⚠️ ONNX possible | Recognition only, no attributes | Very small but limited features |
-| **MediaPipe Face Landmarker** | N/A (landmark model) | ~4MB | ✅ Yes (TFJS/WASM) | 478 landmarks, blendshapes | Best for geometric features |
-| **face-api.js** | 99.2% (LFW) | ~6MB (all models) | ✅ Yes (TFJS) | Age, gender, emotion, 68 landmarks | Browser/Node.js ready |
-| **ONNX FER+ (emotion)** | ~85% (FER2013) | ~2MB | ✅ Yes (ONNX.js) | Emotion only | Supplement model |
-| **HuggingFace ViT models** | Varies | 50-350MB | ⚠️ ONNX export possible | Age, gender, various classifiers | Some fit under 50MB |
----
-## Free Hosting Options for the Python Microservice
-| Platform | Free Tier | RAM | Cold Start | Best For |
-|----------|-----------|-----|------------|----------|
-| **Hugging Face Spaces** | Unlimited | 2GB CPU | ~15s | Best free option, runs Gradio/FastAPI |
-| **Railway.app** | $5 credit/month | 512MB | ~5s | Good for always-on API |
-| **Render.com** | 750 hrs/month | 512MB | ~30s | Spins down after 15min inactivity |
-| **Google Cloud Run** | 2M requests/month | 512MB | ~10s | Best scaling, pay-per-request |
-| **Fly.io** | 3 shared VMs | 256MB | ~3s | Low latency, always on |
-**Recommendation: Hugging Face Spaces**: 2GB RAM on the free tier, pre-installed ML libraries, and no monthly usage cap (an idle Space still incurs the ~15s cold start listed in the table). Their Inference API can also serve some models without any hosting.
----
-## Full Implementation
-### Python Microservice
-#### requirements.txt
-```
-fastapi==0.115.0
-uvicorn==0.30.0
-python-multipart==0.0.9
-mediapipe==0.10.14
-insightface==0.7.3
-onnxruntime==1.18.0
-torch==2.3.0
-torchvision==0.18.0
-Pillow==10.4.0
-numpy==1.26.4
-opencv-python-headless==4.10.0.84
-scipy==1.13.0
-scikit-learn==1.5.0
-huggingface-hub==0.23.0
-```
-#### face-service/app.py
-```python
-"""
-Face Analysis Microservice
-Combines multiple models for comprehensive facial attribute detection.
-"""
-import io
-import logging
-from typing import Optional
-import cv2
-import numpy as np
-from fastapi import FastAPI, File, HTTPException, UploadFile
-from fastapi.middleware.cors import CORSMiddleware
-from PIL import Image
-from analyzers.landmark_analyzer import LandmarkAnalyzer
-from analyzers.demographic_analyzer import DemographicAnalyzer
-from analyzers.attribute_analyzer import AttributeAnalyzer
-from analyzers.parsing_analyzer import ParsingAnalyzer
-from analyzers.emotion_analyzer import EmotionAnalyzer
-from analyzers.color_analyzer import ColorAnalyzer
-logging.basicConfig(level=logging.INFO)
-logger = logging.getLogger(__name__)
-app = FastAPI(title="Face Analysis Service", version="2.0.0")
-app.add_middleware(
-    CORSMiddleware,
-    allow_origins=["*"],  # Restrict to known origins in production, especially with allow_credentials=True
-    allow_credentials=True,
-    allow_methods=["*"],
-    allow_headers=["*"],
-)
-# Initialize analyzers lazily
-landmark_analyzer: Optional[LandmarkAnalyzer] = None
-demographic_analyzer: Optional[DemographicAnalyzer] = None
-attribute_analyzer: Optional[AttributeAnalyzer] = None
-parsing_analyzer: Optional[ParsingAnalyzer] = None
-emotion_analyzer: Optional[EmotionAnalyzer] = None
-color_analyzer: Optional[ColorAnalyzer] = None
-def get_analyzers():
-    global landmark_analyzer, demographic_analyzer, attribute_analyzer
-    global parsing_analyzer, emotion_analyzer, color_analyzer
-    if landmark_analyzer is None:
-        logger.info("Loading MediaPipe landmarks...")
-        landmark_analyzer = LandmarkAnalyzer()
-    if demographic_analyzer is None:
-        logger.info("Loading FairFace demographics...")
-        demographic_analyzer = DemographicAnalyzer()
-    if attribute_analyzer is None:
-        logger.info("Loading CelebA attribute classifier...")
-        attribute_analyzer = AttributeAnalyzer()
-    if parsing_analyzer is None:
-        logger.info("Loading BiSeNet face parser...")
-        parsing_analyzer = ParsingAnalyzer()
-    if emotion_analyzer is None:
-        logger.info("Loading HSEmotion...")
-        emotion_analyzer = EmotionAnalyzer()
-    if color_analyzer is None:
-        color_analyzer = ColorAnalyzer()
-    return (
-        landmark_analyzer,
-        demographic_analyzer,
-        attribute_analyzer,
-        parsing_analyzer,
-        emotion_analyzer,
-        color_analyzer,
-    )
-@app.get("/health")
-async def health():
-    return {"status": "ok"}
-@app.post("/analyze")
-async def analyze_face(file: UploadFile = File(...)):
-    """Comprehensive face analysis endpoint."""
-    try:
-        contents = await file.read()
-        image = Image.open(io.BytesIO(contents)).convert("RGB")
-        img_array = np.array(image)
-        img_bgr = cv2.cvtColor(img_array, cv2.COLOR_RGB2BGR)
-        (
-            landmarks,
-            demographics,
-            attributes,
-            parsing,
-            emotions,
-            colors,
-        ) = get_analyzers()
-        results = {}
-        # 1. MediaPipe Landmarks → geometric features
-        logger.info("Running landmark analysis...")
-        landmark_results = landmarks.analyze(img_array)
-        results.update(landmark_results)
-        # 2. FairFace → age, gender, race
-        logger.info("Running demographic analysis...")
-        demo_results = demographics.analyze(img_array)
-        results.update(demo_results)
-        # 3. CelebA attributes → 40 binary facial attributes
-        logger.info("Running attribute analysis...")
-        attr_results = attributes.analyze(img_array)
-        results.update(attr_results)
-        # 4. BiSeNet face parsing → segmentation masks
-        logger.info("Running face parsing...")
-        parse_results = parsing.analyze(img_bgr)
-        results.update(parse_results)
-        # 5. HSEmotion → emotion classification
-        logger.info("Running emotion analysis...")
-        emo_results = emotions.analyze(img_array)
-        results.update(emo_results)
-        # 6. Color analysis using parsing masks
-        logger.info("Running color analysis...")
-        color_results = colors.analyze(
-            img_array,
-            skin_mask=parse_results.get("_skin_mask"),
-            hair_mask=parse_results.get("_hair_mask"),
-            landmark_data=landmark_results.get("_raw_landmarks"),
-        )
-        results.update(color_results)
-        # Remove internal fields
-        results = {k: v for k, v in results.items() if not k.startswith("_")}
-        return {"success": True, "data": results}
-    except Exception as e:
-        logger.error(f"Analysis failed: {e}", exc_info=True)
-        raise HTTPException(status_code=500, detail=str(e))
-```
-#### face-service/analyzers/landmark_analyzer.py
-```python
-"""
-MediaPipe Face Landmarker — 478 3D landmarks + 52 blendshapes
-Derives geometric facial features from landmark positions.
-"""
-import math
-from typing import Any
-import mediapipe as mp
-import numpy as np
-from mediapipe.tasks import python as mp_python
-from mediapipe.tasks.python import vision
-class LandmarkAnalyzer:
-    def __init__(self):
-        base_options = mp_python.BaseOptions(
-            model_asset_path=self._download_model()
-        )
-        options = vision.FaceLandmarkerOptions(
-            base_options=base_options,
-            output_face_blendshapes=True,
-            output_facial_transformation_matrixes=True,
-            num_faces=1,
-        )
-        self.detector = vision.FaceLandmarker.create_from_options(options)
-    def _download_model(self) -> str:
-        import urllib.request
-        import os
-        model_path = "models/face_landmarker.task"
-        if not os.path.exists(model_path):
-            os.makedirs("models", exist_ok=True)
-            url = "https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/latest/face_landmarker.task"
-            urllib.request.urlretrieve(url, model_path)
-        return model_path
-    def analyze(self, img_rgb: np.ndarray) -> dict[str, Any]:
-        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=img_rgb)
-        result = self.detector.detect(mp_image)
-        if not result.face_landmarks:
-            return {"error": "No face detected by MediaPipe"}
-        landmarks = result.face_landmarks[0]
-        lm = [{"x": l.x, "y": l.y, "z": l.z} for l in landmarks]
-        blendshapes = {}
-        if result.face_blendshapes:
-            for bs in result.face_blendshapes[0]:
-                blendshapes[bs.category_name] = round(bs.score, 4)
-        attrs = {}
-        attrs["_raw_landmarks"] = lm
-        # === Face Shape ===
-        face_height = self._dist(lm[10], lm[152])
-        face_width = self._dist(lm[234], lm[454])
-        jaw_width = self._dist(lm[172], lm[397])
-        cheekbone_width = self._dist(lm[93], lm[323])
-        forehead_width = self._dist(lm[54], lm[284])
-        wh_ratio = face_width / face_height if face_height > 0 else 1
-        jaw_to_face = jaw_width / face_width if face_width > 0 else 1
-        forehead_to_jaw = forehead_width / jaw_width if jaw_width > 0 else 1
-        cheek_to_jaw = cheekbone_width / jaw_width if jaw_width > 0 else 1
-        if wh_ratio > 0.85 and jaw_to_face > 0.75:
-            attrs["face_shape"] = "round"
-        elif wh_ratio > 0.8 and jaw_to_face > 0.8 and forehead_to_jaw < 1.1:
-            attrs["face_shape"] = "square"
-        elif wh_ratio < 0.75:
-            attrs["face_shape"] = "oblong"
-        elif forehead_to_jaw > 1.3:
-            attrs["face_shape"] = "heart"
-        elif cheek_to_jaw > 1.25 and forehead_to_jaw < 1.15:
-            attrs["face_shape"] = "diamond"
-        elif forehead_to_jaw < 0.85:
-            attrs["face_shape"] = "triangle"
-        else:
-            attrs["face_shape"] = "oval"
-        attrs["face_shape_metrics"] = {
-            "width_height_ratio": round(wh_ratio, 3),
-            "jaw_to_face_ratio": round(jaw_to_face, 3),
-            "forehead_to_jaw_ratio": round(forehead_to_jaw, 3),
-            "cheekbone_to_jaw_ratio": round(cheek_to_jaw, 3),
-        }
-        # === Forehead ===
-        forehead_ratio = forehead_width / face_width if face_width > 0 else 0.6
-        attrs["forehead_width"] = (
-            "broad" if forehead_ratio > 0.7
-            else "narrow" if forehead_ratio < 0.55
-            else "average"
-        )
-        # === Jawline ===
-        jaw_angle = self._jaw_angle(lm)
-        attrs["jawline_angle"] = round(jaw_angle, 1)
-        if jaw_angle < 110:
-            attrs["jawline_type"] = "sharp"
-        elif jaw_angle > 140:
-            attrs["jawline_type"] = "soft"
-        elif jaw_to_face > 0.75:
-            attrs["jawline_type"] = "strong"
-        else:
-            attrs["jawline_type"] = "soft"
-        # === Chin ===
-        chin_width = self._dist(lm[175], lm[396])
-        chin_ratio = chin_width / jaw_width if jaw_width > 0 else 0.4
-        attrs["chin_type"] = (
-            "pointed" if chin_ratio < 0.3
-            else "wide" if chin_ratio > 0.5
-            else "normal"
-        )
-        # === Cheekbones ===
-        cheek_z = (lm[93]["z"] + lm[323]["z"]) / 2
-        attrs["cheekbone_prominence"] = (
-            "high" if cheek_z < -0.04
-            else "flat" if cheek_z > 0.0
-            else "moderate"
-        )
-        # Hollow vs full cheeks (blendshape-assisted)
-        cheek_puff = blendshapes.get("cheekPuff", 0)
-        cheek_squint_l = blendshapes.get("cheekSquintLeft", 0)
-        cheek_squint_r = blendshapes.get("cheekSquintRight", 0)
-        if cheek_puff > 0.3:
-            attrs["cheek_fullness"] = "full"
-        elif cheek_z > -0.01:
-            attrs["cheek_fullness"] = "hollow"
-        else:
-            attrs["cheek_fullness"] = "normal"
-        # === Eyes ===
-        left_eye_top = lm[159]
-        left_eye_bottom = lm[145]
-        left_eye_inner = lm[133]
-        left_eye_outer = lm[33]
-        eye_openness = self._dist(left_eye_top, left_eye_bottom)
-        eye_width_val = self._dist(left_eye_inner, left_eye_outer)
-        eye_ratio = eye_openness / eye_width_val if eye_width_val > 0 else 0.3
-        outer_angle = left_eye_outer["y"] - left_eye_inner["y"]
-        if outer_angle < -0.012:
-            attrs["eye_shape"] = "upturned"
-        elif outer_angle > 0.012:
-            attrs["eye_shape"] = "downturned"
-        elif eye_ratio > 0.38:
-            attrs["eye_shape"] = "round"
-        elif eye_ratio < 0.2:
-            attrs["eye_shape"] = "hooded"
-        else:
-            attrs["eye_shape"] = "almond"
-        # Deep-set vs protruding
-        eye_z = (lm[159]["z"] + lm[145]["z"]) / 2
-        nose_bridge_z = lm[6]["z"]
-        if eye_z > nose_bridge_z + 0.02:
-            attrs["eye_depth"] = "deep-set"
-        elif eye_z < nose_bridge_z - 0.01:
-            attrs["eye_depth"] = "protruding"
-        else:
-            attrs["eye_depth"] = "normal"
-        # Eye spacing
-        if len(lm) > 473:  # Iris landmarks available
-            inter_pupillary = self._dist(lm[468], lm[473])
-        else:
-            inter_pupillary = self._dist(lm[133], lm[362])
-        ip_ratio = inter_pupillary / face_width if face_width > 0 else 0.35
-        attrs["eye_spacing"] = (
-            "wide-set" if ip_ratio > 0.38
-            else "close-set" if ip_ratio < 0.28
-            else "average"
-        )
-        # Eye size
-        right_eye_top = lm[386]
-        right_eye_bottom = lm[374]
-        right_eye_inner = lm[362]
-        right_eye_outer = lm[263]
-        r_eye_area = self._dist(right_eye_top, right_eye_bottom) * self._dist(right_eye_inner, right_eye_outer)
-        l_eye_area = eye_openness * eye_width_val
-        avg_eye_area = (l_eye_area + r_eye_area) / 2
-        face_area = face_width * face_height
-        eye_size_ratio = avg_eye_area / face_area if face_area > 0 else 0.015
-        attrs["eye_size"] = (
-            "large" if eye_size_ratio > 0.02
-            else "small" if eye_size_ratio < 0.012
-            else "average"
-        )
-        # Eye blink (closed vs open)
-        blink_l = blendshapes.get("eyeBlinkLeft", 0)
-        blink_r = blendshapes.get("eyeBlinkRight", 0)
-        attrs["eyes_open"] = (blink_l + blink_r) / 2 < 0.5
-        # === Eyebrows ===
-        brow_mid_l = lm[105]
-        brow_outer_l = lm[46]
-        brow_inner_l = lm[70]
-        brow_to_eye = self._dist(brow_mid_l, lm[159])
-        brow_arch_ratio = brow_to_eye / eye_openness if eye_openness > 0 else 1.5
-        attrs["eyebrow_arch_height"] = (
-            "high" if brow_arch_ratio > 2.2
-            else "low" if brow_arch_ratio < 1.3
-            else "average"
-        )
-        # Brow curvature
-        mid_y = brow_mid_l["y"]
-        avg_end_y = (brow_inner_l["y"] + brow_outer_l["y"]) / 2
-        curvature = mid_y - avg_end_y
-        if abs(curvature) < 0.003:
-            attrs["eyebrow_shape"] = "straight"
-        elif curvature < -0.008:
-            attrs["eyebrow_shape"] = "arched"
-        else:
-            attrs["eyebrow_shape"] = "flat"
-        # Eyebrow thickness (vertical span of brow landmarks)
-        brow_top = lm[66]  # Top of left brow
-        brow_bottom = lm[105]  # Bottom of left brow
-        brow_thickness = self._dist(brow_top, brow_bottom)
-        attrs["eyebrow_thickness"] = (
-            "thick" if brow_thickness > 0.015
-            else "thin" if brow_thickness < 0.008
-            else "medium"
-        )
-        # Unibrow detection
-        inner_brow_dist = self._dist(lm[70], lm[300])
-        attrs["possible_unibrow"] = inner_brow_dist < 0.04
-        # === Nose ===
-        nose_bridge_top = lm[6]
-        nose_tip = lm[1]
-        nose_bottom = lm[2]
-        left_nostril = lm[129]
-        right_nostril = lm[358]
-        nostril_w = self._dist(left_nostril, right_nostril)
-        nw_ratio = nostril_w / face_width if face_width > 0 else 0.24
-        attrs["nostril_width"] = (
-            "wide" if nw_ratio > 0.28
-            else "narrow" if nw_ratio < 0.2
-            else "average"
-        )
-        tip_angle = nose_tip["y"] - nose_bottom["y"]
-        if tip_angle < -0.005:
-            attrs["nose_shape"] = "upturned"
-        elif tip_angle > 0.01:
-            attrs["nose_shape"] = "aquiline"
-        elif nw_ratio > 0.28:
-            attrs["nose_shape"] = "wide"
-        elif nw_ratio < 0.2:
-            attrs["nose_shape"] = "narrow"
-        else:
-            attrs["nose_shape"] = "straight"
-        attrs["nose_bridge"] = (
-            "high" if nose_bridge_top["z"] < -0.05
-            else "flat" if nose_bridge_top["z"] > 0.0
-            else "average"
-        )
-        attrs["nose_tip_shape"] = (
-            "pointed" if nose_tip["z"] < nose_bottom["z"] - 0.01
-            else "rounded"
-        )
-        # === Lips & Mouth ===
-        upper_lip_top = lm[0]
-        upper_lip_bottom = lm[13]
-        lower_lip_top = lm[14]
-        lower_lip_bottom = lm[17]
-        mouth_left = lm[61]
-        mouth_right = lm[291]
-        upper_lip_h = self._dist(upper_lip_top, upper_lip_bottom)
-        lower_lip_h = self._dist(lower_lip_top, lower_lip_bottom)
-        total_lip_h = upper_lip_h + lower_lip_h
-        mouth_w = self._dist(mouth_left, mouth_right)
-        lip_ratio = total_lip_h / mouth_w if mouth_w > 0 else 0.3
-        attrs["lip_fullness"] = (
-            "full" if lip_ratio > 0.38
-            else "thin" if lip_ratio < 0.22
-            else "average"
-        )
-        attrs["lip_balance"] = (
-            "top-heavy" if upper_lip_h > lower_lip_h * 1.2
-            else "bottom-heavy" if lower_lip_h > upper_lip_h * 1.2
-            else "balanced"
-        )
-        mw_ratio = mouth_w / face_width if face_width > 0 else 0.37
-        attrs["mouth_width"] = (
-            "wide" if mw_ratio > 0.42
-            else "small" if mw_ratio < 0.32
-            else "average"
-        )
-        # Cupid's bow
-        cupid_left = lm[37]
-        cupid_center = lm[0]
-        cupid_right = lm[267]
-        bow_depth = cupid_center["y"] - (cupid_left["y"] + cupid_right["y"]) / 2
-        attrs["cupids_bow"] = (
-            "defined" if bow_depth > 0.005
-            else "subtle" if bow_depth > 0.002
-            else "flat"
-        )
-        # Smile
-        smile_l = blendshapes.get("mouthSmileLeft", 0)
-        smile_r = blendshapes.get("mouthSmileRight", 0)
-        attrs["smiling"] = (smile_l + smile_r) / 2 > 0.4
-        attrs["smile_asymmetry"] = round(abs(smile_l - smile_r), 3)
-        # Dimples (heuristic: strong smile with low cheek puff)
-        attrs["possible_dimples"] = (
-            (smile_l > 0.5 or smile_r > 0.5) and cheek_puff < 0.2
-        )
-        # === Facial Asymmetry ===
-        symmetry_pairs = [
-            (33, 263), (133, 362), (70, 300), (93, 323), (172, 397),
-            (61, 291), (159, 386), (145, 374), (46, 276),
-        ]
-        asymmetry_sum = 0.0
-        for li, ri in symmetry_pairs:
-            left_dist = abs(lm[li]["x"] - 0.5)
-            right_dist = abs(lm[ri]["x"] - 0.5)
-            asymmetry_sum += abs(left_dist - right_dist)
-        attrs["facial_asymmetry_score"] = round(
-            min(asymmetry_sum / len(symmetry_pairs) / 0.05, 1.0), 3
-        )
-        # === Raw blendshape scores (head pose from the transformation matrix is not computed here) ===
-        attrs["blendshapes"] = blendshapes
-        return attrs
-    def _dist(self, a: dict, b: dict) -> float:
-        return math.sqrt(
-            (a["x"] - b["x"]) ** 2
-            + (a["y"] - b["y"]) ** 2
-            + (a.get("z", 0) - b.get("z", 0)) ** 2
-        )
-    def _jaw_angle(self, lm: list[dict]) -> float:
-        chin = lm[152]
-        left_jaw = lm[172]
-        right_jaw = lm[397]
-        v1 = (left_jaw["x"] - chin["x"], left_jaw["y"] - chin["y"])
-        v2 = (right_jaw["x"] - chin["x"], right_jaw["y"] - chin["y"])
-        dot = v1[0] * v2[0] + v1[1] * v2[1]
-        mag1 = math.sqrt(v1[0] ** 2 + v1[1] ** 2)
-        mag2 = math.sqrt(v2[0] ** 2 + v2[1] ** 2)
-        if mag1 * mag2 == 0:
-            return 120.0
-        cos_angle = max(-1, min(1, dot / (mag1 * mag2)))
-        return math.acos(cos_angle) * (180 / math.pi)
-```
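The asymmetry score in the listing above measures each landmark's distance from `x = 0.5`, which assumes the face is centered in the frame. A variant that measures from an estimated facial midline can be sketched as follows. This is only a sketch: `asymmetry_score` is a hypothetical function, and using landmarks 10 and 152 (the forehead and chin points the listing uses for `face_height`) as the midline is an assumption.

```python
def asymmetry_score(lm, pairs, midline_ids=(10, 152), scale=0.05):
    """Mean left/right offset mismatch relative to the face midline, clipped to [0, 1]."""
    # Estimate the vertical midline from forehead-top and chin x-coordinates.
    midline_x = sum(lm[i]["x"] for i in midline_ids) / len(midline_ids)
    total = 0.0
    for left_id, right_id in pairs:
        left_offset = abs(lm[left_id]["x"] - midline_x)
        right_offset = abs(lm[right_id]["x"] - midline_x)
        total += abs(left_offset - right_offset)
    return round(min(total / len(pairs) / scale, 1.0), 3)
```

For a perfectly symmetric face positioned off-center in the image, this variant returns 0.0, whereas measuring from `x = 0.5` would report a large asymmetry. Head rotation (yaw) still biases both approaches.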
-#### face-service/analyzers/demographic_analyzer.py
-```python
-"""
-FairFace — Age, Gender, Race prediction
-Classifier trained on the race-balanced FairFace dataset.
-"""
-import os
-from typing import Any
-import cv2
-import numpy as np
-import torch
-import torchvision.transforms as transforms
-from huggingface_hub import hf_hub_download
-from PIL import Image
-from torchvision import models
-class DemographicAnalyzer:
-    """FairFace-based age, gender, race classifier."""
-    AGE_LABELS = [
-        "0-2", "3-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"
-    ]
-    GENDER_LABELS = ["Male", "Female"]
-    RACE_LABELS = [
-        "White", "Black", "Latino_Hispanic", "East Asian",
-        "Southeast Asian", "Indian", "Middle Eastern"
-    ]
-    def __init__(self):
-        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-        self.model = self._load_model()
-        self.transform = transforms.Compose([
-            transforms.Resize((224, 224)),
-            transforms.ToTensor(),
-            transforms.Normalize(
-                mean=[0.485, 0.456, 0.406],
-                std=[0.229, 0.224, 0.225],
-            ),
-        ])
-    def _load_model(self):
-        """Load FairFace ResNet34 model."""
-        model_path = "models/fairface_model.pt"
-        if not os.path.exists(model_path):
-            os.makedirs("models", exist_ok=True)
-            # Download from HuggingFace mirror or original source
-            # FairFace official weights: res34_fair_align_multi_7_20190809.pt
-            try:
-                hf_hub_download(
-                    repo_id="dchen236/FairFace",
-                    filename="res34_fair_align_multi_7_20190809.pt",
-                    local_dir="models",
-                    local_dir_use_symlinks=False,
-                )
-                os.rename(
-                    "models/res34_fair_align_multi_7_20190809.pt",
-                    model_path,
-                )
-            except Exception:
-                # No automatic fallback: ask the user to fetch the weights manually
-                raise FileNotFoundError(
-                    "Please download FairFace weights from "
-                    "https://github.com/dchen236/FairFace and place at models/fairface_model.pt"
-                )
-        model = models.resnet34(weights=None)
-        # FairFace has 3 output heads: race(7), gender(2), age(9) = 18
-        model.fc = torch.nn.Linear(model.fc.in_features, 18)
-        model.load_state_dict(torch.load(model_path, map_location=self.device))
-        model.to(self.device)
-        model.eval()
-        return model
-    def analyze(self, img_rgb: np.ndarray) -> dict[str, Any]:
-        """Predict age, gender, and race."""
-        pil_image = Image.fromarray(img_rgb)
-        input_tensor = self.transform(pil_image).unsqueeze(0).to(self.device)
-        with torch.no_grad():
-            outputs = self.model(input_tensor)
-        outputs = outputs.cpu().numpy()[0]
-        # Split outputs: race(0-6), gender(7-8), age(9-17)
-        race_logits = outputs[0:7]
-        gender_logits = outputs[7:9]
-        age_logits = outputs[9:18]
-        race_probs = self._softmax(race_logits)
-        gender_probs = self._softmax(gender_logits)
-        age_probs = self._softmax(age_logits)
-        race_idx = int(np.argmax(race_probs))
-        gender_idx = int(np.argmax(gender_probs))
-        age_idx = int(np.argmax(age_probs))
-        # Estimate numeric age from bucket
-        age_ranges = [(0, 2), (3, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 85)]
-        age_estimate = sum(age_ranges[age_idx]) / 2
-        return {
-            "age_estimate": round(age_estimate, 1),
-            "age_range": self.AGE_LABELS[age_idx],
-            "age_confidence": round(float(age_probs[age_idx]), 3),
-            "gender": self.GENDER_LABELS[gender_idx].lower(),
-            "gender_confidence": round(float(gender_probs[gender_idx]), 3),
-            "race": self.RACE_LABELS[race_idx],
-            "race_confidence": round(float(race_probs[race_idx]), 3),
-            "race_probabilities": {
-                label: round(float(prob), 3)
-                for label, prob in zip(self.RACE_LABELS, race_probs)
-            },
-        }
-    @staticmethod
-    def _softmax(x: np.ndarray) -> np.ndarray:
-        e_x = np.exp(x - np.max(x))
-        return e_x / e_x.sum()
-```
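The head-splitting logic in `analyze` can be exercised without PyTorch. The sketch below mirrors the layout described in the listing (race in positions 0-6, gender in 7-8, age in 9-17) and the bucket-midpoint age estimate; `split_heads` is a hypothetical helper name, and the 85 upper bound for the "70+" bucket is the listing's own assumption.

```python
import math

AGE_RANGES = [(0, 2), (3, 9), (10, 19), (20, 29), (30, 39),
              (40, 49), (50, 59), (60, 69), (70, 85)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def split_heads(logits):
    """Split an 18-value output into race/gender/age heads and pick the top class."""
    if len(logits) != 18:
        raise ValueError("expected 18 logits: race(7) + gender(2) + age(9)")
    race = softmax(logits[0:7])
    gender = softmax(logits[7:9])
    age = softmax(logits[9:18])
    age_idx = max(range(len(age)), key=age.__getitem__)
    low, high = AGE_RANGES[age_idx]
    return {
        "race_idx": max(range(len(race)), key=race.__getitem__),
        "gender_idx": max(range(len(gender)), key=gender.__getitem__),
        "age_idx": age_idx,
        "age_estimate": (low + high) / 2,  # midpoint of the winning bucket
        "race_probs": race,
    }
```

Each head gets its own softmax because the three predictions are independent classification problems that happen to share one output vector. The midpoint estimate is coarse: it can only take nine distinct values.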
-#### face-service/analyzers/attribute_analyzer.py
-```python
-"""
-CelebA Multi-Label Attribute Classifier
-Predicts 40 binary facial attributes from CelebA-trained model.
-Uses a pretrained model from HuggingFace.
-"""
-import os
-from typing import Any
-import numpy as np
-import torch
-import torchvision.transforms as transforms
-from PIL import Image
-CELEBA_ATTRIBUTES = [
-    "5_o_Clock_Shadow", "Arched_Eyebrows", "Attractive", "Bags_Under_Eyes",
-    "Bald", "Bangs", "Big_Lips", "Big_Nose", "Black_Hair", "Blond_Hair",
-    "Blurry", "Brown_Hair", "Bushy_Eyebrows", "Chubby", "Double_Chin",
-    "Eyeglasses", "Goatee", "Gray_Hair", "Heavy_Makeup", "High_Cheekbones",
-    "Male", "Mouth_Slightly_Open", "Mustache", "Narrow_Eyes", "No_Beard",
-    "Oval_Face", "Pale_Skin", "Pointy_Nose", "Receding_Hairline",
-    "Rosy_Cheeks", "Sideburns", "Smiling", "Straight_Hair", "Wavy_Hair",
-    "Wearing_Earrings", "Wearing_Hat", "Wearing_Lipstick", "Wearing_Necklace",
-    "Wearing_Necktie", "Young",
-]
-class AttributeAnalyzer:
-    """CelebA 40-attribute binary classifier using a fine-tuned ResNet."""
-    def __init__(self):
-        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-        self.model = self._load_model()
-        self.transform = transforms.Compose([
-            transforms.Resize((224, 224)),
-            transforms.ToTensor(),
-            transforms.Normalize(
-                mean=[0.485, 0.456, 0.406],
-                std=[0.229, 0.224, 0.225],
-            ),
-        ])
-    def _load_model(self):
-        """
-        Load a CelebA attribute prediction model.
-        Using a ResNet-18 fine-tuned on CelebA for 40 attributes.
-        """
-        from torchvision import models
-        model_path = "models/celeba_resnet18.pt"
-        if not os.path.exists(model_path):
-            os.makedirs("models", exist_ok=True)
-            # Try loading from HuggingFace
-            try:
-                from huggingface_hub import hf_hub_download
-                hf_hub_download(
-                    repo_id="jnferreira/attribute-prediction-celebA",
-                    filename="model.pt",
-                    local_dir="models",
-                    local_dir_use_symlinks=False,
-                )
-                os.rename("models/model.pt", model_path)
-            except Exception:
-                # Fallback: build a fresh model skeleton
-                # Users will need to train or provide weights
-                model = models.resnet18(pretrained=True)
-                model.fc = torch.nn.Linear(model.fc.in_features, 40)
-                torch.save(model.state_dict(), model_path)
-                print(
-                    "WARNING: Using ImageNet-pretrained ResNet18 without CelebA fine-tuning. "
-                    "Attribute predictions will be inaccurate. "
-                    "Please provide CelebA-trained weights at models/celeba_resnet18.pt"
-                )
-        model = models.resnet18(pretrained=False)
-        model.fc = torch.nn.Linear(model.fc.in_features, 40)
-        model.load_state_dict(
-            torch.load(model_path, map_location=self.device)
-        )
-        model.to(self.device)
-        model.eval()
-        return model
-    def analyze(self, img_rgb: np.ndarray) -> dict[str, Any]:
-        pil_image = Image.fromarray(img_rgb)
-        input_tensor = self.transform(pil_image).unsqueeze(0).to(self.device)
-        with torch.no_grad():
-            logits = self.model(input_tensor)
-        probs = torch.sigmoid(logits).cpu().numpy()[0]
-        # Build structured results
-        raw_attrs = {
-            attr: round(float(prob), 3)
-            for attr, prob in zip(CELEBA_ATTRIBUTES, probs)
-        }
-        # Interpret into user-friendly categories
-        result: dict[str, Any] = {"celeba_raw": raw_attrs}
-        # Hair color (pick highest confidence)
-        hair_colors = {
-            "black": raw_attrs.get("Black_Hair", 0),
-            "brown": raw_attrs.get("Brown_Hair", 0),
-            "blonde": raw_attrs.get("Blond_Hair", 0),
-            "gray": raw_attrs.get("Gray_Hair", 0),
-        }
-        result["hair_color_celeba"] = max(hair_colors, key=hair_colors.get)
-        # Hair type
-        if raw_attrs.get("Straight_Hair", 0) > 0.5:
-            result["hair_type_celeba"] = "straight"
-        elif raw_attrs.get("Wavy_Hair", 0) > 0.5:
-            result["hair_type_celeba"] = "wavy"
-        else:
-            result["hair_type_celeba"] = "unknown"
-        result["bald"] = raw_attrs.get("Bald", 0) > 0.5
-        result["bangs"] = raw_attrs.get("Bangs", 0) > 0.5
-        result["receding_hairline"] = raw_attrs.get("Receding_Hairline", 0) > 0.5
-        # Facial hair
-        has_beard = raw_attrs.get("No_Beard", 0) < 0.5
-        has_goatee = raw_attrs.get("Goatee", 0) > 0.5
-        has_mustache = raw_attrs.get("Mustache", 0) > 0.5
-        has_sideburns = raw_attrs.get("Sideburns", 0) > 0.5
-        has_stubble = raw_attrs.get("5_o_Clock_Shadow", 0) > 0.5
-        if has_goatee:
-            result["facial_hair"] = "goatee"
-        elif has_mustache and has_beard:
-            result["facial_hair"] = "full_beard"
-        elif has_mustache:
-            result["facial_hair"] = "mustache"
-        elif has_sideburns:
-            result["facial_hair"] = "sideburns"
-        elif has_stubble:
-            result["facial_hair"] = "stubble"
-        elif not has_beard:
-            result["facial_hair"] = "clean_shaven"
-        else:
-            result["facial_hair"] = "beard"
-        # Appearance attributes
-        result["wearing_glasses"] = raw_attrs.get("Eyeglasses", 0) > 0.5
-        result["wearing_hat"] = raw_attrs.get("Wearing_Hat", 0) > 0.5
-        result["bushy_eyebrows"] = raw_attrs.get("Bushy_Eyebrows", 0) > 0.5
-        result["arched_eyebrows_celeba"] = raw_attrs.get("Arched_Eyebrows", 0) > 0.5
-        result["bags_under_eyes"] = raw_attrs.get("Bags_Under_Eyes", 0) > 0.5
-        result["high_cheekbones_celeba"] = raw_attrs.get("High_Cheekbones", 0) > 0.5
-        result["oval_face_celeba"] = raw_attrs.get("Oval_Face", 0) > 0.5
-        result["pointy_nose_celeba"] = raw_attrs.get("Pointy_Nose", 0) > 0.5
-        result["big_lips_celeba"] = raw_attrs.get("Big_Lips", 0) > 0.5
-        result["big_nose_celeba"] = raw_attrs.get("Big_Nose", 0) > 0.5
-        result["narrow_eyes_celeba"] = raw_attrs.get("Narrow_Eyes", 0) > 0.5
-        result["double_chin"] = raw_attrs.get("Double_Chin", 0) > 0.5
-        result["chubby"] = raw_attrs.get("Chubby", 0) > 0.5
-        result["rosy_cheeks"] = raw_attrs.get("Rosy_Cheeks", 0) > 0.5
-        result["pale_skin"] = raw_attrs.get("Pale_Skin", 0) > 0.5
-        result["young"] = raw_attrs.get("Young", 0) > 0.5
-        result["smiling_celeba"] = raw_attrs.get("Smiling", 0) > 0.5
-        result["mouth_open"] = raw_attrs.get("Mouth_Slightly_Open", 0) > 0.5
-        return result
-```
-#### face-service/analyzers/parsing_analyzer.py
-```python
-"""
-BiSeNet Face Parsing — 19-class semantic segmentation of the face.
-Segments: skin, eyebrows, eyes, nose, lips, hair, ears, neck, etc.
-"""
-import os
-from typing import Any
-import cv2
-import numpy as np
-import torch
-from torchvision import transforms
-class ParsingAnalyzer:
-    """
-    BiSeNet face parsing for hair/skin/feature segmentation.
-    Parsing classes:
-    0: background, 1: skin, 2: l_brow, 3: r_brow, 4: l_eye, 5: r_eye,
-    6: eye_g (glasses), 7: l_ear, 8: r_ear, 9: ear_r (earring),
-    10: nose, 11: mouth, 12: u_lip, 13: l_lip, 14: neck,
-    15: necklace, 16: cloth, 17: hair, 18: hat
-    """
-    LABELS = {
-        0: "background", 1: "skin", 2: "left_brow", 3: "right_brow",
-        4: "left_eye", 5: "right_eye", 6: "glasses", 7: "left_ear",
-        8: "right_ear", 9: "earring", 10: "nose", 11: "mouth",
-        12: "upper_lip", 13: "lower_lip", 14: "neck", 15: "necklace",
-        16: "cloth", 17: "hair", 18: "hat",
-    }
-    def __init__(self):
-        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-        self.model = self._load_model()
-        self.transform = transforms.Compose([
-            transforms.ToTensor(),
-            transforms.Normalize(
-                mean=[0.485, 0.456, 0.406],
-                std=[0.229, 0.224, 0.225],
-            ),
-        ])
-    def _load_model(self):
-        model_path = "models/bisenet_face_parsing.pt"
-        if not os.path.exists(model_path):
-            os.makedirs("models", exist_ok=True)
-            # BiSeNet model from face-parsing.PyTorch
-            # Download from: https://drive.google.com/file/d/154JgKpzCPW82qINcVieuPH3fZ2e0P812
-            raise FileNotFoundError(
-                "Please download BiSeNet face parsing weights from "
-                "https://github.com/zllrunning/face-parsing.PyTorch and place at "
-                "models/bisenet_face_parsing.pt"
-            )
-        from models.bisenet_model import BiSeNet  # You'll need to include this
-        model = BiSeNet(n_classes=19)
-        model.load_state_dict(
-            torch.load(model_path, map_location=self.device)
-        )
-        model.to(self.device)
-        model.eval()
-        return model
-    def analyze(self, img_bgr: np.ndarray) -> dict[str, Any]:
-        h, w = img_bgr.shape[:2]
-        img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
-        img_resized = cv2.resize(img_rgb, (512, 512))
-        input_tensor = self.transform(img_resized).unsqueeze(0).to(self.device)
-        with torch.no_grad():
-            output = self.model(input_tensor)[0]  # BiSeNet returns tuple
-        parsing = output.squeeze(0).argmax(0).cpu().numpy()
-        parsing = cv2.resize(
-            parsing.astype(np.uint8), (w, h), interpolation=cv2.INTER_NEAREST
-        )
-        # Generate masks
-        skin_mask = (parsing == 1).astype(np.uint8)
-        hair_mask = (parsing == 17).astype(np.uint8)
-        glasses_mask = (parsing == 6).astype(np.uint8)
-        hat_mask = (parsing == 18).astype(np.uint8)
-        # Facial hair detection: look for dark pixels in lower face skin region
-        lower_face = parsing[int(h * 0.55):int(h * 0.85), int(w * 0.25):int(w * 0.75)]
-        lower_skin = (lower_face == 1).sum()
-        total_lower = lower_face.size or 1
-        # Region stats
-        hair_area = hair_mask.sum() / (h * w)
-        skin_area = skin_mask.sum() / (h * w)
-        result: dict[str, Any] = {
-            "_skin_mask": skin_mask,
-            "_hair_mask": hair_mask,
-            "has_glasses_parsing": int(glasses_mask.sum()) > 100,
-            "wearing_hat_parsing": int(hat_mask.sum()) > 500,
-            "hair_coverage": round(float(hair_area), 3),
-            "skin_coverage": round(float(skin_area), 3),
-        }
-        # Hair length estimation from mask
-        if hair_area < 0.01:
-            result["hair_length_estimate"] = "bald"
-        elif hair_area < 0.08:
-            result["hair_length_estimate"] = "short"
-        elif hair_area < 0.18:
-            result["hair_length_estimate"] = "medium"
-        else:
-            result["hair_length_estimate"] = "long"
-        # Wrinkle analysis on forehead skin
-        forehead_region = img_bgr[int(h * 0.05):int(h * 0.25), int(w * 0.3):int(w * 0.7)]
-        forehead_skin = skin_mask[int(h * 0.05):int(h * 0.25), int(w * 0.3):int(w * 0.7)]
-        if forehead_skin.sum() > 100:
-            gray_forehead = cv2.cvtColor(forehead_region, cv2.COLOR_BGR2GRAY)
-            # Apply mask
-            gray_forehead = cv2.bitwise_and(gray_forehead, gray_forehead, mask=forehead_skin)
-            edges = cv2.Canny(gray_forehead, 30, 80)
-            edge_density = edges.sum() / (forehead_skin.sum() * 255 + 1)
-            result["forehead_wrinkle_score"] = round(float(edge_density), 3)
-            result["forehead_wrinkles"] = (
-                "heavy" if edge_density > 0.15
-                else "moderate" if edge_density > 0.08
-                else "mild" if edge_density > 0.04
-                else "none"
-            )
-        # Freckles/moles detection on skin
-        skin_region = cv2.bitwise_and(img_bgr, img_bgr, mask=skin_mask)
-        gray_skin = cv2.cvtColor(skin_region, cv2.COLOR_BGR2GRAY)
-        # Detect dark spots
-        _, dark_spots = cv2.threshold(gray_skin, 80, 255, cv2.THRESH_BINARY_INV)
-        dark_spots = cv2.bitwise_and(dark_spots, dark_spots, mask=skin_mask)
-        # Find contours of dark spots
-        contours, _ = cv2.findContours(dark_spots, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
-        small_spots = [c for c in contours if 5 < cv2.contourArea(c) < 200]
-        result["possible_freckles_moles"] = len(small_spots) > 10
-        result["dark_spot_count"] = len(small_spots)
-        return result
-```
-#### face-service/analyzers/emotion_analyzer.py
-```python
-"""
-HSEmotion — State-of-the-art facial emotion recognition.
-Supports 8 emotions on AffectNet.
-"""
-import os
-from typing import Any
-import cv2
-import numpy as np
-import torch
-import torchvision.transforms as transforms
-from PIL import Image
-class EmotionAnalyzer:
-    """HSEmotion-based facial expression classifier."""
-    EMOTION_LABELS = [
-        "angry", "contempt", "disgust", "fear",
-        "happy", "neutral", "sad", "surprise",
-    ]
-    def __init__(self):
-        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-        self.model = self._load_model()
-        self.transform = transforms.Compose([
-            transforms.Resize((260, 260)),
-            transforms.CenterCrop(224),
-            transforms.ToTensor(),
-            transforms.Normalize(
-                mean=[0.485, 0.456, 0.406],
-                std=[0.229, 0.224, 0.225],
-            ),
-        ])
-    def _load_model(self):
-        """Load HSEmotion EfficientNet model."""
-        model_path = "models/hsemotion_enet_b0_8.pt"
-        if not os.path.exists(model_path):
-            os.makedirs("models", exist_ok=True)
-            try:
-                from huggingface_hub import hf_hub_download
-                # HSEmotion models available at:
-                # https://github.com/HSE-asavchenko/face-emotion-recognition
-                hf_hub_download(
-                    repo_id="HSE-asavchenko/hsemotion",
-                    filename="enet_b0_8_best_afew.pt",
-                    local_dir="models",
-                    local_dir_use_symlinks=False,
-                )
-                os.rename("models/enet_b0_8_best_afew.pt", model_path)
-            except Exception:
-                raise FileNotFoundError(
-                    "Please download HSEmotion weights from "
-                    "https://github.com/HSE-asavchenko/face-emotion-recognition"
-                )
-        import timm
-        model = timm.create_model("efficientnet_b0", pretrained=False, num_classes=8)
-        model.load_state_dict(torch.load(model_path, map_location=self.device))
-        model.to(self.device)
-        model.eval()
-        return model
-    def analyze(self, img_rgb: np.ndarray) -> dict[str, Any]:
-        pil_image = Image.fromarray(img_rgb)
-        input_tensor = self.transform(pil_image).unsqueeze(0).to(self.device)
-        with torch.no_grad():
-            logits = self.model(input_tensor)
-        probs = torch.softmax(logits, dim=1).cpu().numpy()[0]
-        top_idx = int(np.argmax(probs))
-        return {
-            "emotion": self.EMOTION_LABELS[top_idx],
-            "emotion_confidence": round(float(probs[top_idx]), 3),
-            "emotion_probabilities": {
-                label: round(float(prob), 3)
-                for label, prob in zip(self.EMOTION_LABELS, probs)
-            },
-        }
-```
-#### face-service/analyzers/color_analyzer.py
-```python
-"""
-Pixel-level color analysis using segmentation masks from BiSeNet
-and landmark positions from MediaPipe.
-"""
-from typing import Any, Optional
-import cv2
-import numpy as np
-from sklearn.cluster import KMeans
-class ColorAnalyzer:
-    """Analyzes skin tone, eye color, and hair color from pixel data."""
-    def analyze(
-        self,
-        img_rgb: np.ndarray,
-        skin_mask: Optional[np.ndarray] = None,
-        hair_mask: Optional[np.ndarray] = None,
-        landmark_data: Optional[list[dict]] = None,
-    ) -> dict[str, Any]:
-        h, w = img_rgb.shape[:2]
-        results: dict[str, Any] = {}
-        # === Skin Tone ===
-        if skin_mask is not None and skin_mask.sum() > 100:
-            skin_pixels = img_rgb[skin_mask > 0]
-            # Convert to LAB for perceptually uniform brightness
-            skin_lab = cv2.cvtColor(
-                skin_pixels.reshape(-1, 1, 3), cv2.COLOR_RGB2LAB
-            ).reshape(-1, 3)
-            avg_l = float(skin_lab[:, 0].mean())  # L channel (brightness)
-            if avg_l > 180:
-                results["skin_tone"] = "very_light"
-            elif avg_l > 155:
-                results["skin_tone"] = "light"
-            elif avg_l > 130:
-                results["skin_tone"] = "medium_light"
-            elif avg_l > 105:
-                results["skin_tone"] = "medium"
-            elif avg_l > 80:
-                results["skin_tone"] = "medium_dark"
-            else:
-                results["skin_tone"] = "dark"
-            results["skin_tone_score"] = round(avg_l / 255, 3)
-            # Fitzpatrick scale approximation
-            if avg_l > 170:
-                results["fitzpatrick_type"] = "I"
-            elif avg_l > 145:
-                results["fitzpatrick_type"] = "II"
-            elif avg_l > 120:
-                results["fitzpatrick_type"] = "III"
-            elif avg_l > 95:
-                results["fitzpatrick_type"] = "IV"
-            elif avg_l > 70:
-                results["fitzpatrick_type"] = "V"
-            else:
-                results["fitzpatrick_type"] = "VI"
-        # === Hair Color ===
-        if hair_mask is not None and hair_mask.sum() > 500:
-            hair_pixels = img_rgb[hair_mask > 0]
-            # K-means to find dominant hair color
-            if len(hair_pixels) > 100:
-                sample_size = min(5000, len(hair_pixels))
-                indices = np.random.choice(len(hair_pixels), sample_size, replace=False)
-                sampled = hair_pixels[indices].astype(np.float32)
-                kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
-                kmeans.fit(sampled)
-                # Pick the cluster with most members
-                labels, counts = np.unique(kmeans.labels_, return_counts=True)
-                dominant_idx = labels[np.argmax(counts)]
-                dominant_color = kmeans.cluster_centers_[dominant_idx].astype(int)
-                r, g, b = dominant_color
-                brightness = (int(r) + int(g) + int(b)) / 3
-                # Classify hair color
-                hsv_color = cv2.cvtColor(
-                    np.array([[dominant_color]], dtype=np.uint8), cv2.COLOR_RGB2HSV
-                )[0][0]
-                hue, sat, val = int(hsv_color[0]), int(hsv_color[1]), int(hsv_color[2])
-                if brightness < 40:
-                    results["hair_color_detected"] = "black"
-                elif brightness > 190:
-                    results["hair_color_detected"] = "platinum_blonde"
-                elif brightness > 160 and sat < 50:
-                    results["hair_color_detected"] = "gray"
-                elif brightness > 140 and (hue > 15 and hue < 35):
-                    results["hair_color_detected"] = "blonde"
-                elif (hue < 15 or hue > 160) and sat > 80:
-                    results["hair_color_detected"] = "red"
-                elif brightness > 60:
-                    results["hair_color_detected"] = "brown"
-                else:
-                    results["hair_color_detected"] = "dark_brown"
-                results["hair_dominant_rgb"] = [int(r), int(g), int(b)]
-            # Hair texture analysis (FFT-based)
-            hair_region = cv2.bitwise_and(
-                img_rgb,
-                img_rgb,
-                mask=hair_mask,
-            )
-            gray_hair = cv2.cvtColor(hair_region, cv2.COLOR_RGB2GRAY)
-            # Mask out non-hair regions
-            gray_hair_masked = gray_hair[hair_mask > 0]
-            if len(gray_hair_masked) > 1000:
-                # Compute local variance as texture indicator
-                # High frequency = curly, low frequency = straight
-                hair_patch = gray_hair_masked[:1024].astype(np.float32)
-                fft = np.fft.fft(hair_patch)
-                magnitude = np.abs(fft)
-                # Ratio of high freq to low freq energy
-                low_freq = magnitude[:len(magnitude) // 4].sum()
-                high_freq = magnitude[len(magnitude) // 4:].sum()
-                freq_ratio = high_freq / (low_freq + 1e-6)
-                if freq_ratio > 0.8:
-                    results["hair_texture_detected"] = "curly"
-                elif freq_ratio > 0.5:
-                    results["hair_texture_detected"] = "wavy"
-                else:
-                    results["hair_texture_detected"] = "straight"
-        # === Eye Color ===
-        if landmark_data is not None and len(landmark_data) > 473:
-            for eye_name, iris_idx in [("left", 468), ("right", 473)]:
-                ix = int(landmark_data[iris_idx]["x"] * w)
-                iy = int(landmark_data[iris_idx]["y"] * h)
-                # Sample a small patch around iris
-                pad = 3
-                y1 = max(0, iy - pad)
-                y2 = min(h, iy + pad)
-                x1 = max(0, ix - pad)
-                x2 = min(w, ix + pad)
-                iris_patch = img_rgb[y1:y2, x1:x2]
-                if iris_patch.size == 0:
-                    continue
-                avg_color = iris_patch.mean(axis=(0, 1))
-                r, g, b = avg_color
-                # Convert to HSV for better classification
-                hsv = cv2.cvtColor(
-                    np.array([[avg_color]], dtype=np.uint8), cv2.COLOR_RGB2HSV
-                )[0][0]
-                hue_val, sat_val, val_val = int(hsv[0]), int(hsv[1]), int(hsv[2])
-                if val_val < 60:
-                    color = "dark_brown"
-                elif sat_val < 30:
-                    color = "gray"
-                elif hue_val > 100 and hue_val < 130 and sat_val > 50:
-                    color = "blue"
-                elif hue_val > 35 and hue_val < 85 and sat_val > 40:
-                    color = "green"
-                elif (hue_val > 15 and hue_val < 35) and sat_val > 40:
-                    color = "hazel"
-                elif val_val > 120 and sat_val > 60:
-                    color = "amber"
-                else:
-                    color = "brown"
-                results[f"{eye_name}_eye_color"] = color
-            # Consensus
-            if "left_eye_color" in results and "right_eye_color" in results:
-                if results["left_eye_color"] == results["right_eye_color"]:
-                    results["eye_color"] = results["left_eye_color"]
-                else:
-                    results["eye_color"] = results["left_eye_color"]  # Use left as primary
-                    results["heterochromia"] = True
-        return results
-```
-#### face-service/Dockerfile
-```dockerfile
-FROM python:3.11-slim
-WORKDIR /app
-# Install system dependencies for OpenCV
-RUN apt-get update && apt-get install -y \
-    libgl1-mesa-glx \
-    libglib2.0-0 \
-    curl \
-    && rm -rf /var/lib/apt/lists/*
-COPY requirements.txt .
-RUN pip install --no-cache-dir -r requirements.txt
-COPY . .
-# Download MediaPipe model at build time
-RUN python -c "from analyzers.landmark_analyzer import LandmarkAnalyzer; LandmarkAnalyzer()"
-EXPOSE 8000
-CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
-```
-### Connect Your Next.js App to the Microservice
-#### lib/face-analysis/api-client.ts
-```typescript
-/**
- * Client for the Python face analysis microservice.
- * Replaces the Supabase Edge Function that called AWS Rekognition.
- */
-const FACE_SERVICE_URL = process.env.NEXT_PUBLIC_FACE_SERVICE_URL || "http://localhost:8000";
-export interface FaceAnalysisResult {
-  // Geometric (MediaPipe)
-  face_shape: string;
-  face_shape_metrics: Record<string, number>;
-  forehead_width: string;
-  jawline_type: string;
-  jawline_angle: number;
-  chin_type: string;
-  cheekbone_prominence: string;
-  cheek_fullness: string;
-  eye_shape: string;
-  eye_depth: string;
-  eye_spacing: string;
-  eye_size: string;
-  eyebrow_shape: string;
-  eyebrow_arch_height: string;
-  eyebrow_thickness: string;
-  possible_unibrow: boolean;
-  nose_shape: string;
-  nose_bridge: string;
-  nose_tip_shape: string;
-  nostril_width: string;
-  lip_fullness: string;
-  lip_balance: string;
-  mouth_width: string;
-  cupids_bow: string;
-  smiling: boolean;
-  smile_asymmetry: number;
-  possible_dimples: boolean;
-  facial_asymmetry_score: number;
-  // Demographics (FairFace)
-  age_estimate: number;
-  age_range: string;
-  age_confidence: number;
-  gender: string;
-  gender_confidence: number;
-  race: string;
-  race_confidence: number;
-  race_probabilities: Record<string, number>;
-  // CelebA Attributes
-  facial_hair: string;
-  wearing_glasses: boolean;
-  bald: boolean;
-  receding_hairline: boolean;
-  hair_color_celeba: string;
-  hair_type_celeba: string;
-  bags_under_eyes: boolean;
-  double_chin: boolean;
-  bushy_eyebrows: boolean;
-  high_cheekbones_celeba: boolean;
-  // Emotion (HSEmotion)
-  emotion: string;
-  emotion_confidence: number;
-  emotion_probabilities: Record<string, number>;
-  // Color Analysis
-  skin_tone: string;
-  skin_tone_score: number;
-  fitzpatrick_type: string;
-  eye_color: string;
-  hair_color_detected: string;
-  hair_dominant_rgb: number[];
-  hair_texture_detected: string;
-  // Parsing
-  hair_length_estimate: string;
-  forehead_wrinkles: string;
-  possible_freckles_moles: boolean;
-  dark_spot_count: number;
-  // Blendshapes
-  blendshapes: Record<string, number>;
-}
-export async function analyzeFace(imageFile: File): Promise<FaceAnalysisResult> {
-  const formData = new FormData();
-  formData.append("file", imageFile);
-  const response = await fetch(`${FACE_SERVICE_URL}/analyze`, {
-    method: "POST",
-    body: formData,
-  });
-  if (!response.ok) {
-    const error = await response.json().catch(() => ({ detail: "Unknown error" }));
-    throw new Error(`Face analysis failed: ${error.detail}`);
-  }
-  const result = await response.json();
-  if (!result.success) {
-    throw new Error("Face analysis returned unsuccessful result");
-  }
-  return result.data;
-}
-export async function checkServiceHealth(): Promise<boolean> {
-  try {
-    const response = await fetch(`${FACE_SERVICE_URL}/health`);
-    return response.ok;
-  } catch {
-    return false;
-  }
-}
-```
-### Deploy to Hugging Face Spaces (Free)
-Create a `README.md` in the `face-service/` directory with the following frontmatter:
-```yaml
----
-title: HCP Face Analysis
-emoji: 🔍
-colorFrom: blue
-colorTo: purple
-sdk: docker
-app_port: 8000
----
-```
----
-## Final Architecture Summary
-```
-Browser (Next.js)
-    │
-    │  POST /analyze (image file)
-    ▼
-Hugging Face Spaces (FREE, 2GB RAM)
-    ├── FastAPI Server
-    ├── MediaPipe (4MB) ──────► 478 landmarks → ~40 geometric features
-    ├── FairFace (90MB) ──────► age, gender, race
-    ├── CelebA ResNet (44MB) ─► 40 binary attributes (hair, beard, glasses...)
-    ├── BiSeNet (50MB) ───────► face parsing → hair/skin segmentation
-    ├── HSEmotion (20MB) ─────► 8 emotions
-    └── Color Analysis ───────► skin tone, eye color, hair color
-    │
-    │  JSON response (~150 attributes)
-    ▼
-Supabase (existing)
-    ├── Store results in PostgreSQL
-    └── Auth / Storage unchanged
-```
-| Metric | Value |
-|--------|-------|
-| **Total models** | ~210MB |
-| **Features detected** | **~95% of the full feature list** |
-| **Hosting cost** | **$0** (HF Spaces free tier) |
-| **Latency** | ~2-4s per image (CPU) |
-| **Languages** | Python (microservice) + TypeScript (existing Next.js) |
-| **Only missing** | Teeth analysis, scar detection, Adam's apple (require specialized fine-tuned models) |
----
-## Required Feature List
-### Face shape
-- Oval face, Round face, Square face, Heart-shaped face, Diamond face, Long/oblong face, Triangle face
-- Jawline sharp, Jawline soft, Strong jaw, Receding chin, Pointed chin, Cleft chin, Wide chin
-- High cheekbones, Flat cheekbones, Full cheeks, Hollow cheeks
-- Broad forehead, Narrow forehead
-### Eye shape
-- Almond, Round, Hooded, Monolid, Deep-set eyes, Protruding eyes
-- Upturned eyes, Downturned eyes, Wide-set eyes, Close-set eyes, Large eyes, Small eyes
-- Eye color: brown, blue, green, hazel
-- Dark under-eyes, Eye bags, Crow's feet
-### Eyebrows
-- Thick, Thin, Arched, Straight, Bushy, Unibrow
-- High eyebrow arch, Low eyebrow arch
-### Nose
-- Straight, Aquiline, Button, Upturned, Wide, Narrow
-- Flat bridge, High bridge, Wide nostrils, Narrow nostrils
-- Rounded tip, Pointed tip
-### Lips & Mouth
-- Full, Thin, Wide mouth, Small mouth
-- Defined cupid's bow, Uneven lips
-- Gap teeth, Crooked teeth, Straight teeth, Overbite, Underbite
-- Dimples, Smile lines, Asymmetrical smile
-### Hair
-- Straight, Wavy, Curly, Coily
-- Short, Long, Bald, Receding hairline, Widow's peak
-- Thick, Thin
-- Color: black, brown, blonde, red, gray, dyed
-### Facial hair
-- Full beard, Stubble, Goatee, Mustache, Clean-shaven, Sideburns
-### Skin & Other
-- Skin tone: light, medium, dark
-- Freckles, Moles, Birthmark, Scar, Acne
-- Wrinkles, Forehead lines, Smile lines
-- Facial asymmetry, Prominent Adam's apple

+# HCP Face Analysis — Architecture
+## Pipeline
+A single photo is fed through seven analyzers. Their outputs are merged
+into one dictionary; later analyzers overwrite any colliding keys from
+earlier ones.
+```
+Photo (RGB ndarray)
+  │
+  ├─► [1] MediaPipe Face Landmarker
+  │       478 landmarks + 52 blendshapes
+  │       → all geometric features (face/eye/nose/eyebrow/lip/jaw shape),
+  │         smiling (mouthSmile blendshapes), eyes_open, possible_dimples,
+  │         possible_unibrow, facial_asymmetry_score, blendshapes dict
+  │
+  ├─► [2] FairFace + Ethnicity ViT (DemographicAnalyzer)
+  │       → age_range, age_estimate (softmax-weighted continuous), age_confidence,
+  │         gender + confidence, ethnicity + confidence, full distributions
+  │
+  ├─► [3] SegFormer-B5 human parsing (ParsingAnalyzer)
+  │       → per-class pixel masks (face, hair, hat, …)
+  │       → hair_length, hair_present, hat_detected,
+  │         wrinkle_level, skin_texture_score, skin_uniformity, freckles_or_moles
+  │       (the skin metrics are OpenCV statistics computed over the SegFormer face mask)
+  │
+  ├─► [4] HSEmotion EfficientNet-B0 (EmotionAnalyzer)
+  │       → primary/secondary emotion, emotion_scores (8 classes),
+  │         valence, arousal, mood
+  │
+  ├─► [5] ColorAnalyzer (no ML — OpenCV LAB/HSV)
+  │       inputs: SegFormer skin/hair masks + MediaPipe landmarks
+  │       → skin_tone (Fitzpatrick + L*/a*/b* + hex), skin_undertone,
+  │         eye_color, hair_color (name + hex), hair_texture (pixel-Laplacian, coarse),
+  │         lip_color (shade + hex)  ← lip mask built from MediaPipe outer-minus-inner lip
+  │
+  ├─► [6] ObstructionViT — dima806/face_obstruction_image_detection
+  │       → wearing_glasses, wearing_sunglasses, wearing_mask,
+  │         obstruction_top, obstruction_scores
+  │
+  └─► [7] HairTypeViT — dima806/hair_type_image_detection
+          → hair_type (curly/dreadlocks/kinky/straight/wavy),
+            hair_type_confidence, hair_type_scores
+```
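The "softmax-weighted continuous" age estimate in step [2] can be sketched as an expected value over the age buckets. This is a minimal illustration, not the code in `analyzers/demographic_analyzer.py`: the bucket labels follow FairFace's nine age classes, while the midpoints, the function names, and the rounding are assumptions made for the example.

```python
import math

# FairFace's nine age buckets. The midpoints are illustrative assumptions,
# not values taken from this repository.
AGE_BINS = ["0-2", "3-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"]
AGE_MIDPOINTS = [1.0, 6.0, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 75.0]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_age(logits):
    """Turn per-bucket logits into a range, a continuous estimate, and a confidence."""
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    # Expected age: each bucket midpoint weighted by its softmax probability.
    expected = sum(p * mid for p, mid in zip(probs, AGE_MIDPOINTS))
    return {
        "age_range": AGE_BINS[top],
        "age_estimate": round(expected, 1),
        "age_confidence": round(probs[top], 3),
    }
```

With this scheme a confident prediction collapses to the midpoint of the winning bucket, while an uncertain one lands between buckets, which is why the continuous estimate can sit outside the reported `age_range`.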
+All masks and other internal fields use a leading underscore in the key
+(e.g. `_skin_mask`). `app.py` strips those before returning JSON so the
+client never sees them.
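A minimal sketch of that stripping step, assuming the merged results are a plain dict. `strip_private` is a hypothetical helper name; the actual code in `app.py` may be structured differently.

```python
import json


def strip_private(results):
    """Return a copy of `results` without underscore-prefixed (internal) keys."""
    return {key: value for key, value in results.items() if not key.startswith("_")}


# Masks are arrays in the real service; a placeholder object stands in here
# to show that the stripped dict is JSON-serializable while the raw one is not.
merged = {"_skin_mask": object(), "_hair_mask": object(), "hair_length": "short"}
public = strip_private(merged)
payload = json.dumps(public)
```

Filtering on the key prefix keeps the convention cheap to follow: an analyzer can hand masks to later stages just by naming the key with a leading underscore.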
+## Attribute → source map
+The EditProfileScreen renders only fields backed by one of these
+analyzers. Anything previously fed by the FaRL zero-shot classifier
+has been removed because its outputs were too noisy to trust.
+| Section | Field(s) | Source |
+|---|---|---|
+| Demographics | gender, age (continuous), age_range, ethnicity, distributions | FairFace + Ethnicity ViT |
+| Emotion | primary/secondary emotion, scores, valence, arousal, mood | HSEmotion |
+| Face Structure | face_shape (+ 4 ratios), jawline_type/angle, chin_type, cheekbone_prominence, cheek_fullness, forehead_width, facial_asymmetry_score | MediaPipe |
+| Hair | hair_length, hair_present | SegFormer |
+| Hair | hair_type (+ confidence) | HairTypeViT |
+| Hair | hair_color, hair hex | ColorAnalyzer |
+| Eyes | eye_shape, eye_depth, eye_spacing, eye_size, eyes_open | MediaPipe |
+| Eyes | eye_color | ColorAnalyzer |
+| Eyebrows | eyebrow_shape, eyebrow_arch_height, eyebrow_thickness, possible_unibrow | MediaPipe |
+| Nose | nose_shape, nose_bridge, nose_tip_shape, nostril_width | MediaPipe |
+| Lips & Mouth | lip_fullness, lip_balance, mouth_width, cupids_bow, smile_asymmetry, possible_dimples, smiling, mouth_open | MediaPipe (last two via blendshapes) |
+| Lips & Mouth | lip_color (shade + hex) | ColorAnalyzer (mask from MediaPipe) |
+| Skin | skin_tone (Fitzpatrick, L*/a*/b*, hex), skin_undertone | ColorAnalyzer |
+| Skin | wrinkle_level, skin_texture_score, skin_uniformity, freckles_or_moles | SegFormer mask + OpenCV stats |
+| Accessories | wearing_glasses, wearing_sunglasses, wearing_mask | ObstructionViT |
+| Accessories | wearing_hat | SegFormer (hat class coverage) |
+## Deployment
+The service is built as a Docker image targeting the Hugging Face Spaces
+free tier (2GB RAM, shared CPU). The MediaPipe `.task` model file is
+downloaded at build time; all Hugging Face models are downloaded lazily on
+first inference and cached under `/root/.cache/huggingface` inside the
+container.
+The Node/Express server forwards `/analyze-face` requests to
+`FACE_SERVICE_URL/analyze-base64`. The React Native client never talks
+to this service directly.
+## Adding a new analyzer
+1. Drop a new module under `analyzers/` exposing a class with
+   `__init__()` and `analyze(img_rgb) -> dict`.
+2. Import it in `app.py`, add a global slot and a lazy-load block in
+   `get_analyzers()`, and append a `results.update(...)` call to both
+   `/analyze` and `/analyze-base64`.
+3. Surface the new keys in `client/src/screens/EditProfileScreen.js`
+   and add a legend row in the "Analysis Method Details" section.
+Order matters: later analyzers overwrite earlier keys on collision.
+The specialized ViT classifiers run last so they win over any coarser
+signal.
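The lazy-load and ordered-merge behaviour described above can be sketched as follows. The two analyzer classes and their return values are hypothetical stand-ins, and `run_pipeline` is an assumed name; only `get_analyzers()`, the `analyze(img_rgb) -> dict` interface, and the `results.update(...)` pattern come from the text.

```python
from typing import Any, Optional


class CoarseHairAnalyzer:
    """Hypothetical early analyzer that emits a coarse hair_type guess."""

    def analyze(self, img_rgb) -> dict[str, Any]:
        return {"hair_type": "wavy", "hair_length": "short"}


class HairTypeAnalyzer:
    """Hypothetical specialized analyzer that runs last and wins collisions."""

    def analyze(self, img_rgb) -> dict[str, Any]:
        return {"hair_type": "curly", "hair_type_confidence": 0.91}


_analyzers: Optional[list] = None  # global slot, filled on first use


def get_analyzers() -> list:
    """Build the analyzers once and reuse them for every later request."""
    global _analyzers
    if _analyzers is None:
        # List order is pipeline order: later entries overwrite earlier keys.
        _analyzers = [CoarseHairAnalyzer(), HairTypeAnalyzer()]
    return _analyzers


def run_pipeline(img_rgb) -> dict[str, Any]:
    """Run every analyzer in order and merge the outputs into one dict."""
    results: dict[str, Any] = {}
    for analyzer in get_analyzers():
        results.update(analyzer.analyze(img_rgb))
    return results
```

Because `dict.update` silently replaces existing keys, a new analyzer that reuses an existing key name changes the output for that field without any error, so its position in the list deserves a deliberate choice.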

requirements.txt CHANGED

@@ -13,6 +13,3 @@ timm==1.0.3
 safetensors>=0.6.0
 transformers==4.45.2
 hsemotion>=0.2.2
-openai-clip==1.0.1
-ftfy
-regex

 safetensors>=0.6.0
 transformers==4.45.2
 hsemotion>=0.2.2