Limitations & Technical Notes

Anti-spoofing depends on texture analysis, so input quality matters. Notes from testing:

1. Environmental Constraints

  • Lighting matters: Fourier Transform patterns need decent lighting. Low-light or harsh backlight = noise, which causes misclassification.
  • What works: Even lighting on the face. Bright windows behind the subject = bad.

2. Input & Preprocessing Requirements

The model was trained on a specific preprocessing pipeline, so inputs need to match it.

  • 1.5x padding: Tight crops (just eyes/forehead) lose context. The padding gives enough "head space" to see 3D structure vs flat screens/paper.
  • Resolution: Resizes to 128x128, but source face should be 64x64 minimum. Upscaling tiny blurry faces loses the spoofing artifacts (screen pixels, print dots).
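
The 64x64 minimum can be enforced with a simple gate on the detector's bounding box before any upscaling happens. A sketch, with the helper name and default chosen for illustration:

```python
def face_large_enough(bbox, min_side=64):
    """Skip anti-spoofing on faces below the minimum source resolution.

    Upscaling a tiny face to 128x128 interpolates away the very artifacts
    (screen pixels, print dots) the model relies on, so it's better to
    reject the detection than to classify it.
    """
    x, y, w, h = bbox
    return min(w, h) >= min_side
```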

3. Pose & Occlusion

  • Angles: Best with frontal views (+/-30 deg yaw/pitch). Profile views drop accuracy.
  • Obstructions: Masks, hands over face, thick glasses with reflections mess with texture extraction.
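
If an upstream pose estimator supplies yaw/pitch angles (this sketch assumes one exists; the function and limits are illustrative), out-of-range frames can be skipped rather than misclassified:

```python
def pose_acceptable(yaw_deg, pitch_deg, max_angle=30.0):
    """Gate on head pose; angles are assumed to come from an upstream estimator."""
    return abs(yaw_deg) <= max_angle and abs(pitch_deg) <= max_angle
```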

4. Known Edge Cases

  • Attack types: Good at printed photos and screens. Not trained on 3D silicone masks or prosthetics.
  • Motion blur: Fast movement smears textures. For video, use temporal filtering (require 3-5 consecutive "Real" frames).

5. Security Tuning

The default threshold is balanced for general use (FPR < 2%). Security is a trade-off:

  • High-security: If you can't afford any spoofs getting through, raise the threshold (e.g., 0.5 → 0.8). Expect more false rejects for legitimate users.
  • High-convenience: If you want a smoother experience and can tolerate slightly more risk, lower the threshold; more borderline spoofs will slip through.
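
The trade-off can be made explicit in the decision step. A sketch (the helper name is illustrative; 0.5 and 0.8 are the example thresholds above):

```python
def classify(real_score, threshold=0.5):
    """Map the model's 'real' probability to a decision at a given threshold."""
    return "Real" if real_score >= threshold else "Spoof"

# The same score lands differently under different security postures:
score = 0.65
balanced = classify(score, threshold=0.5)  # "Real"
strict = classify(score, threshold=0.8)    # "Spoof"
```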

Implementation Example

The 1.5x padding is handled as follows when cropping faces:

import cv2
import numpy as np

def crop_face_with_padding(image, bbox, padding_factor=1.5):
    """
    Crop face with proper padding for anti-spoofing model.
    
    Args:
        image: Input image (numpy array)
        bbox: Bounding box as (x, y, w, h)
        padding_factor: Padding multiplier (default: 1.5)
    
    Returns:
        Cropped face image
    """
    x, y, w, h = bbox
    
    # Calculate center and expanded dimensions
    center_x = x + w / 2
    center_y = y + h / 2
    max_dim = max(w, h)
    new_size = int(max_dim * padding_factor)
    
    # Calculate new bounding box
    x1 = int(center_x - new_size / 2)
    y1 = int(center_y - new_size / 2)
    x2 = x1 + new_size
    y2 = y1 + new_size
    
    # Clamp to image boundaries
    h_img, w_img = image.shape[:2]
    x1 = max(0, x1)
    y1 = max(0, y1)
    x2 = min(w_img, x2)
    y2 = min(h_img, y2)
    
    # Crop and resize to 128x128 (the crop may be non-square at image edges)
    face_crop = image[y1:y2, x1:x2]
    if face_crop.size == 0:
        raise ValueError("Bounding box lies entirely outside the image")
    face_resized = cv2.resize(face_crop, (128, 128), interpolation=cv2.INTER_LANCZOS4)
    
    return face_resized

For temporal filtering in video streams:

from collections import deque

class TemporalFilter:
    """Require N consecutive 'real' predictions before accepting."""
    
    def __init__(self, required_frames=3):
        self.required_frames = required_frames
        self.history = deque(maxlen=required_frames)
    
    def update(self, is_real: bool) -> bool:
        """Update filter and return final decision."""
        self.history.append(is_real)
        
        if len(self.history) < self.required_frames:
            return False  # Not enough frames yet
        
        return all(self.history)  # All must be 'real'