# Comparative Analysis: Biometric Models for Face Attendance

This document evaluates the biometric models currently used in the [SecureAttend AI](file:///c:/Users/katiy/Downloads/FaceDetection/README.md) system against the alternative methods mentioned in your request, focusing on their suitability for lightweight CPU-only hardware.

---

## 1. Executive Summary

The current system implements **YuNet** (for detection) and **SFace** (for recognition), both of which ship with OpenCV and run on its DNN module. **None of the alternative models mentioned in your request are currently in use.**

A live benchmark executed on your local machine shows that the current YuNet + SFace stack delivers real-time performance on a standard CPU:

* **Face Detection (YuNet):** **19.53 ms** per frame (~51.2 FPS)
* **Face Recognition (SFace):** **11.50 ms** per face crop (~87.0 FPS)
* **Embedding Comparison:** **0.0079 ms** per match

Based on these results and the architectural trade-offs, **retaining the current YuNet + SFace stack is the recommended path**: it requires no extra library dependencies, and detection plus recognition of a single face (~31 ms combined) fits within the 33.3 ms frame budget for 30 FPS.

---

## 2. Baseline Benchmark Results (Local CPU)

We ran a diagnostic benchmark (`benchmark_current.py`) directly on your machine's CPU to measure the baseline speed of the current implementation:

| Pipeline Step | Latency (ms) | Throughput | Resource Footprint |
| :--- | :--- | :--- | :--- |
| **YuNet Face Detection** | 19.53 ms | 51.2 FPS | **232 KB** model file |
| **SFace Feature Extraction** | 11.50 ms | 87.0 FPS | **38.6 MB** model file |
| **Cosine Similarity Match** | 0.0079 ms | ~126,500 matches/sec | Virtually zero |

> [!NOTE]
> These numbers show that the core AI processing (detection + recognition) takes **~31 ms** combined when one face is in the frame. This fits within a single frame budget (33.3 ms for 30 FPS) even without frame-skipping. Recognition is measured per face crop, so each additional face adds roughly another 11.5 ms.

---

## 3. Face Detection Model Comparison

The table below compares the active detector (**YuNet**) against the requested alternatives. Only YuNet was benchmarked locally; the latency figures for the alternatives are approximate estimates.

| Model | Size | CPU Latency (640x480) | Landmarks | Library Dependency | Architectural Suitability |
| :--- | :--- | :--- | :--- | :--- | :--- |
| **YuNet** *(Current)* | **~232 KB** | **~19.5 ms** | ✅ Yes (5 points) | None (OpenCV Native) | **Excellent (Recommended).** Extremely fast, lightweight, and specifically designed for real-time edge CPU workloads. |
| **Qualcomm LWFD** | ~3.4 MB | ~30 - 50 ms | ❌ No | Qualcomm QNN / AI Hub | **Poor.** Vendor-locked. Highly optimized for Snapdragon NPUs/DSPs, but runs slower on generic x86/ARM CPUs. Lacks landmarks. |
| **BlazeFace** | ~100-200 KB | ~10 - 15 ms | ✅ Yes (6 points) | Google MediaPipe | **Moderate.** Excellent at close-up phone/selfie range, but accuracy drops for distant or multi-person detections. |
| **RetinaFace** | ~1.7 MB (MobileNet) to ~104 MB (ResNet) | ~60 - 500+ ms | ✅ Yes (5 points) | PyTorch / InsightFace | **Poor.** Highly accurate, but far too heavy for real-time CPU deployment. Leads to laggy frame rates. |
| **YOLOv8-face** | ~6 MB (nano) | ~40 - 80 ms | ✅ Yes (5 points) | PyTorch / Ultralytics | **Moderate.** Strong multi-face detection, but carries a heavy PyTorch dependency and higher CPU latency. |

### Why YuNet is best for our case:

1. **Landmarks & Alignment:** YuNet outputs 5 facial landmarks natively, which SFace requires to crop and align the face. Without landmarks, we cannot feed aligned inputs to the recognition engine.
2. **Zero Overhead:** YuNet ships with OpenCV (`cv2.FaceDetectorYN`) and runs on its C++ DNN backend, so it avoids loading heavy deep-learning frameworks (such as PyTorch or TensorFlow) that consume large amounts of RAM.

---

## 4. Face Recognition Model Comparison

The table below compares the active vectorizer (**SFace**) against the requested alternatives. Only SFace was benchmarked locally; the latency figures for the alternatives are approximate estimates.

| Model | Size | CPU Latency | Embedding Dim | Library Dependency | Threshold Metric | Architectural Suitability |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| **SFace** *(Current)* | **~38.6 MB** | **~11.5 ms** | 128-d | None (OpenCV Native) | Cosine (0.363) | **Excellent (Recommended).** Takes 112x112 crops aligned from YuNet's landmarks. High speed-to-accuracy ratio. |
| **MobileFaceNet** | ~4.0 MB | ~15 - 25 ms | 128-d | ONNX Runtime / TF Lite | Cosine (0.40) | **Good.** Extreme storage efficiency. Great if disk/memory space is highly constrained (e.g. microcontrollers). |
| **ArcFace (R50)** | ~100+ MB | ~120 - 200 ms | 512-d | PyTorch / InsightFace | Cosine (0.65) | **Poor (for CPU).** SOTA accuracy, but the ResNet-50 backbone is too heavy for live 30 FPS CPU matching. |
| **FaceNet** | ~90 MB | ~100 - 150 ms | 128/512-d | PyTorch / TensorFlow | Euclidean (1.1) | **Poor.** Legacy Inception-ResNet architecture. Highly resource-intensive and slow on CPU. |
| **Dlib (ResNet-34)** | ~22 MB | ~100 - 150 ms | 128-d | dlib (C++ build tools) | Euclidean (0.6) | **Very Poor.** Difficult to install on Windows (requires CMake and C++ compiler setup). Sluggish CPU performance. |

### Why SFace is best for our case:

1. **Tight Integration:** SFace's alignment step uses the face boxes and 5-point landmarks produced by YuNet to generate its 112x112 input crops, so the two models operate as a cohesive dual-stage pipeline inside [face_engine.py](file:///c:/Users/katiy/Downloads/FaceDetection/backend/face_engine.py).
2. **Inference Speed:** At 11.50 ms per face crop, it is fast enough for rapid check-in/out kiosk streams.

---

## 5. Summary Matrix & Recommendation

```mermaid
graph TD
    A[Biometric System Architecture] --> B{Hardware Constraints}
    B -->|Lightweight CPU / Mini PC| C[YuNet + SFace Stack]
    B -->|Heavy GPU Server / SOTA Accuracy| D[RetinaFace + ArcFace R50 Stack]
    B -->|Ultra-low Memory <10MB RAM| E[BlazeFace + MobileFaceNet Stack]
    style C fill:#2ecc71,stroke:#27ae60,stroke-width:2px,color:#fff
    style D fill:#e74c3c,stroke:#c0392b,stroke-width:1px,color:#fff
    style E fill:#f39c12,stroke:#d35400,stroke-width:1px,color:#fff
```

### Final Verdict: Keep YuNet + SFace

For your local webcam attendance system running on lightweight hardware:

* **Qualcomm LWFD** is rejected due to vendor lock-in and the lack of landmark output.
* **RetinaFace, YOLOv8-face, ArcFace, FaceNet, and Dlib** are rejected due to higher CPU latency (roughly 40 ms to 500+ ms in the tables above, versus ~19.5 ms for YuNet and ~11.5 ms for SFace) and heavy dependencies (PyTorch/TensorFlow, or a C++ build toolchain for Dlib).
* **BlazeFace and MobileFaceNet** are viable alternatives if you need to run on extremely low-spec microcontrollers, but they offer a shorter face detection range and require extra runtime libraries (MediaPipe/ONNX Runtime), which adds dependencies to an otherwise OpenCV-only codebase.

**Recommendation:** Retain YuNet and SFace. Instead of replacing them, apply the quick optimizations highlighted in your [ARCHITECTURE.md](file:///c:/Users/katiy/Downloads/FaceDetection/ARCHITECTURE.md#8-whats-already-done-vs-what-can-be-improved) (e.g. **Frame Skipping** and **Downscaled Detection**) to drop CPU usage by up to 50% without altering the model pipeline.
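
The frame skipping and downscaled detection optimizations named in the recommendation could look roughly like the sketch below. It is not taken from ARCHITECTURE.md or `face_engine.py`: `SkippingDetector`, `scale_boxes`, `stride`, and `factor` are hypothetical names chosen for illustration, and `detect` and `downscale` are injected callables standing in for a YuNet wrapper and a resize call, so the logic can be read and run without OpenCV.

```python
from typing import Any, Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height) in pixels


def scale_boxes(boxes: Sequence[Box], factor: float) -> List[Box]:
    """Map boxes found on a downscaled frame back to full-resolution coordinates."""
    inv = 1.0 / factor
    return [(x * inv, y * inv, w * inv, h * inv) for x, y, w, h in boxes]


class SkippingDetector:
    """Run the detector on every `stride`-th frame and reuse the last result in between.

    Hypothetical helper: `detect` is assumed to return a list of boxes for a
    frame, and `downscale(frame, factor)` is assumed to return a resized frame.
    """

    def __init__(
        self,
        detect: Callable[[Any], Sequence[Box]],
        downscale: Callable[[Any, float], Any],
        stride: int = 2,
        factor: float = 0.5,
    ) -> None:
        if stride < 1:
            raise ValueError("stride must be >= 1")
        if not 0.0 < factor <= 1.0:
            raise ValueError("factor must be in (0, 1]")
        self.detect = detect
        self.downscale = downscale
        self.stride = stride
        self.factor = factor
        self.detector_calls = 0  # how many frames actually hit the detector
        self._index = 0
        self._last: List[Box] = []

    def process(self, frame: Any) -> List[Box]:
        """Return face boxes in full-resolution coordinates for this frame."""
        if self._index % self.stride == 0:
            small = self.downscale(frame, self.factor)
            self._last = scale_boxes(self.detect(small), self.factor)
            self.detector_calls += 1
        self._index += 1
        return self._last


# Stand-in callables so the sketch runs without a camera or model file.
def fake_detect(frame: Any) -> List[Box]:
    return [(10.0, 20.0, 30.0, 40.0)]


def fake_downscale(frame: Any, factor: float) -> Any:
    return frame


demo = SkippingDetector(fake_detect, fake_downscale, stride=2, factor=0.5)
for frame_id in range(6):
    demo.process(frame_id)
print("detector calls for 6 frames:", demo.detector_calls)
```

With `stride=2` the detector runs on half of the frames, which is where a saving of that order would come from; the actual CPU reduction depends on the rest of the pipeline. In a real integration, landmark coordinates would need the same rescaling as the boxes before they are passed to alignment.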