AI Face Detector: Complete Documentation
Overview
This project builds a real-time AI face detection system using deep learning. The detector works on images, video files, and webcam streams, and it can be deployed as an API or web app.
Core features:
- Detect faces in images, video, and live camera
- Draw bounding boxes with confidence score
- Detect multiple faces simultaneously
- GPU acceleration support
- REST API ready
- Easy deployment (local / cloud)
Tech Stack
- Python 3.10+
- OpenCV
- Ultralytics (YOLOv8 runtime)
- PyTorch
- TorchVision
- NumPy
- FastAPI (optional for API)
- Streamlit (optional UI)
Model used:
- YOLOv8-Face (used throughout this guide); RetinaFace is an alternative
Project Structure
ai-face-detector/
│
├── models/
│   └── yolov8n-face.pt
│
├── src/
│   ├── detector.py
│   ├── webcam.py
│   ├── image_infer.py
│   ├── video_infer.py
│   └── api.py
│
├── requirements.txt
└── README.md
Installation
Create environment:
python -m venv venv
venv\Scripts\activate (Windows)
source venv/bin/activate (Linux/Mac)
Install dependencies (python-multipart is required by FastAPI for file uploads):
pip install ultralytics opencv-python numpy fastapi uvicorn pillow python-multipart
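Optionally check the install with the general-purpose YOLOv8 nano weights (trained on COCO, not face-specific):
yolo task=detect mode=predict model=yolov8n.pt
Then download face-trained weights:
https://github.com/akanametov/yolo-face/releases
Place the downloaded file in the models/ directory as yolov8n-face.pt, the path detector.py expects.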
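Before running any of the scripts, it can help to confirm that the weights file is where detector.py will look for it. The sketch below is an illustrative helper, not part of the project: the name check_weights and the min_bytes threshold are assumptions chosen only to catch a missing or truncated download.

```python
from pathlib import Path


def check_weights(path="models/yolov8n-face.pt", min_bytes=1_000_000):
    """Return (ok, message) describing whether the weights file looks usable.

    min_bytes is an arbitrary sanity threshold (an assumption), meant only
    to catch empty or truncated downloads.
    """
    p = Path(path)
    if not p.is_file():
        return False, f"missing: {p}"
    size = p.stat().st_size
    if size < min_bytes:
        return False, f"suspiciously small ({size} bytes): {p}"
    return True, f"ok ({size} bytes): {p}"
```

Calling check_weights() from the project root and printing the message gives a quick sanity check before the first run.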
Core Face Detection Engine
Create detector.py
from ultralytics import YOLO
import cv2


class FaceDetector:
    def __init__(self, model_path="models/yolov8n-face.pt"):
        # Load the face-trained YOLOv8 weights once and reuse them
        self.model = YOLO(model_path)

    def detect(self, frame):
        # Run inference on one BGR frame; drop detections below 0.4 confidence
        results = self.model(frame, conf=0.4)[0]
        faces = []
        for box in results.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            conf = float(box.conf[0])
            faces.append((x1, y1, x2, y2, conf))
        return faces

    def draw_faces(self, frame, faces):
        # Draw a green box and confidence label per face (modifies frame in place)
        for (x1, y1, x2, y2, conf) in faces:
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, f"{conf:.2f}", (x1, y1 - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        return frame
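One detail worth noting in draw_faces: the confidence label is placed at (x1, y1 - 5), so for a face touching the top of the frame the text lands outside the image. A minimal sketch of one way to handle this is shown below. The helper name label_origin and the text_height estimate are assumptions for illustration, not part of the project.

```python
def label_origin(x1, y1, frame_height, offset=5, text_height=15):
    """Pick a text baseline for a box label that stays inside the frame.

    Prefers a position just above the box; when there is no room above,
    falls back to a position just below the box's top edge.
    text_height is a rough pixel estimate for the font scale in use
    (an assumption, not a measured value).
    """
    y = y1 - offset
    if y - text_height < 0:
        # Not enough room above the box: draw inside it instead
        y = min(y1 + text_height + offset, frame_height - 1)
    return x1, y
```

In draw_faces, the result could be passed to cv2.putText in place of (x1, y1 - 5), using frame.shape[0] as frame_height.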
Image Detection Script
image_infer.py
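import cv2
from detector import FaceDetector

detector = FaceDetector()

img = cv2.imread("test.jpg")
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise SystemExit("Could not read test.jpg")

faces = detector.detect(img)
output = detector.draw_faces(img, faces)

cv2.imshow("Faces", output)
cv2.waitKey(0)  # wait for any key press
cv2.destroyAllWindows()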
Run:
python src/image_infer.py
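The script above hard-codes "test.jpg". If the image path should come from the command line instead, one possible approach is a small argparse wrapper. This is a sketch only: the option names --output and --model are hypothetical and not defined anywhere in the project.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options for a hypothetical image_infer.py variant."""
    parser = argparse.ArgumentParser(description="Detect faces in one image")
    parser.add_argument("image", help="path to the input image")
    parser.add_argument("--output", default=None,
                        help="optional path to save the annotated image")
    parser.add_argument("--model", default="models/yolov8n-face.pt",
                        help="path to the face-detection weights")
    return parser.parse_args(argv)
```

Inside image_infer.py, args.image would replace the hard-coded file name, args.model could be passed to FaceDetector, and cv2.imwrite(args.output, output) could save the result when --output is given.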
Webcam Real-Time Detection
webcam.py
import cv2
from detector import FaceDetector

detector = FaceDetector()
cap = cv2.VideoCapture(0)  # 0 = default camera

while True:
    ret, frame = cap.read()
    if not ret:
        # No frame available (camera missing or disconnected)
        break

    faces = detector.detect(frame)
    frame = detector.draw_faces(frame, faces)
    cv2.imshow("AI Face Detector", frame)

    # Exit when Esc (key code 27) is pressed
    if cv2.waitKey(1) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()
Run:
python src/webcam.py
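To see what frame rate the webcam loop actually reaches on a given machine, the loop can be timed. The following is a minimal sketch of an exponentially smoothed FPS counter. The class name FpsMeter and the smoothing default are assumptions, and nothing like it exists in the project as written.

```python
import time


class FpsMeter:
    """Exponentially smoothed frames-per-second estimate."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing  # closer to 1.0 = steadier, slower to react
        self._last = None
        self.fps = 0.0

    def tick(self, now=None):
        """Record one frame and return the current FPS estimate."""
        now = time.perf_counter() if now is None else now
        if self._last is not None:
            dt = now - self._last
            if dt > 0:
                instant = 1.0 / dt
                if self.fps == 0.0:
                    self.fps = instant  # first measurement seeds the average
                else:
                    self.fps = (self.smoothing * self.fps
                                + (1.0 - self.smoothing) * instant)
        self._last = now
        return self.fps
```

In webcam.py, a meter created before the loop and a call to meter.tick() once per iteration would give a value that can be drawn on the frame with cv2.putText.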
Video File Detection
video_infer.py
import cv2
from detector import FaceDetector

detector = FaceDetector()
cap = cv2.VideoCapture("video.mp4")

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        # End of file or unreadable frame
        break

    faces = detector.detect(frame)
    frame = detector.draw_faces(frame, faces)
    cv2.imshow("Video Face Detection", frame)

    # Exit when Esc (key code 27) is pressed
    if cv2.waitKey(1) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()
Run:
python src/video_infer.py
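For long videos, running the detector on every frame may be more work than needed. One common approach, offered here as an assumption rather than something the project implements, is to detect on every Nth frame and redraw the previous boxes in between. The generator below sketches only the scheduling part, and the name strided is hypothetical.

```python
def strided(frames, stride=3):
    """Yield (index, frame, run_detection) for each frame.

    run_detection is True on every stride-th frame, starting with the
    first; on the other frames the caller can reuse the last detections.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    for index, frame in enumerate(frames):
        yield index, frame, index % stride == 0
```

The trade-off is that boxes lag behind fast-moving faces by up to stride - 1 frames, so small stride values are the safer starting point.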
Build REST API
api.py
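from fastapi import FastAPI, HTTPException, UploadFile
import cv2
import numpy as np
from detector import FaceDetector

app = FastAPI()
detector = FaceDetector()  # load the model once at startup


@app.post("/detect")
async def detect_faces(file: UploadFile):
    image_bytes = await file.read()
    if not image_bytes:
        # Reject empty uploads before decoding
        raise HTTPException(status_code=400, detail="Empty upload")
    # Decode the uploaded bytes into a BGR image
    np_arr = np.frombuffer(image_bytes, np.uint8)
    img = cv2.imdecode(np_arr, cv2.IMREAD_COLOR)
    if img is None:
        # imdecode returns None for data that is not a valid image
        raise HTTPException(status_code=400, detail="Invalid image file")
    faces = detector.detect(img)
    # Each face is serialized as [x1, y1, x2, y2, confidence]
    return {"faces": faces}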
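Run server (from the project root, so that api.py can import detector and find models/):
uvicorn api:app --app-dir src --reload
Test endpoint (send the image as a multipart form field named file):
POST http://127.0.0.1:8000/detect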
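As written, the endpoint returns each face as a bare five-element array, [x1, y1, x2, y2, confidence], because the tuples from detect() serialize as JSON arrays. Clients may find named fields easier to consume. The sketch below shows one possible response shape. The helper name faces_to_records and the field names box and confidence are assumptions, and the sample detections are made up rather than real model output.

```python
import json


def faces_to_records(faces):
    """Convert (x1, y1, x2, y2, conf) tuples into labeled dictionaries."""
    return [
        {"box": [x1, y1, x2, y2], "confidence": round(conf, 4)}
        for (x1, y1, x2, y2, conf) in faces
    ]


# Made-up detections in the same tuple format that detect() returns
sample = [(34, 50, 120, 160, 0.91234), (200, 40, 280, 150, 0.5)]
payload = json.dumps({"faces": faces_to_records(sample)})
```

In api.py, the return statement could then become return {"faces": faces_to_records(faces)} if this shape is preferred.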
Performance Optimization
GPU acceleration:
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
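Inference settings:
- Use an inference size of 640 (imgsz=640); Ultralytics resizes frames internally
- Use half precision on a CUDA GPU by passing half=True in detect():
results = self.model(frame, conf=0.4, imgsz=640, device=0, half=True)[0]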
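Resizing to 640x640 is usually done while preserving aspect ratio, with padding filling the remainder (often called letterboxing). The sketch below shows only the arithmetic involved. It is not the library's implementation: Ultralytics does its own resizing, and its padding may differ from what is computed here. The function name letterbox_params is hypothetical.

```python
def letterbox_params(width, height, target=640):
    """Return (new_w, new_h, pad_x, pad_y) for fitting an image into a
    target x target square while preserving aspect ratio.

    pad_x and pad_y are the padding on each side (left/right and
    top/bottom); an odd leftover pixel is ignored in this sketch.
    """
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_x = (target - new_w) // 2
    pad_y = (target - new_h) // 2
    return new_w, new_h, pad_x, pad_y
```

For a 1280x720 frame, the scale factor is 0.5, so the image becomes 640x360 with 140 pixels of padding above and below.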
Expected FPS (rough figures; actual rates depend on hardware, model size, and resolution):
- CPU: 10–20 FPS
- GPU: 60–120 FPS
Possible Extensions
- Face recognition (identity matching)
- Emotion detection
- Age & gender prediction
- Face tracking with DeepSORT
- Attendance system
- Security surveillance system
Troubleshooting
Camera not opening (on Windows, try the DirectShow backend):
cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
Low FPS:
- Reduce resolution
- Enable GPU
- Use smaller model (nano version)
License
Open-source for research and educational use.