
FocusGuard

Webcam-based focus detection: MediaPipe face mesh → 17 features (EAR, gaze, head pose, PERCLOS, etc.) → MLP or XGBoost for focused/unfocused. React + FastAPI app with WebSocket video.

Project layout

├── data/                 collected_<name>/*.npz
├── data_preparation/     loaders, split, scale
├── notebooks/            MLP/XGB training + LOPO
├── models/               face_mesh, head_pose, eye_scorer, train scripts
├── checkpoints/          mlp_best.pt, xgboost_*_best.json, scalers
├── evaluation/           logs, plots, justify_thresholds
├── ui/                   pipeline.py, live_demo.py
├── src/                  React frontend
├── static/               built frontend (after npm run build)
├── main.py, app.py       FastAPI backend
├── requirements.txt
└── package.json

Setup

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

To rebuild the frontend after changes:

npm install
npm run build
mkdir -p static && cp -r dist/* static/

Run

Web app: activate the venv and start uvicorn through Python (python -m uvicorn) so that it uses the venv's dependencies (otherwise you can get ModuleNotFoundError: aiosqlite):

source venv/bin/activate
python -m uvicorn main:app --host 0.0.0.0 --port 7860

Then open http://localhost:7860.

OpenCV demo:

python ui/live_demo.py
python ui/live_demo.py --xgb

Train:

python -m models.mlp.train
python -m models.xgboost.train

Data

9 participants, 144,793 samples, 10 features, binary labels. Collect with python -m models.collect_features --name <name>. Data lives in data/collected_<name>/.
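Because the data is stored per participant, a leave-one-participant-out (LOPO) evaluation, as mentioned for the notebooks, holds out all samples of one participant per fold. The sketch below only illustrates the idea. The function name `lopo_splits` and the participant names are hypothetical, and the notebooks may implement the split differently.

```python
def lopo_splits(participant_ids):
    """Yield (held_out, train_idx, test_idx), holding out one participant per fold.

    participant_ids has one entry per sample, naming the participant
    that the sample was collected from.
    """
    for held_out in sorted(set(participant_ids)):
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical example: six samples from three participants.
ids = ["alice", "alice", "bob", "carol", "carol", "carol"]
for held_out, train_idx, test_idx in lopo_splits(ids):
    print(held_out, len(train_idx), len(test_idx))
```

Each fold trains on every other participant, so the score reflects performance on a person the model has never seen, which a random split does not guarantee.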

Model numbers (15% test split)

| Model | Accuracy | F1 | ROC-AUC |
|---|---|---|---|
| XGBoost (600 trees, depth 8) | 95.87% | 0.959 | 0.991 |
| MLP (64→32) | 92.92% | 0.929 | 0.971 |
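For reference, the three reported metrics can be computed from labels, hard predictions, and scores as sketched below. This is a plain-Python illustration of the standard definitions, not the project's evaluation code. It assumes label 1 is the positive class and binary F1; the README does not state which F1 averaging was used. The function names are made up for this example.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy and F1 for 0/1 labels, treating 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); returned as 0 when that denominator is 0.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def roc_auc(y_true, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half. This is O(n_pos * n_neg), fine for small examples only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

On the full dataset a library implementation would be the practical choice; the pairwise version above is only meant to make the definition concrete.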

Pipeline

  1. Face mesh (MediaPipe 478 pts)
  2. Head pose → yaw, pitch, roll, scores, gaze offset
  3. Eye scorer → EAR, gaze ratio, MAR
  4. Temporal → PERCLOS, blink rate, yawn
  5. 10-d vector → MLP or XGBoost → focused / unfocused
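To make steps 3 and 4 concrete, here is a minimal sketch of the commonly used eye aspect ratio (EAR) formula and a rolling PERCLOS estimate built on it. It is not taken from eye_scorer or the temporal code in this repo. The six-landmark layout, the window of 150 frames, and the 0.2 closed-eye threshold are assumptions chosen for illustration.

```python
import math
from collections import deque


def eye_aspect_ratio(p1, p2, p3, p4, p5, p6):
    """EAR from six (x, y) eye landmarks.

    p1 and p4 are the horizontal eye corners; (p2, p6) and (p3, p5)
    are the two upper/lower eyelid pairs.
    """
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


class Perclos:
    """Fraction of the most recent frames in which the eye counted as closed."""

    def __init__(self, window=150, closed_below=0.2):  # assumed values
        self.closed_below = closed_below
        self.frames = deque(maxlen=window)  # oldest frames drop out automatically

    def update(self, ear):
        """Record one frame's EAR and return the current PERCLOS in [0, 1]."""
        self.frames.append(ear < self.closed_below)
        return sum(self.frames) / len(self.frames)


# Hypothetical landmarks for a wide-open eye, then a short run of frames.
ear = eye_aspect_ratio((0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1))
tracker = Perclos(window=4)
for value in (ear, 0.1, 0.1):
    print(round(tracker.update(value), 3))
```

EAR drops toward zero as the eyelids close, so thresholding it per frame and averaging over a window gives a simple drowsiness signal that can sit alongside blink rate in the feature vector.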

Stack: FastAPI, aiosqlite, React/Vite, PyTorch, XGBoost, MediaPipe, OpenCV.