	# FocusGuard

	FocusGuard is a real-time, webcam-based focus detection system that combines geometric feature extraction with machine learning classification. The pipeline extracts 17 facial features (EAR, gaze, head pose, PERCLOS, blink rate, etc.) from MediaPipe landmarks, selects 10 of them, and classifies attentiveness using MLP and XGBoost models. The system is served via a React + FastAPI web application with live WebSocket video.

	## 1. Project Structure

	```
	├── data/                 Raw collected sessions (collected_<name>/*.npz)
	├── data_preparation/     Data loading, cleaning, and exploration
	├── notebooks/            Training notebooks (MLP, XGBoost) with LOPO evaluation
	├── models/               Feature extraction modules and training scripts
	├── checkpoints/          All saved weights (mlp_best.pt, xgboost_*_best.json, GRU, scalers)
	├── evaluation/           Training logs and metrics (JSON)
	├── ui/                   Live OpenCV demo and inference pipeline
	├── src/                  React/Vite frontend source
	├── static/               Built frontend (served by FastAPI)
	├── app.py / main.py      FastAPI backend (API, WebSocket, DB)
	├── requirements.txt      Python dependencies
	└── package.json          Frontend dependencies
	```

	## 2. Setup

	```bash
	python -m venv venv
	source venv/bin/activate
	pip install -r requirements.txt
	```

	Frontend (only needed if modifying the React app):

	```bash
	npm install
	npm run build
	cp -r dist/* static/
	```

	## 3. Running

	Web application (API + frontend):

	```bash
	uvicorn app:app --host 0.0.0.0 --port 7860
	```

	Open http://localhost:7860 in a browser.

	Live camera demo (OpenCV):

	```bash
	python ui/live_demo.py
	python ui/live_demo.py --xgb # XGBoost mode
	```

	Training:

	```bash
	python -m models.mlp.train # MLP
	python -m models.xgboost.train # XGBoost
	```

	## 4. Dataset

	- 9 participants, each recorded via webcam with real-time labelling (focused / unfocused)
	- 144,793 total samples, 10 selected features, binary classification
	- Collected using `python -m models.collect_features --name <name>`
	- Stored as `.npz` files in `data/collected_<name>/`
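
The directory layout above makes it easy to group session files by participant, which is useful for per-person evaluation. The following is a minimal sketch that relies only on the `data/collected_<name>/*.npz` layout; the helper name `list_sessions` is hypothetical and not part of the repository, and the sketch does not open the `.npz` files because their internal array names are not documented here.

```python
from pathlib import Path


def list_sessions(data_dir):
    """Map each participant name to the sorted .npz files recorded for them.

    Assumes the layout <data_dir>/collected_<name>/*.npz described above.
    """
    prefix = "collected_"
    sessions = {}
    for folder in sorted(Path(data_dir).glob(prefix + "*")):
        if not folder.is_dir():
            continue  # ignore stray files that happen to match the prefix
        name = folder.name[len(prefix):]
        sessions[name] = sorted(folder.glob("*.npz"))
    return sessions
```

Calling `list_sessions("data")` from the repository root would return a dictionary keyed by participant name, with one list of `Path` objects per participant.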

	## 5. Models

	| Model | Test Accuracy | Test F1 | ROC-AUC |
	|-------|---------------|---------|---------|
	| XGBoost (600 trees, depth 8, lr 0.149) | 95.87% | 0.959 | 0.991 |
	| MLP (64→32, 30 epochs, lr 1e-3) | 92.92% | 0.929 | 0.971 |

	Both models were evaluated on a held-out 15% stratified test split. Leave-One-Person-Out (LOPO) cross-validation is available in `notebooks/`.
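
The two evaluation schemes differ in what they hold out: a stratified split holds out random samples while preserving the class ratio, whereas LOPO holds out every sample from one participant per fold. The sketch below illustrates both ideas in plain Python. It is not the code used in the notebooks; the function names, the default seed, and the rounding rule are assumptions made for illustration. scikit-learn provides equivalent tools (`train_test_split` with `stratify=` and `LeaveOneGroupOut`).

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.15, seed=0):
    """Return (train_idx, test_idx) with roughly the same class ratio in both."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def lopo_folds(participants):
    """Yield (held_out, train_idx, test_idx), one fold per participant."""
    for held_out in sorted(set(participants)):
        train_idx = [i for i, p in enumerate(participants) if p != held_out]
        test_idx = [i for i, p in enumerate(participants) if p == held_out]
        yield held_out, train_idx, test_idx
```

With 9 participants, `lopo_folds` would produce 9 folds. Because the test fold never contains frames from a person seen during training, LOPO scores are usually a stricter estimate of how a model generalizes to new users than a random split over pooled samples.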

	## 6. Feature Pipeline

	1. Face mesh — MediaPipe 478-landmark detection
	2. Head pose — solvePnP → yaw, pitch, roll, face score, gaze offset, head deviation
	3. Eye scorer — EAR (left/right/avg), horizontal/vertical gaze ratio, MAR
	4. Temporal tracking — PERCLOS, blink rate, closure duration, yawn duration
	5. Classification — 10-feature vector → MLP or XGBoost → focused / unfocused
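
To make steps 3 and 4 more concrete, the sketch below computes the eye aspect ratio (EAR) from six eye landmarks using the standard formula, then derives two temporal features from a window of EAR values. It is an illustration, not the repository's implementation: the function names, the landmark ordering, and the `0.2` closed-eye threshold are assumptions, and PERCLOS is simplified here to the fraction of frames in which the eye counts as closed.

```python
import math


def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye landmarks ordered p1..p6.

    p1 and p4 are the horizontal eye corners; (p2, p6) and (p3, p5) are the
    two vertical pairs. EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|).
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def perclos(ear_values, closed_threshold=0.2):
    """Fraction of frames in the window whose EAR is below the threshold."""
    if not ear_values:
        return 0.0
    closed = sum(1 for ear in ear_values if ear < closed_threshold)
    return closed / len(ear_values)


def count_blinks(ear_values, closed_threshold=0.2):
    """Count open-to-closed transitions, treating each one as a blink."""
    blinks = 0
    was_closed = False
    for ear in ear_values:
        is_closed = ear < closed_threshold
        if is_closed and not was_closed:
            blinks += 1
        was_closed = is_closed
    return blinks
```

Dividing the blink count by the window duration gives a blink rate. A practical threshold depends on the camera, the landmark model, and the individual, so the value above should be treated as a placeholder to tune.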

	## 7. Tech Stack

	- Backend: Python, FastAPI, WebSocket, aiosqlite
	- Frontend: React, Vite, TypeScript
	- ML: PyTorch (MLP), XGBoost, scikit-learn
	- Vision: MediaPipe, OpenCV