# FocusGuard
Webcam-based focus detection: MediaPipe face mesh → 17 features (EAR, gaze, head pose, PERCLOS, etc.) → MLP or XGBoost for focused/unfocused. React + FastAPI app with WebSocket video.
## Project layout
```
├── data/                 collected_<name>/*.npz
├── data_preparation/     loaders, split, scale
├── notebooks/            MLP/XGB training + LOPO
├── models/               face_mesh, head_pose, eye_scorer, train scripts
├── checkpoints/          mlp_best.pt, xgboost_*_best.json, scalers
├── evaluation/           logs, plots, justify_thresholds
├── ui/                   pipeline.py, live_demo.py
├── src/                  React frontend
├── static/               built frontend (after npm run build)
├── main.py, app.py       FastAPI backend
├── requirements.txt
└── package.json
```
## Setup
```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
To rebuild the frontend after changes:
```bash
npm install
npm run build
mkdir -p static && cp -r dist/* static/
```
## Run
**Web app:** Activate the venv and launch uvicorn through `python -m` so it uses the venv's dependencies. A `uvicorn` installed outside the venv can otherwise fail with `ModuleNotFoundError: No module named 'aiosqlite'`:
```bash
source venv/bin/activate
python -m uvicorn main:app --host 0.0.0.0 --port 7860
```
Then open http://localhost:7860.
**OpenCV demo:**
```bash
python ui/live_demo.py
python ui/live_demo.py --xgb
```
**Train:**
```bash
python -m models.mlp.train
python -m models.xgboost.train
```
## Data
9 participants, 144,793 samples, 10 features, binary labels. Collect with `python -m models.collect_features --name <name>`. Data lives in `data/collected_<name>/`.
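Because the samples come from a small number of participants, a leave-one-participant-out (LOPO) split, as used in the notebooks, keeps every frame of the held-out person out of training. The following is a minimal sketch of that idea and not the project's actual loader: `lopo_splits` and the per-sample `participant_ids` list are hypothetical names introduced for illustration.

```python
from collections import defaultdict


def lopo_splits(participant_ids):
    """Yield (held_out, train_idx, test_idx) once per distinct participant.

    participant_ids: one label per sample (hypothetical input format),
    e.g. the <name> part of data/collected_<name>/.
    """
    # Group sample indices by the participant they belong to.
    by_participant = defaultdict(list)
    for idx, pid in enumerate(participant_ids):
        by_participant[pid].append(idx)

    for held_out in sorted(by_participant):
        test_idx = by_participant[held_out]
        # Train on every sample that does not belong to the held-out person.
        train_idx = [
            i
            for pid, idxs in by_participant.items()
            if pid != held_out
            for i in idxs
        ]
        yield held_out, train_idx, test_idx


if __name__ == "__main__":
    ids = ["a", "a", "b", "c", "b"]
    for held_out, train_idx, test_idx in lopo_splits(ids):
        print(held_out, len(train_idx), len(test_idx))
```

With 9 participants this yields 9 folds. If scikit-learn is available, `sklearn.model_selection.LeaveOneGroupOut` provides the same kind of split.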
## Model numbers (15% test split)
| Model | Accuracy | F1 | ROC-AUC |
|-------|----------|-----|---------|
| XGBoost (600 trees, depth 8) | 95.87% | 0.959 | 0.991 |
| MLP (64→32) | 92.92% | 0.929 | 0.971 |
## Pipeline
1. Face mesh (MediaPipe 478 pts)
2. Head pose → yaw, pitch, roll, scores, gaze offset
3. Eye scorer → EAR, gaze ratio, MAR
4. Temporal → PERCLOS, blink rate, yawn
5. 10-d vector → MLP or XGBoost → focused / unfocused
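To make steps 3 and 4 concrete, here is a rough sketch of the commonly used eye aspect ratio (EAR) formula and a sliding-window PERCLOS built on top of it. It is not taken from the project's `eye_scorer`: the six-point eye ordering, the `closed_threshold` of 0.2, and the window length are all assumptions chosen for illustration, and the real code may use different landmarks and thresholds.

```python
import math
from collections import deque


def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye points ordered p1..p6 (assumed ordering).

    p1 and p4 are the horizontal corners, p2 and p3 lie on the upper lid,
    p6 and p5 are the matching points on the lower lid.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal) if horizontal > 0 else 0.0


class Perclos:
    """Fraction of recent frames in which the eye counts as closed."""

    def __init__(self, window=900, closed_threshold=0.2):
        # window and closed_threshold are illustrative defaults, not project values.
        self.closed = deque(maxlen=window)
        self.closed_threshold = closed_threshold

    def update(self, ear):
        # Record whether this frame is "closed", then return the running ratio.
        self.closed.append(ear < self.closed_threshold)
        return sum(self.closed) / len(self.closed)


if __name__ == "__main__":
    open_eye = [(0, 0), (1, 1), (3, 1), (4, 0), (3, -1), (1, -1)]
    tracker = Perclos(window=4)
    print(eye_aspect_ratio(open_eye), tracker.update(eye_aspect_ratio(open_eye)))
```

EAR drops toward zero as the lids close, so thresholding it per frame and averaging over a window gives a PERCLOS-style drowsiness signal.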
**Stack:** FastAPI, aiosqlite, React/Vite, PyTorch, XGBoost, MediaPipe, OpenCV.